google / jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Apache License 2.0
33 stars, 14 forks

Make prefilling return first token for loadgen integration #143

Closed: sixiang-google closed this 2 months ago

sixiang-google commented 2 months ago

R: @vipannalla