google / jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Apache License 2.0
33 stars 14 forks

Make Ray engine and worker process prefill returning first token #147

richardsliu closed 2 months ago