AI-Hypercomputer / JetStream

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
Apache License 2.0

Enable JetStream Standalone Server #94

Open JoeZijunZhou opened 4 months ago

JoeZijunZhou commented 2 months ago

What is the purpose of adding a standalone server?

To improve the usability of the JetStream server. Eventually we could have a single command to run the server with different engines, models, and configs.