JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
Remove jax dependencies in JetStream #88
Open
FanhaiLu1 opened 4 months ago
There are multiple pieces of JAX code in JetStream. We should move the JAX-related code into the engine implementations and remove the JAX dependencies from JetStream itself.
In the end, JetStream should be a framework-neutral orchestrator for PyTorch and JAX inference.
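A minimal sketch of the separation proposed here, not JetStream's actual API: the orchestrator depends only on an abstract interface, and any `jax` or `torch` import would live inside a concrete engine. The names `Engine`, `ToyEngine`, and `orchestrate` are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Tuple


class Engine(ABC):
    """Hypothetical framework-neutral interface (an assumption, not JetStream's API)."""

    @abstractmethod
    def prefill(self, tokens: List[int]) -> Any:
        """Process the prompt and return an opaque, backend-owned state."""

    @abstractmethod
    def generate(self, state: Any) -> Tuple[Any, int]:
        """Advance one decode step and return (new_state, next_token)."""


class ToyEngine(Engine):
    """Stand-in backend. A real engine would import jax or torch here, and only here."""

    def prefill(self, tokens: List[int]) -> int:
        # Toy "state": just the sum of the prompt tokens.
        return sum(tokens)

    def generate(self, state: int) -> Tuple[int, int]:
        # Toy decode rule so the example is deterministic.
        next_token = state % 5
        return state * 2 + 1, next_token


def orchestrate(engine: Engine, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Orchestrator loop: it never inspects the state, so it needs no framework import."""
    state = engine.prefill(prompt)
    output: List[int] = []
    for _ in range(max_new_tokens):
        state, token = engine.generate(state)
        output.append(token)
    return output


if __name__ == "__main__":
    print(orchestrate(ToyEngine(), [1, 2, 3], 3))
```

Because the orchestrator treats the engine state as opaque, swapping a JAX-backed engine for a PyTorch-backed one would not require changes to the orchestration code under this design.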