Closed by tengomucho 2 days ago
I think I got lost in your changes: can you summarize how the tests are supposed to work now?
@dacorvo as discussed offline, the idea is to change the default backend of TPU TGI from Torch XLA to Jetstream. I just updated the tests so that they use clearer markers to check that the backend they run on is the one that was actually selected.
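For readers who want a concrete picture of marker-based backend checks, here is a minimal sketch of how a `conftest.py` could do it with pytest. It is not the code from this PR: the marker names (`jetstream`, `torch_xla`), the environment variable `EXAMPLE_TGI_BACKEND`, and the helper functions are all hypothetical and chosen only for illustration. The default is set to Jetstream to mirror the change discussed here.

```python
import os

# Hypothetical names, for illustration only (not taken from the PR).
BACKEND_ENV_VAR = "EXAMPLE_TGI_BACKEND"
DEFAULT_BACKEND = "jetstream"
KNOWN_BACKENDS = ("jetstream", "torch_xla")


def active_backend(environ=None):
    """Return the backend this process would select (assumed env-var driven)."""
    environ = os.environ if environ is None else environ
    backend = environ.get(BACKEND_ENV_VAR, DEFAULT_BACKEND)
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    return backend


def backend_mismatch(marker_names, backend):
    """Return the first backend marker that disagrees with `backend`, else None."""
    for name in marker_names:
        if name in KNOWN_BACKENDS and name != backend:
            return name
    return None


def pytest_configure(config):
    # Register the markers so `pytest --strict-markers` accepts them.
    for name in KNOWN_BACKENDS:
        config.addinivalue_line(
            "markers", f"{name}: test expects the {name} backend to be selected"
        )


def pytest_runtest_setup(item):
    # Imported here so the helpers above stay importable without pytest.
    import pytest

    marker_names = [marker.name for marker in item.iter_markers()]
    backend = active_backend()
    wrong = backend_mismatch(marker_names, backend)
    if wrong is not None:
        pytest.fail(f"test is marked {wrong!r} but the selected backend is {backend!r}")
```

With this in place, a test decorated with `@pytest.mark.jetstream` would run normally under the default selection and fail at setup if the other backend were selected. Whether a mismatch should fail or skip the test is a design choice; failing makes a wrong selection visible rather than silently ignored.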
What does this PR do?
This PR makes all the changes needed for the Jetstream PyTorch engine to become the default backend for TGI on TPUs. This backend is reliable and performant, and it gives the best throughput on TGI.