elixir-nx / bumblebee

Pre-trained Neural Network models in Axon (+ 🤗 Models integration)
Apache License 2.0

Remove conversational serving #308

Closed (jonatanklosko closed this 6 months ago)

jonatanklosko commented 6 months ago

For the conversational pipeline, hf/transformers moved from tokenizer-specific prompt implementations to configurable templates (https://github.com/huggingface/transformers/pull/25323). Since those templates use Jinja, we can't reasonably use them (at least right now). So for now, users need to use the text generation serving and build the right prompt themselves. FWIW the conversational serving only supported a few specific models (the ones we had templating logic for), so I don't think it's been used much.
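To illustrate what "build the right prompt themselves" could look like, here is a minimal sketch of assembling a conversation history into a single prompt string by hand. The module name `ManualChatPrompt`, the `User:`/`Assistant:` labels, and the newline separator are all assumptions made up for this example. They are not the template of any particular model, so the format would need to be adapted to whatever the chosen checkpoint was trained on.

```elixir
defmodule ManualChatPrompt do
  @moduledoc """
  Hypothetical helper (not part of Bumblebee) that flattens a list of
  conversation turns into one prompt string for text generation.
  """

  # Builds the prompt from a list of `{role, text}` tuples, where role is
  # `:user` or `:assistant`. The separator and the trailing generation
  # prefix are illustrative defaults, not a real model's chat template.
  def build(turns, opts \\ []) do
    separator = Keyword.get(opts, :separator, "\n")
    generation_prefix = Keyword.get(opts, :generation_prefix, "Assistant:")

    history =
      Enum.map_join(turns, separator, fn {role, text} ->
        "#{label(role)}: #{text}"
      end)

    # End with the assistant label so the model continues as the assistant.
    history <> separator <> generation_prefix
  end

  defp label(:user), do: "User"
  defp label(:assistant), do: "Assistant"
end

prompt =
  ManualChatPrompt.build([
    {:user, "Hi"},
    {:assistant, "Hello!"},
    {:user, "How are you?"}
  ])

# The resulting string would then go to a text generation serving, for
# example one built with Bumblebee.Text.generation/4 and called with
# Nx.Serving.run(serving, prompt). Loading a model is omitted here.
IO.puts(prompt)
```

The upside of doing this by hand is that it needs no Jinja support. The downside is that the format has to be kept in sync with the model manually, which is the logic the removed serving used to hard-code per model.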