elixir-nx / bumblebee

Pre-trained Neural Network models in Axon (+ 🤗 Models integration)
Apache License 2.0

Support non-deterministic output in text generation serving #284

Closed jonatanklosko closed 6 months ago

jonatanklosko commented 7 months ago

Currently, servings that use random numbers accept a :seed option when the serving is built, so the seed is fixed for every call. Users of LLMs, however, expect each call to give a different reply. We could accept a seed as part of the serving input and pass it to generation.

The default behaviour should likely be to use a different seed on every call. With that, we no longer need the :seed option on the serving.
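
To make the proposal concrete, here is a rough sketch of the intended behaviour in plain Elixir. It is not Bumblebee code: the `SeedSketch` module, the `generate/1` function, and the `%{text: ..., seed: ...}` input shape are hypothetical stand-ins, and Erlang's `:rand` module replaces the actual sampling step. It only illustrates the idea that a seed passed with the input is honoured, and that a fresh seed is picked when none is given.

```elixir
defmodule SeedSketch do
  # Hypothetical stand-in for a serving that samples during generation.
  # A plain string is treated as input without an explicit seed.
  def generate(text) when is_binary(text), do: generate(%{text: text})

  def generate(%{text: text} = input) do
    # Use the seed from the input if given, otherwise derive a new one
    # on every call, so repeated calls are not tied to a single seed.
    seed = input[:seed] || fresh_seed()

    # Functional RNG state: the same seed always yields the same draw.
    state = :rand.seed_s(:exsss, seed)
    {sampled, _new_state} = :rand.uniform_s(1000, state)

    # `sampled` stands in for a sampled token id.
    {text, sampled}
  end

  # Assumption for this sketch: mixing the system time with a unique
  # integer is enough to get a different seed per call.
  defp fresh_seed do
    :erlang.phash2({System.system_time(), System.unique_integer()})
  end
end
```

With this shape, `SeedSketch.generate(%{text: "Hello", seed: 0})` is reproducible across calls, while `SeedSketch.generate("Hello")` draws a new seed each time, which would make a :seed option at build time unnecessary.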