elixir-nx / bumblebee

Pre-trained Neural Network models in Axon (+ 🤗 Models integration)
Apache License 2.0

Text completion behavior is different when streaming vs not streaming #296

Closed brainlid closed 7 months ago

brainlid commented 7 months ago

When streaming is enabled, a text completion returns only the newly generated text. This is desirable.

When streaming is disabled, the completion also includes the originally provided text. To work with the result programmatically, the prompt has to be split off so that the remainder can be treated as the actual response.
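A minimal sketch of that split, assuming the non-streaming text begins with the prompt verbatim. The module and function names here are hypothetical and are not part of Bumblebee; only `String.replace_prefix/3` from the Elixir standard library is used.

```elixir
defmodule CompletionText do
  # Hypothetical helper, not a Bumblebee API.
  # Assumes `completion` is the full text returned without streaming
  # and `prompt` is the exact text that was passed in.
  def new_text(prompt, completion) when is_binary(prompt) and is_binary(completion) do
    # String.replace_prefix/3 only replaces a match at the very start.
    # If the completion does not start with the prompt (for example,
    # because whitespace was normalized), it is returned unchanged.
    String.replace_prefix(completion, prompt, "")
  end
end
```

For example, `CompletionText.new_text("Once upon", "Once upon a time")` would leave only the continuation `" a time"`.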

If there is an option to control this, I don't know what it is. I suggest changing the default non-streaming behavior to match the streaming behavior and return only the newly generated text.

jonatanklosko commented 7 months ago

Duplicate of #247 :)