Open mwinkens opened 10 months ago
While this issue still persists, the situation is now much better with the new "Chat with AI" feature. Good change there :+1:
This poses a problem for timeouts as well. For really long responses, it sometimes takes a while to receive the entire answer; in the meantime there is no feedback to the user, and Nextcloud times out the request.
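For context, here is a minimal sketch of why a single blocking (non-streaming) request is timeout-prone: the server sends nothing until the whole answer is generated, so the read timeout has to cover the full generation time. The base URL and model name below are placeholders, assuming a LocalAI deployment with the OpenAI-compatible chat API:

```python
import requests

BASE_URL = "http://localhost:8080"  # hypothetical LocalAI deployment

# Non-streaming call: no bytes arrive until the whole answer is
# generated, so the read timeout must be at least as long as the
# slowest expected generation.
resp = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={
        "model": "gpt-3.5-turbo",  # placeholder model name
        "messages": [{"role": "user", "content": "Write a long essay."}],
    },
    timeout=(5, 120),  # 5 s to connect, 120 s for the single read
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

With streaming, by contrast, chunks start arriving as soon as the first tokens are generated, so a much shorter per-read timeout suffices and the user gets immediate feedback.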
Describe the feature you'd like to request
Currently the LocalAI integration only uses the static endpoint instead of the streaming endpoint, where you would see the answer of the AI in real time. I don't know whether the OpenAI integration does this as well. It would be really nice to have similar behaviour (e.g. answers arriving word by word) in the Nextcloud Assistant, too!
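A minimal sketch of what consuming the streaming endpoint could look like, assuming a LocalAI instance at a placeholder base URL speaking the OpenAI-compatible chat API (the model name is also a placeholder). Both OpenAI and LocalAI support streaming via `"stream": true`, which returns Server-Sent Events instead of one big JSON body:

```python
import json
import requests

BASE_URL = "http://localhost:8080"  # hypothetical LocalAI deployment

with requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={
        "model": "gpt-3.5-turbo",  # placeholder model name
        "messages": [{"role": "user", "content": "Hello!"}],
        "stream": True,  # request Server-Sent Events
    },
    stream=True,
    timeout=(5, 30),  # per-read timeout can stay short: chunks keep arriving
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        # SSE payload lines look like: data: {...}
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":  # OpenAI-style end-of-stream marker
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        # Print each token fragment as it arrives, word by word.
        print(delta.get("content", ""), end="", flush=True)
```

This is the piece the Assistant UI would need to surface: instead of waiting for one complete response, it would append each `delta` fragment to the displayed answer as it arrives.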
Describe the solution you'd like
Describe alternatives you've considered
Forcing these fields doesn't make sense; maybe switch to "Current behavior" and "Enhanced behavior". What alternatives would you expect? I guess the only alternative is to keep using the static endpoint.