FrancescoCaracciolo opened this issue 11 months ago
I tried to implement this, and it can be done, but I noticed that the writing speed is very different depending on whether you generate the whole message at once or generate it token by token. Maybe I am wrong, but a normal message was written in about 30 seconds, while generating token by token took about 8 minutes. Maybe I did not use the function correctly.
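Roughly, these are the two modes I am comparing; `generate_stream` is just a hypothetical stand-in for the local model's generator. The raw generation cost should be the same in both, so a gap like 30 seconds vs. 8 minutes suggests the overhead is in how each token is requested or drawn, not in the model itself:

```python
import time

def generate_stream(prompt):
    # Hypothetical stand-in for the local model's token generator.
    for token in ("Hello", ",", " ", "world", "!"):
        time.sleep(0.05)  # simulated per-token latency
        yield token

# Mode 1: generate to the end, then show the message once.
message = "".join(generate_stream("hi"))
print(message)

# Mode 2: generate token by token, redrawing the message on every token.
shown = ""
for token in generate_stream("hi"):
    shown += token
    print("\r" + shown, end="", flush=True)  # stands in for the GTK update
print()
```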
https://streamable.com/e/v5q7lx?autoplay=1

But it looks really good! If it works at a normal speed, I'm all for it!
Is the problem with bai chat, or with GTK updating the message?
Well, I'm talking about the local model. I haven't tried it with baichat, because I don't know if it can even be done with baichat.
In principle it should be possible, though I don't know about the current module. But we also have other language model services that already support it.
You might want to try it with the others, at least to check whether it is a GTK thing, a model thing, or a problem with the implementation.
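One quick way to check: drain the token stream with the UI completely out of the loop and time it. If that alone takes minutes, the bottleneck is the model or the service, not GTK. (`stream` here is whatever generator the backend returns; the names are placeholders.)

```python
import time

def time_stream(stream):
    # Consume the generator with no UI work and report raw generation speed.
    start = time.perf_counter()
    tokens = sum(1 for _ in stream)
    elapsed = time.perf_counter() - start
    print(f"{tokens} tokens in {elapsed:.1f}s ({tokens / elapsed:.1f} tok/s)")
```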
Hmm, it'd be cool to do that, actually. There is no problem with GTK here.
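The usual pattern on the GTK side is to run generation on a worker thread and marshal each widget update onto the main loop with `GLib.idle_add`; done that way, GTK keeps up with per-token updates without trouble. A minimal GTK4 sketch, with a fake generator standing in for the model:

```python
import threading
import time

import gi
gi.require_version("Gtk", "4.0")
from gi.repository import GLib, Gtk


def fake_token_stream():
    # Stand-in for the model: yields one word every 100 ms.
    for word in "Streaming tokens into a GTK label, one at a time.".split():
        time.sleep(0.1)
        yield word + " "


def on_activate(app):
    window = Gtk.ApplicationWindow(application=app, title="Streaming demo")
    label = Gtk.Label(label="")
    window.set_child(label)
    window.present()

    def append_token(token):
        # Runs on the main loop, so touching the widget is safe here.
        label.set_label(label.get_label() + token)
        return False  # one-shot idle callback, do not reschedule

    def worker():
        for token in fake_token_stream():
            GLib.idle_add(append_token, token)

    threading.Thread(target=worker, daemon=True).start()


app = Gtk.Application(application_id="org.example.StreamDemo")
app.connect("activate", on_activate)
app.run(None)
```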
You might want to try it with Poe. Here is an example of how to do that.
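A minimal sketch of what that could look like with the unofficial poe-api Python client; the `send_message` generator and the `text_new` chunk field are from that library's README, and the exact names may differ in the version you have:

```python
import poe  # unofficial poe-api client

client = poe.Client("YOUR_POE_TOKEN")  # the p-b cookie value

# send_message returns a generator; each chunk's "text_new" holds only
# the text generated since the previous chunk.
for chunk in client.send_message("capybara", "Write a short haiku"):
    print(chunk["text_new"], end="", flush=True)
print()
```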
Since some services already support it, it would be a nice thing to add, especially for local language models, which generate text pretty slowly.
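And if per-token redraws ever do become the bottleneck, generation can be decoupled from drawing: the model thread pushes tokens into a buffer, and a main-loop timer (e.g. `GLib.timeout_add(100, refresh)`) drains it into the message widget a few times per second. A rough sketch; the class and names are mine, not from the codebase:

```python
import threading

class TokenBuffer:
    """Collects tokens from the model thread; the UI drains them in batches."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []

    def push(self, token):
        # Called from the generation thread for every token.
        with self._lock:
            self._pending.append(token)

    def drain(self):
        # Called from the main loop; returns everything since the last drain.
        with self._lock:
            tokens, self._pending = self._pending, []
        return "".join(tokens)
```

That caps the UI work at the timer rate no matter how fast tokens arrive, so streaming should cost barely more than writing the whole message at the end.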