artmoskvin opened 7 months ago

Currently, the `AI` class generates only batch completions, so we have to wait until the whole completion is generated before we can send it back to the user. A common way to improve UX is to stream generated tokens as soon as they are returned from the model. Let's add a new `stream` method in the `AI` class that calls `BaseChatModel.stream(...) -> Iterator[BaseMessageChunk]` under the hood.
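For context, here is a minimal sketch of the LangChain streaming interface the new method would wrap. The concrete model class (`ChatOpenAI`) and the prompt are placeholders; any `BaseChatModel` implementation exposes the same `stream` method:

```python
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI()  # placeholder; any BaseChatModel implementation works

# stream() returns Iterator[BaseMessageChunk]; each chunk carries a
# fragment of the completion in its `content` attribute.
for chunk in model.stream([HumanMessage(content="Tell me a joke")]):
    print(chunk.content, end="", flush=True)
```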
Here is the plan to implement the streaming feature in the `AI` class:

1. Define a new method `stream` in the `AI` class. This method should take the same arguments as the `call` method.
2. Inside the `stream` method, call the `stream` method of the `BaseChatModel` instance with the provided messages. This method should return an iterator over `BaseMessageChunk` instances.
3. The `stream` method of the `AI` class should yield the content of each `BaseMessageChunk` instance returned by the `BaseChatModel.stream` method.
4. Update the `call` method of the `AI` class to use the new `stream` method. It should collect all chunks into a single string and return it (steps 1-4 are sketched in the code after this list).
5. Add tests for the new `stream` method. The tests should check that the method returns an iterator and that the content of the chunks is correct (see the test sketch after this list).
6. Update the documentation of the `AI` class to include the new `stream` method.
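A minimal sketch of steps 1 through 4 follows. The constructor and the exact `call`/`stream` signatures are assumptions about the `AI` class, not the actual code:

```python
from collections.abc import Iterator

from langchain_core.language_models import BaseChatModel
from langchain_core.messages import BaseMessage


class AI:
    """Hypothetical shape of the AI class; the real constructor and
    method signatures may differ."""

    def __init__(self, model: BaseChatModel) -> None:
        self.model = model

    def stream(self, messages: list[BaseMessage]) -> Iterator[str]:
        # Delegate to BaseChatModel.stream, which yields BaseMessageChunk
        # objects, and surface only their textual content.
        for chunk in self.model.stream(messages):
            yield chunk.content

    def call(self, messages: list[BaseMessage]) -> str:
        # Reuse the streaming path and collect all chunks into the
        # final completion string.
        return "".join(self.stream(messages))
```

And a test sketch for step 5, using a mocked model so it runs without network access; the chunk values are made up for illustration:

```python
from collections.abc import Iterator
from unittest.mock import MagicMock

from langchain_core.messages import AIMessageChunk, HumanMessage


def test_stream_yields_chunk_contents():
    # Stub out BaseChatModel.stream with a fixed sequence of chunks.
    model = MagicMock()
    model.stream.return_value = iter(
        [AIMessageChunk(content="Hello"), AIMessageChunk(content=" world")]
    )
    ai = AI(model)

    result = ai.stream([HumanMessage(content="Hi")])

    # The method returns an iterator, and the chunk contents come
    # through unchanged and in order.
    assert isinstance(result, Iterator)
    assert list(result) == ["Hello", " world"]
```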
Does this plan look good?
Sounds good, let's do it!
It seems there was an issue with checking out a new branch. Let's try again.
I apologize for the inconvenience. It seems there is an issue with the branch creation. Let's try with a different branch name.
I have implemented the changes and created a pull request. You can review it here.