artmoskvin opened 7 months ago

Currently, the `AI` class generates only batch completions, so we have to wait until the whole completion is generated before we can send it back to the user. A common way to improve UX is to stream generated tokens as soon as they are returned from the model. Let's add a new `stream` method in the `AI` class that calls `BaseChatModel.stream(...) -> Iterator[BaseMessageChunk]` under the hood.
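For context, here is a minimal sketch of the LangChain streaming interface the new method would wrap. The concrete model class (`ChatOpenAI`) and the prompt are placeholders; any `BaseChatModel` implementation exposes the same `stream` method:

```python
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI()  # placeholder; any BaseChatModel implementation works

# stream() returns Iterator[BaseMessageChunk]; each chunk carries a
# fragment of the completion in its `content` attribute.
for chunk in model.stream([HumanMessage(content="Tell me a joke")]):
    print(chunk.content, end="", flush=True)
```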
Here is the plan to implement the streaming feature in the `AI` class:

1. Define a new method `stream` in the `AI` class. This method should take the same arguments as the `call` method.
2. Inside the `stream` method, call the `stream` method of the `BaseChatModel` instance with the provided messages. This method should return an iterator over `BaseMessageChunk` instances.
3. The `stream` method of the `AI` class should yield the content of each `BaseMessageChunk` instance returned by the `BaseChatModel.stream` method.
4. Update the `call` method of the `AI` class to use the new `stream` method. It should collect all chunks into a single string and return it (steps 1-4 are sketched in the code after this list).
5. Add tests for the new `stream` method. The tests should check that the method returns an iterator and that the content of the chunks is correct (see the test sketch after this list).
6. Update the documentation of the `AI` class to include the new `stream` method.
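A minimal sketch of steps 1 through 4 follows. The constructor and the exact `call`/`stream` signatures are assumptions about the `AI` class, not the actual code:

```python
from collections.abc import Iterator

from langchain_core.language_models import BaseChatModel
from langchain_core.messages import BaseMessage


class AI:
    """Hypothetical shape of the AI class; the real constructor and
    method signatures may differ."""

    def __init__(self, model: BaseChatModel) -> None:
        self.model = model

    def stream(self, messages: list[BaseMessage]) -> Iterator[str]:
        # Delegate to BaseChatModel.stream, which yields BaseMessageChunk
        # objects, and surface only their textual content.
        for chunk in self.model.stream(messages):
            yield chunk.content

    def call(self, messages: list[BaseMessage]) -> str:
        # Reuse the streaming path and collect all chunks into the
        # final completion string.
        return "".join(self.stream(messages))
```

And a test sketch for step 5, using a mocked model so it runs without network access; the chunk values are made up for illustration:

```python
from collections.abc import Iterator
from unittest.mock import MagicMock

from langchain_core.messages import AIMessageChunk, HumanMessage


def test_stream_yields_chunk_contents():
    # Stub out BaseChatModel.stream with a fixed sequence of chunks.
    model = MagicMock()
    model.stream.return_value = iter(
        [AIMessageChunk(content="Hello"), AIMessageChunk(content=" world")]
    )
    ai = AI(model)

    result = ai.stream([HumanMessage(content="Hi")])

    # The method returns an iterator, and the chunk contents come
    # through unchanged and in order.
    assert isinstance(result, Iterator)
    assert list(result) == ["Hello", " world"]
```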
Does this plan look good?
Sounds good, let's do it!
It seems there was an issue with checking out a new branch. Let's try again.
I apologize for the inconvenience. It seems there is an issue with the branch creation. Let's try with a different branch name.
I have implemented the changes and created a pull request. You can review it here.