Enable response streaming

artmoskvin commented 9 months ago

Currently, the AI class generates only batch completions. So we have to wait until the whole completion is generates until we can send it back to user. A common way to improve UX is to stream generated tokens as soon as they are returned from the model. Let's add a new stream method in the AI class that calls BaseChatModel.stream(...) -> Iterator[BaseMessageChunk] from langchain under the hood.

devbuddytest[bot] commented 9 months ago

To enable response streaming in the AI class, we'll follow these steps:

Understand the Current Implementation:
- The AI class currently has a call method that takes a list of BaseMessage objects and returns a string. This method internally calls the model (an instance of BaseChatModel) with the messages and returns the content.
Identify Required Changes:
- We need to add a new method in the AI class that utilizes the stream method from BaseChatModel to stream generated tokens as they are returned from the model.
Implementation Plan:
- Add a new method named stream in the AI class.
- This method should accept the same parameters as the call method, i.e., a list of BaseMessage objects.
- Inside the stream method, use the stream method of the model attribute (which is an instance of BaseChatModel) to yield tokens as they are generated.
- Ensure that the stream method in BaseChatModel returns an iterator of BaseMessageChunk. We will yield these chunks from our new stream method in the AI class.
Testing:
- Write unit tests for the new stream method to ensure it correctly streams the response from the model.
- Test the integration of the new streaming functionality in a scenario where it's expected to improve user experience, ensuring that tokens are streamed as expected.
Documentation:
- Update the documentation of the AI class to include the new stream method, explaining its purpose, parameters, and return type.
Review and Merge:
- Review the changes with the team, addressing any feedback.
- Once approved, merge the changes into the main branch.

This plan aims to enhance the user experience by providing immediate feedback from the AI model as it generates responses. Let's discuss this plan with the team for approval before proceeding with the implementation.

artmoskvin commented 9 months ago

@devbuddytest the plan is approved, go for it!

devbuddytest[bot] commented 9 months ago

The implementation for enabling response streaming in the AI class has been completed, and a pull request has been created: Enable Response Streaming in AI Class.

This pull request introduces a new stream method in the AI class, enabling the streaming of generated tokens as soon as they are returned from the model. This enhancement aims to improve the user experience by providing immediate feedback from the AI model.

Added a new stream method in the AI class that utilizes the stream method from BaseChatModel to yield tokens as they are generated.
Imported necessary types (Iterator and BaseMessageChunk) for the implementation.

Please review the pull request and merge it if everything looks good.

artmoskvin / autocoder

Enable response streaming #32