Open artmoskvin opened 4 months ago
To enable response streaming in the AI class, we'll follow these steps:
1. **Understand the Current Implementation:** The `AI` class currently has a `call` method that takes a list of messages and returns a string. This method internally calls the model (an instance of `BaseChatModel`) with the messages and returns the content. `BaseChatModel` is imported from `langchain.chat_models.base`.
2. **Plan for Streaming Implementation:** Add a new method to the `AI` class that allows streaming of generated tokens. This method will utilize the `stream` method of the `BaseChatModel` if available.
3. **Implementation Steps:**
   - Check whether `BaseChatModel` has a `stream` method or any equivalent that supports streaming. If not, we might need to extend or modify `BaseChatModel` to support streaming.
   - Add a new method to the `AI` class, possibly named `stream`, that will handle the streaming of responses. This method should call the `stream` method of the `BaseChatModel` and yield results as they become available.
   - Ensure the `stream` method in the `AI` class accepts the same type of input as the `call` method, i.e., a list of `BaseMessage` objects.
   - Add error handling in the `stream` method to manage any exceptions that might occur during the streaming process.
4. **Review and Testing:**
5. **Deployment:**
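The plan above could be sketched roughly as follows. This is a minimal, self-contained sketch: `FakeChatModel` is a stand-in for langchain's `BaseChatModel` (whose streaming interface hasn't been confirmed at this point in the discussion), and the `AI` class here only approximates the real one in the repo.

```python
# Sketch only: FakeChatModel stands in for langchain's BaseChatModel,
# and this AI class approximates the real one in the repo.
from typing import Iterator, List


class FakeChatModel:
    """Stand-in model whose stream() yields chunks one at a time."""

    def stream(self, messages: List[str]) -> Iterator[str]:
        for token in ["Hello", ", ", "world", "!"]:
            yield token


class AI:
    def __init__(self, model: FakeChatModel) -> None:
        self.model = model

    def call(self, messages: List[str]) -> str:
        # Existing batch behavior: wait for the full completion.
        return "".join(self.model.stream(messages))

    def stream(self, messages: List[str]) -> Iterator[str]:
        # Proposed method: same input as call(), but yields chunks
        # as soon as the underlying model produces them.
        yield from self.model.stream(messages)


ai = AI(FakeChatModel())
print(list(ai.stream(["Hi"])))  # ['Hello', ', ', 'world', '!']
```

The key design point is that `stream` is a generator delegating to the model's own streaming iterator, so no buffering happens in the `AI` layer.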
Let's discuss this plan with the team to get their input and approval before proceeding with the implementation.
@devbuddytest Sounds good! I can confirm that `BaseChatModel` has a `stream` method that returns `Iterator[BaseMessageChunk]`.
Great! Since `BaseChatModel` already has a `stream` method that returns `Iterator[BaseMessageChunk]`, we can proceed with the implementation plan as outlined, focusing on integrating this streaming capability into the `AI` class. Here's the refined plan based on the confirmation:
1. **Add Stream Method to AI Class:** Introduce a new method named `stream` in the `AI` class. This method will leverage the `stream` method from `BaseChatModel` to yield `BaseMessageChunk` objects as they are generated.
2. **Method Signature and Implementation:** The `stream` method in the `AI` class should accept the same parameters as the existing `call` method to ensure consistency. Within the `stream` method, call the `stream` method of the `BaseChatModel` instance, passing the necessary parameters, and yield each `BaseMessageChunk` as it becomes available from `BaseChatModel.stream`.
3. **Error Handling:** Implement error handling within the `stream` method to gracefully manage any exceptions or errors that might occur during the streaming process.
4. **Unit Tests:** Write unit tests for the `stream` method to ensure it behaves as expected. This includes testing for correct streaming of message chunks, handling of errors, and proper termination of the stream.
5. **Documentation:** Update the documentation of the `AI` class to include information about the new `stream` method, its purpose, and how to use it.
6. **Review and Testing:**
7. **Deployment:**
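The unit-test step above could look like the sketch below. It assumes the `AI` class takes its chat model via the constructor (an assumption; the real wiring in this repo may differ) and uses a scripted fake model, which makes it easy to test chunk ordering, mid-stream error propagation, and termination without calling a real LLM.

```python
# Test sketch: ScriptedModel and this AI class are illustrative stand-ins,
# not the repo's real classes.
from typing import Iterator, List


class ScriptedModel:
    """Fake model that replays fixed chunks, optionally failing mid-stream."""

    def __init__(self, chunks: List[str], fail_after: int = -1) -> None:
        self.chunks = chunks
        self.fail_after = fail_after

    def stream(self, messages: List[str]) -> Iterator[str]:
        for i, chunk in enumerate(self.chunks):
            if i == self.fail_after:
                raise RuntimeError("model error mid-stream")
            yield chunk


class AI:
    def __init__(self, model) -> None:
        self.model = model

    def stream(self, messages: List[str]) -> Iterator[str]:
        yield from self.model.stream(messages)


def test_stream_yields_chunks_in_order():
    ai = AI(ScriptedModel(["a", "b", "c"]))
    assert list(ai.stream(["hi"])) == ["a", "b", "c"]


def test_stream_propagates_model_errors():
    ai = AI(ScriptedModel(["a", "b"], fail_after=1))
    gen = ai.stream(["hi"])
    assert next(gen) == "a"  # first chunk arrives before the failure
    try:
        next(gen)
        raise AssertionError("expected RuntimeError")
    except RuntimeError:
        pass


test_stream_yields_chunks_in_order()
test_stream_propagates_model_errors()
```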
I'll proceed with implementing this plan. Let's start by adding the `stream` method to the `AI` class.
@devbuddytest go for it
The pull request to enable response streaming in the AI class has been successfully created. You can review and merge the changes via the following link: Enable Response Streaming in AI Class Pull Request.
This pull request introduces a new `stream` method in the `AI` class, enabling the streaming of responses from the AI model. This method leverages the `stream` method of `BaseChatModel`, yielding `BaseMessageChunk` objects as they are generated, aimed at improving the user experience by providing real-time responses.
Currently, the AI class generates only batch completions, so we have to wait until the whole completion is generated before we can send it back to the user. A common way to improve UX is to stream generated tokens as soon as they are returned from the model. Let's add a new `stream` method in the `AI` class that calls `BaseChatModel.stream(...) -> Iterator[BaseMessageChunk]` under the hood.
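The batch-vs-streaming difference described here can be illustrated with a small stand-alone snippet (names are illustrative, not from the repo): with a batch `call`, nothing reaches the user until the whole completion exists, whereas a streaming consumer can forward each chunk immediately.

```python
# Illustration of batch vs. streaming consumption; model_stream is a
# stand-in for a real model's token iterator.
import time
from typing import Iterator


def model_stream() -> Iterator[str]:
    for token in ["Streaming ", "feels ", "faster."]:
        time.sleep(0.01)  # simulate per-token model latency
        yield token


# Batch: the user sees nothing until join() has consumed every token.
completion = "".join(model_stream())

# Streaming: each chunk can be forwarded as soon as it arrives.
received = []
for chunk in model_stream():
    received.append(chunk)  # in a server, send this chunk to the client here

assert "".join(received) == completion
print(completion)  # Streaming feels faster.
```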