artmoskvin opened 4 months ago
To implement response streaming, we will modify the `AI` class in the `autocoder/ai.py` file to add a new `stream` method that uses the `BaseChatModel.stream(...)` method from the `langchain` library. Here's a plan for the implementation:

1. Import the necessary types: `Iterator` from `typing` (if not already imported) and `BaseMessageChunk` from `langchain.schema`.
2. Add a `stream` method to the `AI` class that takes `messages: List[BaseMessage]` as input and calls the `self.model.stream(messages)` method.
3. Return the result of the `stream` call, which should be an iterator over `BaseMessageChunk` instances.
4. Update any necessary documentation or comments to reflect the new functionality.
5. Test the new `stream` method to ensure it works as expected with the `BaseChatModel.stream(...)` method.
Here's a rough sketch of how the new method might look:

```python
from typing import Iterator, List

from langchain.chat_models.base import BaseChatModel
from langchain.schema import BaseMessage, BaseMessageChunk

# print_system_msg and pprint_messages are existing helpers in autocoder.


class AI:
    def __init__(self, model: BaseChatModel):
        self.model = model

    def call(self, messages: List[BaseMessage]) -> str:
        print_system_msg(f"Calling AI with prompt:\n{pprint_messages(messages)}")
        return self.model(messages).content

    def stream(self, messages: List[BaseMessage]) -> Iterator[BaseMessageChunk]:
        print_system_msg(f"Streaming AI response for prompt:\n{pprint_messages(messages)}")
        return self.model.stream(messages)
```
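Assuming the sketch above, a caller would consume the returned iterator chunk by chunk. The snippet below uses a hypothetical `FakeModel` and `Chunk` stand-in (not part of `langchain` or `autocoder`) so the consumption pattern is runnable without an API key:

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Chunk:
    """Hypothetical stand-in for langchain's BaseMessageChunk."""
    content: str


class FakeModel:
    """Hypothetical stand-in for a BaseChatModel that streams a canned reply."""

    def stream(self, messages: List[str]) -> Iterator[Chunk]:
        for token in ["Hello", ", ", "world", "!"]:
            yield Chunk(content=token)


class AI:
    def __init__(self, model: FakeModel) -> None:
        self.model = model

    def stream(self, messages: List[str]) -> Iterator[Chunk]:
        return self.model.stream(messages)


ai = AI(FakeModel())
# Print each chunk as it arrives instead of waiting for the full completion.
for chunk in ai.stream(["Say hello"]):
    print(chunk.content, end="", flush=True)
print()
```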
This plan introduces a new method to the `AI` class that leverages the streaming capability of the underlying chat model. It's designed to improve user experience by providing immediate feedback as the AI generates responses.
Let's discuss this plan and make any necessary adjustments before proceeding with the implementation.
@devbuddytest great plan! let's do it!
The implementation for enabling response streaming has been completed. A new branch named `add-stream-method` was created, and the necessary changes were made to the `AI` class in the `autocoder/ai.py` file. A pull request has been opened for review: Add stream method to AI class #29.
Currently, the `AI` class generates only batch completions, so we have to wait until the whole completion is generated before we can send it back to the user. A common way to improve UX is to stream generated tokens as soon as they are returned from the model. Let's add a new `stream` method to the `AI` class that calls `BaseChatModel.stream(...) -> Iterator[BaseMessageChunk]` from `langchain` under the hood.
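The batch-versus-streaming difference described here can be sketched with a hypothetical stand-in model (no real `langchain` calls; `FakeChatModel` and its canned tokens are illustrative only):

```python
from typing import Iterator, List


class FakeChatModel:
    """Hypothetical stand-in for a BaseChatModel; a real model would call an LLM."""

    _tokens = ["The", " answer", " is", " 42", "."]

    def __call__(self, messages: List[str]) -> str:
        # Batch: the caller blocks until the whole completion is assembled.
        return "".join(self._tokens)

    def stream(self, messages: List[str]) -> Iterator[str]:
        # Streaming: each token is handed to the caller as soon as it exists.
        yield from self._tokens


model = FakeChatModel()
print(model(["question"]))  # batch: the full string, all at once
for token in model.stream(["question"]):
    print(token, end="", flush=True)  # streaming: incremental output
print()
```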