artmoskvin opened 4 months ago
To implement response streaming, we will modify the `AI` class in the `autocoder/ai.py` file to add a new `stream` method that uses the `BaseChatModel.stream(...)` method from the `langchain` library. Here's a plan for the implementation:

1. Import the necessary types: `Iterator` from `typing` (if not already imported) and `BaseMessageChunk` from `langchain.schema`.
2. Add a `stream` method to the `AI` class that takes `messages: List[BaseMessage]` as input and calls the `self.model.stream(messages)` method.
3. Return the result of the `stream` call, which should be an iterator over `BaseMessageChunk` instances.
4. Update any necessary documentation or comments to reflect the new functionality.
5. Test the new `stream` method to ensure it works as expected with the `BaseChatModel.stream(...)` method.
Here's a rough sketch of how the new method might look:

```python
from typing import Iterator, List

from langchain.chat_models.base import BaseChatModel
from langchain.schema import BaseMessage, BaseMessageChunk

# print_system_msg and pprint_messages are existing helpers in autocoder.


class AI:
    def __init__(self, model: BaseChatModel):
        self.model = model

    def call(self, messages: List[BaseMessage]) -> str:
        print_system_msg(f"Calling AI with prompt:\n{pprint_messages(messages)}")
        return self.model(messages).content

    def stream(self, messages: List[BaseMessage]) -> Iterator[BaseMessageChunk]:
        print_system_msg(f"Streaming AI response for prompt:\n{pprint_messages(messages)}")
        return self.model.stream(messages)
```
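Assuming the sketch above, a caller would consume the returned iterator chunk by chunk. The snippet below uses a hypothetical `FakeModel` and `Chunk` stand-in (not part of `langchain` or `autocoder`) so the consumption pattern is runnable without an API key:

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Chunk:
    """Hypothetical stand-in for langchain's BaseMessageChunk."""
    content: str


class FakeModel:
    """Hypothetical stand-in for a BaseChatModel that streams a canned reply."""

    def stream(self, messages: List[str]) -> Iterator[Chunk]:
        for token in ["Hello", ", ", "world", "!"]:
            yield Chunk(content=token)


class AI:
    def __init__(self, model: FakeModel) -> None:
        self.model = model

    def stream(self, messages: List[str]) -> Iterator[Chunk]:
        return self.model.stream(messages)


ai = AI(FakeModel())
# Print each chunk as it arrives instead of waiting for the full completion.
for chunk in ai.stream(["Say hello"]):
    print(chunk.content, end="", flush=True)
print()
```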
This plan introduces a new method to the `AI` class that leverages the streaming capability of the underlying chat model. It's designed to improve user experience by providing immediate feedback as the AI generates responses.
Let's discuss this plan and make any necessary adjustments before proceeding with the implementation.
@devbuddytest great plan! let's do it!
The implementation for enabling response streaming has been completed. A new branch named `add-stream-method` was created, and the necessary changes were made to the `AI` class in the `autocoder/ai.py` file. A pull request has been opened for review: Add stream method to AI class #29.
Currently, the `AI` class generates only batch completions, so we have to wait until the whole completion is generated before we can send it back to the user. A common way to improve UX is to stream generated tokens as soon as they are returned from the model. Let's add a new `stream` method to the `AI` class that calls `BaseChatModel.stream(...) -> Iterator[BaseMessageChunk]` from `langchain` under the hood.
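The batch-versus-streaming difference described here can be sketched with a hypothetical stand-in model (no real `langchain` calls; `FakeChatModel` and its canned tokens are illustrative only):

```python
from typing import Iterator, List


class FakeChatModel:
    """Hypothetical stand-in for a BaseChatModel; a real model would call an LLM."""

    _tokens = ["The", " answer", " is", " 42", "."]

    def __call__(self, messages: List[str]) -> str:
        # Batch: the caller blocks until the whole completion is assembled.
        return "".join(self._tokens)

    def stream(self, messages: List[str]) -> Iterator[str]:
        # Streaming: each token is handed to the caller as soon as it exists.
        yield from self._tokens


model = FakeChatModel()
print(model(["question"]))  # batch: the full string, all at once
for token in model.stream(["question"]):
    print(token, end="", flush=True)  # streaming: incremental output
print()
```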