Improved handling for streaming content and tool calls at the same time

willbakst commented 4 months ago

Description

Currently, we would do something like this:

from typing import Generator

from mirascope.openai import (
    OpenAICall,
    OpenAICallParams,
    OpenAICallResponseChunk,
    OpenAIToolStream,
)

def print_book(title: str, author: str, description: str):
    """Returns the title, author, and description of a book as formatted text."""
    return f"Title: {title}\nAuthor: {author}\nDescription: {description}"

class BookRecommender(OpenAICall):
    prompt_template = "Please recommend some books to read."

    call_params = OpenAICallParams(tools=[print_book])

def regenerate(
    chunk: OpenAICallResponseChunk,
    stream: Generator[OpenAICallResponseChunk, None, None],
) -> Generator[OpenAICallResponseChunk, None, None]:
    yield chunk
    yield from stream

stream = BookRecommender().stream()
first_chunk = next(stream)
generator = regenerate(first_chunk, stream)
# Note: the `content` property is always a string, but `delta.content` can be None.
if first_chunk.delta.content is None:
    tool_stream = OpenAIToolStream.from_stream(generator)
    for tool in tool_stream:
        if tool:
            output = tool.fn(**tool.args)
            print(output)
    # > Title: The Name of the Wind\nAuthor: Patrick Rothfuss\nDescription: ...
    # > Title: Dune\nAuthor: Frank Herbert\nDescription: ...
else:
    for chunk in generator:
        print(chunk.content, end="", flush=True)
    # > I'd be happy to recommend some books! Here are a few options: ...

Instead, we should look into moving the regenerate logic into a convenient one-line helper that can be imported from mirascope.
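As a rough illustration of what such a helper could look like, here is a minimal sketch built on `itertools.chain`. The name `peek` is hypothetical and not part of mirascope; the example uses a plain list of strings as a stand-in for the chunk stream.

```python
from itertools import chain
from typing import Iterator, Tuple, TypeVar

T = TypeVar("T")


def peek(stream: Iterator[T]) -> Tuple[T, Iterator[T]]:
    """Return the first item and an iterator that replays it before the rest.

    Hypothetical helper, not part of mirascope. Raises StopIteration if the
    stream is empty.
    """
    first = next(stream)
    # chain() yields the saved first item, then continues with the original stream.
    return first, chain([first], stream)


# Stand-in for a chunk stream; real chunks would come from BookRecommender().stream().
first, replayed = peek(iter(["Hello", ", ", "world"]))
```

With something like this, the caller could inspect `first` to decide between the tool path and the text path, then hand `replayed` to whichever consumer applies, without writing a separate `regenerate` function.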

willbakst commented 4 months ago

This originally came out of #254.

willbakst commented 4 months ago

@off6atomic I'm laughing because I just ran into this myself in a separate side project. The above solution isn't terrible, but we can definitely clean this up.

off6atomic commented 4 months ago

I just created a more complex example that involves streaming, chat history, and agent behavior (repeatedly using tools until done). It might be useful as the basis for a streaming agent example.

This code creates an agent that has access to multiply and sqrt functions. It uses these tools, appends each response (either text or tool calls) to the chat history, and keeps generating tool calls until it reaches the final answer.

I also added a thin abstraction over your regenerate function, called stream_with_chunk_type, and another called create_tool_message. I'm not sure these are the right abstractions, though. If we haven't found the right ones yet, I think we can ship the functions to users for them to inspect and modify. The idea is similar to shadcn.

import pprint
from typing import Any, Generator

from mirascope.openai import (
    OpenAICall,
    OpenAICallParams,
    OpenAICallResponseChunk,
    OpenAIToolStream,
)
from openai.types.chat import (
    ChatCompletionMessageParam,
    ChatCompletionMessageToolCall,
    ChatCompletionToolMessageParam,
)

def regenerate(
    chunk: OpenAICallResponseChunk,
    stream: Generator[OpenAICallResponseChunk, None, None],
) -> Generator[OpenAICallResponseChunk, None, None]:
    yield chunk
    yield from stream

def stream_with_chunk_type(call: OpenAICall):
    """Call OpenAI and return the stream plus its type ("text" or "tool"), inferred from the first chunk."""
    stream = call.stream()
    first_chunk = next(stream)
    generator = regenerate(first_chunk, stream)
    is_text = first_chunk.delta.content is not None
    chunk_type = "text" if is_text else "tool"
    return generator, chunk_type

def create_tool_message(
    tool_call: ChatCompletionMessageToolCall, output: Any
) -> ChatCompletionToolMessageParam:
    return {
        "role": "tool",
        "content": str(output),
        "tool_call_id": tool_call.id,
        "name": tool_call.function.name,
    }

def multiply(a: float, b: float) -> float:
    """Multiplies two numbers."""
    return a * b

def sqrt(a: float) -> float:
    """Square root of a number."""
    return a**0.5

class Mathematician(OpenAICall):
    prompt_template = """
    SYSTEM: You are a mathematician. Help the user with their math problem.
    MESSAGES: {history}
    USER: {question}
    """

    history: list[ChatCompletionMessageParam] = []
    question: str = ""
    call_params = OpenAICallParams(model="gpt-3.5-turbo", tools=[multiply, sqrt])

def call_mathematician(mathematician: Mathematician):
    """Streams the Mathematician's response to stdout. The response can be tool
    calls or text tokens.
    This function doesn't mutate the Mathematician object's state, but it returns
    a list of messages that can be used to update the object's chat history.

    The only side effects of this function are calling the OpenAI API, calling
    tools, and printing the response to stdout.
    """
    print("Executing question:", repr(mathematician.question))
    generator, chunk_type = stream_with_chunk_type(mathematician)
    print("Streaming response with type:", chunk_type)

    messages = []  # prepare messages to be appended to the chat history
    if mathematician.question:
        messages.append({"role": "user", "content": mathematician.question})

    if chunk_type == "tool":
        tool_calls = []
        tool_output_messages = []
        tool_stream = OpenAIToolStream.from_stream(generator)
        for tool in tool_stream:
            if tool:
                print("Calling tool:", tool.tool_call.function)
                output = tool.fn(**tool.args)
                print("Tool output:", output)
                tool_calls.append(tool.tool_call.model_dump())
                tool_output_messages.append(create_tool_message(tool.tool_call, output))

        messages.append(
            {
                "role": "assistant",
                "tool_calls": tool_calls,
            }
        )
        messages += tool_output_messages

    else:  # text stream
        chunks = []
        print("Response: ", end="", flush=True)
        for chunk in generator:
            print(chunk.content, end="", flush=True)
            chunks.append(chunk.content)
        print()

        messages.append(
            {
                "role": "assistant",
                "content": "".join(chunks),
            }
        )

    print("Messages:")
    pprint.pp(messages)
    print()

    return messages

mathematician = Mathematician(question="What is sqrt(1674) x sqrt(6931) x sqrt(1234) ?")
# The correct answer is 119655.663

messages = call_mathematician(mathematician)
mathematician.history += messages

# if the last message is a tool, the chatbot needs to keep going
while mathematician.history[-1]["role"] == "tool":
    mathematician.question = ""
    messages = call_mathematician(mathematician)
    mathematician.history += messages

mathematician.question = "Explain how you arrived at the answer in detail."
messages = call_mathematician(mathematician)
mathematician.history += messages

Here is the output:

Executing question: 'What is sqrt(1674) x sqrt(6931) x sqrt(1234) ?'
Streaming response with type: tool
Calling tool: Function(arguments='{"a": 1674}', name='Sqrt')
Tool output: 40.91454509095757
Calling tool: Function(arguments='{"a": 6931}', name='Sqrt')
Tool output: 83.25262758616091
Calling tool: Function(arguments='{"a": 1234}', name='Sqrt')
Tool output: 35.12833614050059
Messages:
[ {'role': 'user', 'content': 'What is sqrt(1674) x sqrt(6931) x sqrt(1234) ?'},
  { 'role': 'assistant',
    'tool_calls': [ { 'id': 'call_L4Q0HfGYc6dXyHYsIyHltoxn',
                      'function': {'arguments': '{"a": 1674}', 'name': 'Sqrt'},
                      'type': 'function'},
                    { 'id': 'call_77UKaj9CXJr2lw91PSmcyNLP',
                      'function': {'arguments': '{"a": 6931}', 'name': 'Sqrt'},
                      'type': 'function'},
                    { 'id': 'call_e3tqv0wG2ydrpDS8caq0EXil',
                      'function': {'arguments': '{"a": 1234}', 'name': 'Sqrt'},
                      'type': 'function'}]},
  { 'role': 'tool',
    'content': '40.91454509095757',
    'tool_call_id': 'call_L4Q0HfGYc6dXyHYsIyHltoxn',
    'name': 'Sqrt'},
  { 'role': 'tool',
    'content': '83.25262758616091',
    'tool_call_id': 'call_77UKaj9CXJr2lw91PSmcyNLP',
    'name': 'Sqrt'},
  { 'role': 'tool',
    'content': '35.12833614050059',
    'tool_call_id': 'call_e3tqv0wG2ydrpDS8caq0EXil',
    'name': 'Sqrt'}]

Executing question: ''
Streaming response with type: tool
Calling tool: Function(arguments='{"a":40.91454509095757,"b":83.25262758616091}', name='Multiply')
Tool output: 3406.2433853146786
Messages:
[ { 'role': 'assistant',
    'tool_calls': [ { 'id': 'call_arAaqMBcthgnGxzuZfj2R8qF',
                      'function': { 'arguments': '{"a":40.91454509095757,"b":83.25262758616091}',
                                    'name': 'Multiply'},
                      'type': 'function'}]},
  { 'role': 'tool',
    'content': '3406.2433853146786',
    'tool_call_id': 'call_arAaqMBcthgnGxzuZfj2R8qF',
    'name': 'Multiply'}]

Executing question: ''
Streaming response with type: tool
Calling tool: Function(arguments='{"a":3406.2433853146786,"b":35.12833614050059}', name='Multiply')
Tool output: 119655.66261569072
Messages:
[ { 'role': 'assistant',
    'tool_calls': [ { 'id': 'call_CkwUZJkzWkbihHSHNLr1VXDi',
                      'function': { 'arguments': '{"a":3406.2433853146786,"b":35.12833614050059}',
                                    'name': 'Multiply'},
                      'type': 'function'}]},
  { 'role': 'tool',
    'content': '119655.66261569072',
    'tool_call_id': 'call_CkwUZJkzWkbihHSHNLr1VXDi',
    'name': 'Multiply'}]

Executing question: ''
Streaming response with type: text
Response: The result of \( \sqrt{1674} \times \sqrt{6931} \times \sqrt{1234} \) is approximately 119655.66
Messages:
[ { 'role': 'assistant',
    'content': 'The result of \\( \\sqrt{1674} \\times \\sqrt{6931} \\times '
               '\\sqrt{1234} \\) is approximately 119655.66'}]

Executing question: 'Explain how you arrived at the answer in detail.'
Streaming response with type: text
Response: To find the result of \( \sqrt{1674} \times \sqrt{6931} \times \sqrt{1234} \), we first calculated the square roots of each of the given numbers individually:
1. \( \sqrt{1674} \) is approximately 40.9145
2. \( \sqrt{6931} \) is approximately 83.2526
3. \( \sqrt{1234} \) is approximately 35.1283

Next, we multiplied these square roots together:
\( 40.9145 \times 83.2526 \times 35.1283 \approx 119655.66 \)

Therefore, the result of \( \sqrt{1674} \times \sqrt{6931} \times \sqrt{1234} \) is approximately 119655.66.
Messages:
[ { 'role': 'user',
    'content': 'Explain how you arrived at the answer in detail.'},
  { 'role': 'assistant',
    'content': 'To find the result of \\( \\sqrt{1674} \\times \\sqrt{6931} '
               '\\times \\sqrt{1234} \\), we first calculated the square roots '
               'of each of the given numbers individually:\n'
               '1. \\( \\sqrt{1674} \\) is approximately 40.9145\n'
               '2. \\( \\sqrt{6931} \\) is approximately 83.2526\n'
               '3. \\( \\sqrt{1234} \\) is approximately 35.1283\n'
               '\n'
               'Next, we multiplied these square roots together:\n'
               '\\( 40.9145 \\times 83.2526 \\times 35.1283 \\approx 119655.66 '
               '\\)\n'
               '\n'
               'Therefore, the result of \\( \\sqrt{1674} \\times \\sqrt{6931} '
               '\\times \\sqrt{1234} \\) is approximately 119655.66.'}]

willbakst commented 4 months ago

This is awesome! Definitely want to include this as an example once we implement the improvements here and for inserting tool calls :)

Once we have those merged in, would you be up for submitting a PR with this example post cleanup?

Happy to include it myself if you'd prefer as part of the other changes.

willbakst commented 4 months ago

Another thing worth thinking about here:

The call_mathematician function could become a method of the Mathematician class, so that you can call something like mathematician.help(problem="..."). That may make it more reusable elsewhere in your codebase and lets it take advantage of colocation and versioning.
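A minimal sketch of that shape, assuming a stand-in `Agent` dataclass rather than a real mirascope class: the `step` callable plays the role of call_mathematician, and `fake_step` is a hypothetical fake model used only to make the example runnable. The history is simplified (a real tool message would follow an assistant message carrying the tool calls).

```python
from dataclasses import dataclass, field
from typing import Callable

Message = dict[str, str]
# A step takes the question and the history so far and returns new messages.
Step = Callable[[str, list[Message]], list[Message]]


@dataclass
class Agent:
    """Hypothetical stand-in for Mathematician; not a mirascope class."""

    step: Step
    history: list[Message] = field(default_factory=list)

    def help(self, problem: str) -> str:
        """Run steps until the last message is no longer a tool result."""
        self.history += self.step(problem, self.history)
        while self.history[-1]["role"] == "tool":
            # Empty question: the model continues from the tool output.
            self.history += self.step("", self.history)
        return self.history[-1]["content"]


def fake_step(question: str, history: list[Message]) -> list[Message]:
    """Fake model: one tool round for a new question, then a text answer."""
    if question:
        return [
            {"role": "user", "content": question},
            {"role": "tool", "content": "4.0"},
        ]
    return [{"role": "assistant", "content": f"The answer is {history[-1]['content']}"}]


agent = Agent(step=fake_step)
answer = agent.help(problem="What is sqrt(16)?")
```

Keeping the loop inside the method means callers never have to check the role of the last message themselves; they only see the final text answer and the accumulated `history`.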

off6atomic commented 4 months ago

@willbakst I'm happy for you to create a PR. Just tag me in it. That's all.

As for moving the function into a method, that sounds good too.

Mirascope / mirascope

Improved handling for streaming content and tool calls at the same time #262