willbakst opened 1 month ago
To me, a major point of Mirascope is not having to care at all about what the providers accept in their own SDKs. So I'd ask: what experience do you want your developers to have, completely irrespective of what the provider SDKs do?

Obviously `response.message_param` is an improvement over having to do `cast(OpenAIChatMessageParam, response.message.model_dump())`, but I'd still argue that it puts the burden on me as your end user to know about and understand the difference between a `message` and a `message_param` (which I frankly still don't fully understand, other than the fact that one is a pydantic model and the other a typed dict). So I'd personally prefer to just be able to throw messages around and not worry about this distinction, unless you can articulate a reason that you would want your end users to have to think about it. (I could totally imagine `response.message_param` still being useful as an escape hatch for when you need to deal with the provider libs directly, but I'd imagine that just using `response.message` would be a clearer default for most people.)
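For context, the distinction being debated mirrors the OpenAI SDK's split between a rich response model and a plain-dict request param. A minimal, stdlib-only sketch of that shape (the names `Message`, `MessageParam`, and `to_param` are illustrative stand-ins, not the actual SDK or Mirascope types):

```python
from dataclasses import dataclass
from typing import TypedDict


class MessageParam(TypedDict):
    # request-side shape: a plain dict the API accepts as input
    role: str
    content: str


@dataclass
class Message:
    # response-side shape: a rich object the SDK returns
    role: str
    content: str

    def to_param(self) -> MessageParam:
        # the conversion end users currently have to do by hand
        return {"role": self.role, "content": self.content}


reply = Message(role="assistant", content="hello! how can I help?")
history: list[MessageParam] = [
    {"role": "user", "content": "hi"},
    reply.to_param(),
]
print(history[1]["role"])  # assistant
```

The complaint above is that the user shouldn't need to know which of these two shapes they are holding at any given moment.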
Consider:
```python
from mirascope.openai import OpenAICall
from openai.types.chat import ChatCompletionMessageParam


class MyCall(OpenAICall):
    history: list[ChatCompletionMessageParam]

    def messages(self) -> list[ChatCompletionMessageParam]:
        return [*self.history, {"role": "user", "content": "hello"}]


my_call = MyCall()
response = my_call.call()
print(response.content)
my_call.history += [response.user_message_param, response.message_param]
```
This makes sense to me. When using the raw OpenAI SDK, if I try to use `completion.choices[0].message` I'll get a type error because it doesn't match `ChatCompletionMessageParam`.

And if I don't type `messages: list[ChatCompletionMessageParam]`, I get a type error on `create`.

Should Mirascope make it possible to do things without type errors that would be a type error if using the raw provider SDK?
If we do something here, it would have to be strictly limited to `MESSAGES` injection when using a `prompt_template`. If you're using the `prompt_template`, though, then it looks cleaner because you're bought in and there are no errors / it works:
```python
from mirascope.openai import OpenAICall, OpenAIChatMessage  # covers input and output messages


class MyCall(OpenAICall):
    prompt_template = """
    MESSAGES: {history}
    USER: hello
    """

    history: list[OpenAIChatMessage]


my_call = MyCall()
response = my_call.call()
print(response.content)
my_call.history += [response.user_message_param, response.message]
```
The documentation would have to be extremely clear about what's happening. If you're writing your own messages, you would still have to use `ChatCompletionMessageParam` for both, as the below won't work:
```python
from mirascope.openai import OpenAICall, OpenAIChatMessage  # covers input and output messages
from openai.types.chat import ChatCompletionMessageParam


class MyCall(OpenAICall):
    history: list[OpenAIChatMessage]

    def messages(self) -> list[ChatCompletionMessageParam]:
        # this will throw type errors because history doesn't match
        return [*self.history, {"role": "user", "content": "hello"}]


my_call = MyCall()
response = my_call.call()
print(response.content)
my_call.history += [response.user_message_param, response.message]
```
If every single provider supported inserting the wrong type into their create method, then I could justify updating the `messages` return type here to `OpenAIChatMessage`, but I'm not sure that's the case. I would have to test each provider.

If not, then you couldn't use `my_call.messages()` as the input to a raw provider SDK call without getting a type error or an actual error.
> Should Mirascope make it possible to do things without type errors that would be a type error if using the raw provider SDK?

Yes, absolutely.
If I wanted to deal with the idiosyncrasies of the OpenAI SDK, I would just use the OpenAI SDK. I feel like the whole point here is for you to make using Mirascope as clean and easy as possible: design the best possible interface for your users completely independent of what the provider SDKs do, and then handle mapping that to the provider SDKs under the hood.
And you're OK that history will only work this way if you use `prompt_template` but won't be supported if you write your own messages? For the reasons above.
Re: your example:
```python
from mirascope.openai import OpenAICall
from openai.types.chat import ChatCompletionMessageParam


class MyCall(OpenAICall):
    history: list[ChatCompletionMessageParam]

    def messages(self) -> list[ChatCompletionMessageParam]:
        return [*self.history, {"role": "user", "content": "hello"}]


my_call = MyCall()
response = my_call.call()
print(response.content)
my_call.history += [response.user_message_param, response.message_param]
```
This honestly feels kinda gross to me. Some responses may follow this turn-based pattern of user message -> assistant message, but many won't. So now I need to understand that `response.user_message` is only present some of the time, depending on the internal structure of the call. This feels like a very leaky abstraction to me.

I don't know what the proper solution is here, but I feel like for something like this chatbot use case, a cleaner pattern would be something along the lines of:
```python
from mirascope.openai import OpenAICall
from mirascope.openai.chat import ChatMessage, UserMessage


class Chatbot(OpenAICall):
    prompt_template = """
    SYSTEM: You are a friendly chatbot.
    MESSAGES: {history}
    USER: {_input}
    """

    history: list[ChatMessage]
    _input: str

    def chat(self, input: str) -> str:
        # you already know i don't like this pattern, but not trying to include
        # the scope of that issue in this one
        self._input = input
        response = self.call()
        self.history += [UserMessage(content=input), response.message]
        return response.message.content


chatbot = Chatbot()
response_text = chatbot.chat(input="hello")
print(response_text)
```
> And you're OK that history will only work this way if you use `prompt_template` but won't be supported if you write your own messages? For the reasons above.

No. Similarly, I would expect your messages handling to essentially be `ChatCompletionMessageParam | ChatMessage` and handle conversion internally.

E.g. everywhere that I deal with messages in Mirascope, I can just use the `ChatMessage` type (or `ChatCompletionMessageParam` as an escape hatch, in case I already have one of those for some reason), and then you handle it for me internally.
> `from mirascope.openai.chat import ChatMessage, UserMessage`

I really don't like this. If anything it would be more like:
```python
from mirascope.openai import OpenAICall, OpenAIChatMessage


class Chatbot(OpenAICall):
    prompt_template = """
    SYSTEM: You are a friendly chatbot.
    MESSAGES: {history}
    USER: {_input}
    """

    history: list[OpenAIChatMessage]
    _input: str

    def chat(self, input: str) -> str:
        # you already know i don't like this pattern, but not trying to include
        # the scope of that issue in this one
        self._input = input
        response = self.call()
        # since you don't like `user_message_param`
        self.history += [{"role": "user", "content": input}, response.message]
        return response.content


chatbot = Chatbot()
response_text = chatbot.chat(input="hello")
print(response_text)
```
Note: `OpenAIChatMessage` would be a composite type of `TypedDict` and `BaseModel`, and thus not directly instantiable, requiring `response.user_message_param` or a self-constructed dict that will get coerced.
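One way to read "composite type" here is a union alias over the dict form and the model form, with coercion at the boundary. A hypothetical stdlib-only sketch (`ChatMessageDict`, `ChatMessageModel`, and `coerce` are invented stand-ins, not actual Mirascope or OpenAI names):

```python
from dataclasses import dataclass
from typing import TypedDict, Union


class ChatMessageDict(TypedDict):
    role: str
    content: str


@dataclass
class ChatMessageModel:
    role: str
    content: str


# The alias is a union, so it cannot be instantiated directly; callers
# supply either a plain dict or an already-built model instance.
OpenAIChatMessage = Union[ChatMessageDict, ChatMessageModel]


def coerce(message: OpenAIChatMessage) -> ChatMessageDict:
    # boundary coercion: models become dicts, dicts pass through
    if isinstance(message, ChatMessageModel):
        return {"role": message.role, "content": message.content}
    return message


history: list[OpenAIChatMessage] = [
    {"role": "user", "content": "hello"},
    ChatMessageModel(role="assistant", content="hi there"),
]
print([coerce(m) for m in history])
```

Under this reading, appending either a dict or a response message to `history` type-checks, and normalization happens once, internally.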
This we could do, and we could do it for every provider, because inside of the `messages` method we will coerce into the right type.

What I meant about only working when using `prompt_template` is that I'm struggling to reconcile when writing your own messages array:
```python
from mirascope.openai import OpenAICall, OpenAIChatMessage


class Chatbot(OpenAICall):
    history: list[OpenAIChatMessage]
    _input: str

    def messages(self) -> list[OpenAIChatMessage]:
        # this might break things, unknown, depends on provider?
        return [
            {"role": "system", "content": "You are a friendly chatbot"},
            *self.history,
            {"role": "user", "content": self._input},
        ]

    def chat(self, input: str) -> str:
        # you already know i don't like this pattern, but not trying to include
        # the scope of that issue in this one
        self._input = input
        response = self.call()
        # since you don't like `user_message_param`
        self.history += [{"role": "user", "content": input}, response.message]
        return response.content


chatbot = Chatbot()
response_text = chatbot.chat(input="hello")
print(response_text)
```
At this point you've taken back control and are choosing not to take advantage of certain conveniences. But since you're writing the messages yourself, the coercion is no longer happening, and we internally need the return type of `messages` to be `ChatCompletionMessageParam` to match the SDK; otherwise `my_call.messages()` will throw errors when used with the raw SDK, which I don't want.
Would something like the following work in the case that you're writing your own messages?
```python
from mirascope.openai import OpenAICall, OpenAIChatMessage, convert_to_openai_message_param
from openai.types.chat import ChatCompletionMessageParam


class Chatbot(OpenAICall):
    history: list[OpenAIChatMessage]
    _input: str

    def messages(self) -> list[ChatCompletionMessageParam]:
        return [
            {"role": "system", "content": "You are a friendly chatbot"},
            *convert_to_openai_message_param(self.history),
            {"role": "user", "content": self._input},
        ]

    def chat(self, input: str) -> str:
        # you already know i don't like this pattern, but not trying to include
        # the scope of that issue in this one
        self._input = input
        response = self.call()
        # since you don't like `user_message_param`
        self.history += [{"role": "user", "content": input}, response.message]
        return response.content


chatbot = Chatbot()
response_text = chatbot.chat(input="hello")
print(response_text)
```
Worth noting that in my head, `convert_to_openai_message_param` would just be the function we use internally for the coercion when you don't write your own messages array.
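To make the proposal concrete, such a coercion helper could be sketched as a list-level normalizer. This is a hedged illustration with invented stand-in types (`MessageParamStub`, `ResponseMessageStub`), not the actual implementation:

```python
from dataclasses import dataclass
from typing import Sequence, TypedDict, Union


class MessageParamStub(TypedDict):
    # stand-in for ChatCompletionMessageParam
    role: str
    content: str


@dataclass
class ResponseMessageStub:
    # stand-in for the SDK's response message model
    role: str
    content: str


def convert_to_openai_message_param(
    messages: Sequence[Union[MessageParamStub, ResponseMessageStub]],
) -> list[MessageParamStub]:
    # coerce model-style messages into the TypedDict shape the SDK
    # accepts; plain dicts pass through untouched
    return [
        {"role": m.role, "content": m.content}
        if isinstance(m, ResponseMessageStub)
        else m
        for m in messages
    ]


mixed = [
    {"role": "user", "content": "hello"},
    ResponseMessageStub(role="assistant", content="hi!"),
]
print(convert_to_openai_message_param(mixed))
```

Used internally for `MESSAGES` injection and exported for hand-written `messages()` methods, the same function would cover both paths discussed above.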
@jbbakst I'm feeling less sold on this having implemented the `message_param`, given that using `message_param` matches the typing of the SDK (even though it technically accepts the response message type) and provides a common interface across providers where sometimes you can't pass the message in directly.

Originally posted by @willbakst in https://github.com/Mirascope/mirascope/issues/264#issuecomment-2153663959