openai / openai-python

The official Python library for the OpenAI API
https://pypi.org/project/openai/
Apache License 2.0
21.93k stars 3.02k forks

Fix typing errors from mypy when calling completion #911

Closed achrome closed 9 months ago

achrome commented 9 months ago

Confirm this is an issue with the Python library and not an underlying OpenAI API

Describe the bug

Mypy doesn't seem to play very well with the type annotations that the SDK has generated. I can see that OpenAI uses Stainless and that might be the issue in itself, but as it stands the type annotations aren't very well documented.

See below for the exact use case and how to reproduce.

I'd propose either updating the annotations to be more semantically correct, or updating the documentation and usage if the idea is to keep the annotations more strict.

Option 1 seems like a better option until a full migration to stricter types can be reached, likely in a major version.

To Reproduce

When calling completion like so

from openai import AsyncOpenAI

client = AsyncOpenAI(api_key=<API_KEY>)
messages = [
    {"role": "user", "content": "some test message"}
]
completion = await client.chat.completions.create(
    model="gpt-3.5-turbo-0613",  # Actual model doesn't matter here
    messages=messages  # Mypy shows a type mismatch error here
)

I did a little bit of digging and found that the messages arg in chat.completions.create is annotated as ChatCompletionMessageParam, however the method is smart enough to take and serialize plain dicts as well. I haven't seen the rest of the endpoints, but my guess is that this annotation issue is present throughout.

I can get around it by using the appropriate types instead. So, the following works

from openai.types.chat import ChatCompletionMessageParam, ChatCompletionUserMessageParam
messages: list[ChatCompletionMessageParam] = [
    ChatCompletionUserMessageParam(role="user", content="some test message")
]
# This seems to satisfy mypy

Code snippets

No response

OS

Linux

Python version

3.12

Library version

v1.3.6

RobertCraigie commented 9 months ago

This is an unfortunate limitation of type inference in the current state of Python's static typing.

If you don't inline the messages param, you need to give it an explicit type:

from openai.types.chat import ChatCompletionMessageParam

messages: list[ChatCompletionMessageParam] = [
    {
        "role": "user",
        "content": "Say this is a test",
    },
]

You don't need to actually instantiate the type like in your example unless you really want to.
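To illustrate why instantiation is optional: a minimal sketch (using a local stand-in TypedDict rather than the SDK's own type, so it is self-contained) showing that "calling" a TypedDict and writing a plain dict literal produce the same runtime value — TypedDicts exist only for static checkers.

```python
from typing import Literal, TypedDict

# Stand-in for ChatCompletionUserMessageParam; the real type lives in openai.types.chat.
class UserMessageParam(TypedDict):
    role: Literal["user"]
    content: str

constructed = UserMessageParam(role="user", content="some test message")
literal = {"role": "user", "content": "some test message"}

print(constructed == literal)  # prints True: identical dicts at runtime
```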

RobertCraigie commented 9 months ago

but as it stands the type annotations aren't very well documented.

What more documentation would you want to see?

I'd propose either updating the annotations to be more semantically correct

What do you mean by semantically correct? If we updated the type annotations to accept any arbitrary dictionary then all of our types would be essentially meaningless.

nathan-chappell commented 8 months ago

@RobertCraigie Python's typing might be limiting to some extent, but the fact is that you guys automatically generated your client library with some tool that did a pretty good, but not perfect, job.

Let's suppose I want to use the openai types to create a list of messages. Up until the new library I was using a definition like so:

class MessageDict(TypedDict):
    role: Literal["system", "user", "assistant"]
    content: str

It doesn't handle every possible way to access the api, but it's been good enough (usually if I'm doing function calling there is a lot more ceremony involved anyways).

I would like to use such a class as follows:

completion = await client.chat.completions.create(
    messages=[MessageDict(role=r, content=c) for r,c in some_dict.items()],
    ...
)

As of right now the type checkers don't like this. Fine, how can I do it with the types you've provided us?

class ChatCompletionFunctionMessageParam(TypedDict, total=False):
    content: Required[Optional[str]]
    name: Required[str]
    role: Required[Literal["function"]]

class ChatCompletionToolMessageParam(TypedDict, total=False):
    content: Required[Optional[str]]
    role: Required[Literal["tool"]]
    tool_call_id: Required[str]

class ChatCompletionAssistantMessageParam(TypedDict, total=False):
    content: Required[Optional[str]]
    role: Required[Literal["assistant"]]
    function_call: FunctionCall
    tool_calls: List[ChatCompletionMessageToolCallParam]

class ChatCompletionUserMessageParam(TypedDict, total=False):
    content: Required[Union[str, List[ChatCompletionContentPartParam], None]]
    role: Required[Literal["user"]]

class ChatCompletionSystemMessageParam(TypedDict, total=False):
    content: Required[Optional[str]]
    role: Required[Literal["system"]]

# openai.types.chat

ChatCompletionMessageParam = Union[
    ChatCompletionSystemMessageParam,
    ChatCompletionUserMessageParam,
    ChatCompletionAssistantMessageParam,
    ChatCompletionToolMessageParam,
    ChatCompletionFunctionMessageParam,
]

# Since this is a union type and not a base class, it can't be used to create dicts like all the other types here can.

Listen man, this is an ugly mess. It's pretty good for automatically generated, and I'd certainly be willing to deal if it was somewhat convenient, but no man, it's not:

What's funny is that the MessageDict could essentially serve as a base class to all those other TypedDict classes, but the Union type is generated, so it can't be used for creating objects.
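The point that the Union can't be used for creating objects can be checked directly. A self-contained sketch with stand-in TypedDicts (mirroring the shape of the SDK types above) showing that calling a Union fails at runtime:

```python
from typing import Literal, TypedDict, Union

class SystemMessageParam(TypedDict):
    role: Literal["system"]
    content: str

class UserMessageParam(TypedDict):
    role: Literal["user"]
    content: str

# Stand-in for ChatCompletionMessageParam.
MessageParam = Union[SystemMessageParam, UserMessageParam]

try:
    MessageParam(role="user", content="hi")  # type: ignore
    raised = False
except TypeError:
    raised = True

print(raised)  # True: a typing.Union cannot be called to construct a value
```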

TLDR: whatever

The new client is a huge improvement on that old one, it almost feels like modern python. But it's not perfect, so please don't act like it is. Oh yea, and the documentation I would like to see is an OpenAI python-programmer using the types from the new library effectively.

antont commented 8 months ago

@nathan-chappell Hm I don't quite get this.

Am doing:


messages: list[ChatCompletionMessageParam] #I get this as param to a function and then add more messages to it

messages.extend([
    system_message("You are a helpful assistant."),
    user_message(question)
])

completion = await client.chat.completions.create(
    model="gpt-4",
    messages=messages
)

where I'm using a couple of factory funcs to make the message instance creation simpler and more robust

def system_message(content: str):
    return ChatCompletionSystemMessageParam(
        content=content,
        role="system"
    )

def user_message(content: str):
    return ChatCompletionUserMessageParam(
        content=content,
        role="user"
    )

And at least in VSCode (with, I guess, Pyright?) the type checker is happy.

Is this perhaps a limitation of mypy? I think the author mentioned that in some discussion here.

Or did I miss something? I only gave it a quick read so far, and figured it might be useful to share what I'm doing.

RobertCraigie commented 8 months ago

Your first method snippet will not work regardless of what our types are; it's entirely dependent on how you've typed some_dict.

Listen man, this is an ugly mess. It's pretty good for automatically generated, and I'd certainly be willing to deal if it was somewhat convenient, but no man, it's not:

I'm not sure where you're coming from here - why do you want to explicitly instantiate an object for the params?

All of this goes away if you in-line the dictionary (assuming some_dict is typed properly):

completion = await client.chat.completions.create(
    messages=[{'role': r, 'content': c} for r, c in some_dict.items()],
    ...
)

What's funny is that the MessageDict could essentially serve as a base class to all those other TypedDict classes, but the Union type is generated, so it can't be used for creating objects.

Are you suggesting that we update the params type to just be your MessageDict type? That will just result in less type-safety, as it would have to include all of the properties from the actual message types and let you represent states that are not valid.

So even if we extracted some properties into a common base class the params type should still be a Union so that it represents the actual types that the API expects. Which then makes the MessageDict base class entirely moot because you wouldn't be able to assign it to our union type.

Then additionally, using that as a base class is potentially unsound which is why Pyright will report an error in strict mode for this definition:

class BaseMessageParam(TypedDict):
    role: Literal["system", "user", "assistant"]
    content: str

class AssistantMessageParam(BaseMessageParam):
    role: Literal['assistant']
    # ...

# Type of TypedDict field "role" cannot be redefined
#    Type in parent class is "Literal['system', 'user', 'assistant']" and type in child class is "Literal['assistant']"

I found your comment a bit rude; some parts did not contribute to a productive discussion.

nathan-chappell commented 8 months ago

@nathan-chappell Hm I don't quite get this.

Am doing:

messages: list[ChatCompletionMessageParam] #I get this as param to a function and then add more messages to it

messages.extend([
    system_message("You are a helpful assistant."),
    user_message(question)
])

completion = await client.chat.completions.create(
    model="gpt-4",
    messages=messages
)

where I'm using a couple of factory funcs to make the message instance creation simpler and more robust

def system_message(content: str):
    return ChatCompletionSystemMessageParam(
        content=content,
        role="system"
    )

def user_message(content: str):
    return ChatCompletionUserMessageParam(
        content=content,
        role="user"
    )

And at least in VSCode (with, I guess, Pyright?) the type checker is happy.

Is this perhaps a limitation of mypy? I think the author mentioned that in some discussion here.

Or did I miss something? I only gave it a quick read so far, and figured it might be useful to share what I'm doing.

This is what I refer to as making a bunch of aliases (somewhat ugly, adds complexity). See, the thing is that probably everyone would like to make these factory functions, but if that's the case then why isn't this just provided by the client library? Also, why would you have to provide the role parameter to the class ChatCompletionSystemMessageParam? There is one reasonable choice, and it should be filled in automatically if I'm putting System in the damn class name. But let's look at the class definition:

class ChatCompletionSystemMessageParam(TypedDict, total=False):
    content: Required[Optional[str]]
    role: Required[Literal["system"]]

Okay, can't put a default value in there because it's a TypedDict. At this point as library writers we have two options: say screw it, user's problem, or come up with a better design. I don't know if this is some sort of impossible thing to do because the API is too overloaded, but right now the current design is not tuned towards convenience.

nathan-chappell commented 8 months ago

@RobertCraigie I apologize for being rude. I was being intentionally confrontational, and I would rather proceed in a cordial manner.

Your first method snippet will not work regardless of what our types are, it's entirely dependant on how you've typed some_dict

Fair enough, but in general you are missing my point. Basically, there are times when we can, in our own code, guarantee that some variable is an appropriate role, but then getting this information into your API is difficult. For example:

tuples: list[tuple[Literal["user", "assistant", "system"], str]] = []

# type checker not happy
completion = await client.chat.completions.create(
    messages=[{"role": r, "content": c} for r, c in tuples],
    model="foobar",
)

# invalid - can't construct with Union type
completion = await client.chat.completions.create(
    messages=[ChatCompletionMessageParam(role=r, content=c) for r, c in tuples],
    model="foobar",
)

# I hope you aren't serious...
completion = await client.chat.completions.create(
    messages=[
        (
            ChatCompletionUserMessageParam(role=r, content=c)
            if r == "user"
            else ChatCompletionAssistantMessageParam(role=r, content=c) if r == "assistant" else ChatCompletionSystemMessageParam(role="system", content=c)
        )
        for r, c in tuples
    ],
    model="foobar",
)

So, first of all, I want to check if you are defending the current implementation and feel that it is a good design that shouldn't be improved. If that's not the case, then what I'm telling you is that there are issues in the design that make seemingly obvious things difficult. Blaming python doesn't really inspire confidence in your python programmers who actually try to solve such design problems.

Again, it's a huge improvement, but there are a number of small issues that hinder the usability of the library that would likely be avoided if the API endpoints weren't so heavily overloaded or you guys weren't trying to automatically generate the whole client. Do I want you guys to just use MessageDict? No, that's two second easy solution. But it might demonstrate what your target for usability could be.

nathan-chappell commented 8 months ago

@RobertCraigie I don't want to be rude here either, I would just like to put out there that I found your initial response to the first commenter a bit dismissive. If you are a person of authority, then you being dismissive will always stifle discussion more than someone without it being confrontational. This very page and discussion is my case in point.

antont commented 8 months ago

@nathan-chappell

class ChatCompletionSystemMessageParam(TypedDict, total=False):
    content: Required[Optional[str]]
    role: Required[Literal["system"]]

Okay, can't put a default value in there because it's a TypedDict

Ah, I didn't even know that.

Maybe the message types should be Pydantic models, like the API responses already are. They can be instantiated quite nicely like ChatCompletionUserMessageParam(content=text) and have extra validation such as string lengths and number ranges etc. which I guess could be useful.

OpenAI folks want to support passing just dicts as the messages, but I guess the functions could accept both the correct Pydantic type, and a dict that'd be then used to create an instance of that type, and thus be validated at that stage. Maybe this change could be done in a backwards compatible way, I'm not sure.

antont commented 8 months ago

@nathan-chappell

That's actually what I would do, and I think it's completely sensible if for some reason you have the roles as strings.

tuples: list[tuple[Literal["user", "assistant", "system"], str]] = []
(...)
# I hope you aren't serious...
    messages=[
        (
            ChatCompletionUserMessageParam(role=r, content=c)
            if r == "user"
            else ChatCompletionAssistantMessageParam(role=r, content=c) if r == "assistant" else ChatCompletionSystemMessageParam(role="system", content=c)
        )
        for r, c in tuples
    ],

I'd probably use a dict to map the strings to classes. Well, actually I do have something like this in our system, just from a different role enum that we use internally.

role2message = {
  'user': ChatCompletionUserMessageParam,
  'assistant': ChatCompletionAssistantMessageParam,
  'system': ChatCompletionSystemMessageParam,
}

messages=[role2message[role](content=content, role=role)
          for role, content in tuples]

I don't find this unreasonable when you have strings that define which type should be used.
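A runnable version of the mapping approach above, hedged as a sketch: local TypedDicts stand in for the SDK's message param types so the example is self-contained (with the real SDK you would map to ChatCompletionUserMessageParam etc.). Note that a strict checker may still flag the lookup, since role is not narrowed per branch; at runtime it simply builds plain dicts.

```python
from typing import Literal, TypedDict, Union

class UserMessageParam(TypedDict):
    role: Literal["user"]
    content: str

class AssistantMessageParam(TypedDict):
    role: Literal["assistant"]
    content: str

class SystemMessageParam(TypedDict):
    role: Literal["system"]
    content: str

MessageParam = Union[UserMessageParam, AssistantMessageParam, SystemMessageParam]

# Map role strings to the matching message constructor.
role2message = {
    "user": UserMessageParam,
    "assistant": AssistantMessageParam,
    "system": SystemMessageParam,
}

tuples = [("system", "You are a helpful assistant."), ("user", "Hi!")]
messages = [role2message[role](role=role, content=content) for role, content in tuples]
print(messages)
```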

nathan-chappell commented 8 months ago

@antont I like all of the solutions you've proposed, and would be happy to see something like that packaged with the client. I really want the openai client to be good enough to prevent the proliferation of useless convenience tools like we've seen this "ai spring."

RobertCraigie commented 8 months ago
tuples: list[tuple[Literal["user", "assistant", "system"], str]] = []
completion = await client.chat.completions.create(
    messages=[{"role": r, "content": c} for r, c in tuples],
    model="foobar",
)

This is a pretty compelling example and I am surprised that Pyright doesn't infer the right type here.

It seems like it works when the first entry in the tuple is typed as just Literal['user'] but breaks down when it's expanded to include other values, e.g. Literal['user', 'system'].

I'd appreciate it if someone here could open an issue / discussion on the Pyright repository with a minimal reproduction of this case to see if this behaviour could be improved on their end, I won't have time to do so until later today.
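A minimal reproduction along the lines requested above might look like the following sketch: two TypedDicts distinguished only by a Literal role, and a comprehension whose role variable has the widened Literal union. The code is valid at runtime; the type: ignore marks where checkers currently complain.

```python
from typing import List, Literal, Tuple, TypedDict, Union

class UserMessage(TypedDict):
    role: Literal["user"]
    content: str

class SystemMessage(TypedDict):
    role: Literal["system"]
    content: str

Message = Union[UserMessage, SystemMessage]

tuples: List[Tuple[Literal["user", "system"], str]] = [("user", "hi")]

# Checkers report the dict matches neither union member cleanly, even though
# every value produced at runtime is a valid Message.
messages: List[Message] = [{"role": r, "content": c} for r, c in tuples]  # type: ignore
print(messages)
```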

So, first of all, I want to check if you are defending the current implementation and feel that it is a good design that shouldn't be improved. If that's not the case, then what I'm telling you is that there are issues in the design that make seemingly obvious things difficult. Blaming python doesn't really inspire confidence in your python programmers who actually try to solve such design problems.

I do not believe the SDK is completely perfect, I am sure there are other edge cases where behaviour can be improved. We spend a lot of time trying to design the most idiomatic SDK possible so your examples here are appreciated, we were not aware of this problem.

I don't want to be rude here, I would just like to put out there that I found your initial response to the first commenter a bit dismissive. If you are a person of authority, then you being dismissive will always stifle discussion more than someone without it being confrontational

I appreciate your candidness, can you share why you feel that my initial response was dismissive? That was not my intention. I was under the impression that the initial comment was coming from a misunderstanding of type inference in Python, not that there are real situations where typing is very cumbersome, which you have pointed out.

nathan-chappell commented 8 months ago

@RobertCraigie If you say you weren't being dismissive then I believe you and I apologize. I'm probably a bit biased, I'll try to remain more professional with you guys - I appreciate your response.

If I have time I'll look into whether Pyright has an intention of addressing this, but I wouldn't be surprised if [Eric Traut](https://github.com/erictraut) has big plans already; they are really pushing the type system (and Python itself, it seems) pretty hard. If I had to guess, this would need to be part of some sort of principled expansion of dependent-type handling, and I don't know if there will be a quick fix for this particular scenario.

RobertCraigie commented 8 months ago

Thank you, your change in tone is very much appreciated :)

I don't know if there will be a quick fix for this particular scenario.

I agree, there's no guarantee that this is a quick fix but I have seen Eric make changes incredibly quickly so I'm holding out hope.

For what it's worth, this snippet should work for you and while it is a little verbose, I think it's much better than having to import lots of types or define your own helper functions while we come up with a better solution:

messages: list[ChatCompletionMessageParam] = []
for role, content in tuples:
    if role == 'user':
        messages.append({"role": 'user', 'content': content})
    elif role == 'system':
        messages.append({"role": 'system', 'content': content})

rattrayalex commented 8 months ago

Thanks very much for sharing your examples and input @antont, and for turning this into a productive technical conversation @nathan-chappell and @RobertCraigie, highlighting some of the shortcomings of our current approach and the reasons why it's tricky to change. I also appreciate your kind words @nathan-chappell!

My personal takeaway here is that it'd probably be pretty handy for this library to provide constructors for message params, perhaps something loosely along these lines:

from openai.types.chat.completions import *

completion = await client.chat.completions.create(
    model="gpt-4",
    messages=[
        SystemMessage("You are a helpful robot"),
        UserMessage("Why is the sky blue? Some context will follow"),
        *(Message(role=r, content=c) for r,c in some_dict.items()),
        AssistantMessage(tool_calls=[…]),
        ToolMessage("<results>", tool_call_id="123")
    ],
)

My guess is that yes, these would be pydantic models, as suggested here. (Though Robert and I will take this offline to discuss details/internals)

What do you think of something like this?
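As a rough, dependency-free sketch of what such constructors might behave like (the actual proposal above suggests pydantic models; dataclasses stand in here to keep the example self-contained, and all names are hypothetical, not SDK API):

```python
from dataclasses import asdict, dataclass
from typing import Literal

@dataclass
class SystemMessage:
    content: str
    role: Literal["system"] = "system"  # role is filled in automatically

@dataclass
class UserMessage:
    content: str
    role: Literal["user"] = "user"

msg = SystemMessage("You are a helpful robot")
print(asdict(msg))  # serializes to the wire-format dict shape
```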

nathan-chappell commented 8 months ago

Hey guys, maybe we could move this to a discussion? I tend to agree that there is not really an issue here, if I have one it is truly with Pyright, but I get the feeling that there may be a useful discussion better held elsewhere...

rattrayalex commented 8 months ago

If you have feedback on the above proposal, sharing it here would be great!

nathan-chappell commented 8 months ago

@rattrayalex I haven't thought about how this might need to be applied to different parts of the api or if there are any weird corner cases to watch out for or anything like that, but if your code was supported by the library then I would start using it immediately and have nothing else to complain about for now. Going with pydantic is great in my opinion - but I'm biased cause I'm using fastapi (I would support its use anyways).

Okay fine, unrelated to this issue, but related to conveniences I would love to see:

Here's some code to demonstrate what I mean. These are really just ideas; they are probably too opinionated and not implemented well enough to really be included in the library as is, but basically this will become boilerplate in my projects, and if someone offers it as a packaged, standardized, well-accepted solution, I'd probably just use it.

from asyncio import run
from os import environ
from typing import Literal
from openai import OpenAI, AsyncOpenAI
from openai.types.chat.chat_completion_tool_param import ChatCompletionToolParam
from pydantic import BaseModel, Field
from rich import print

# region Boilerplate - I'll probably duplicate this code everywhere

class OpenAIOptions(BaseModel):
    organization: str | None = None
    base_url: str | None = None
    timeout: float | None = None
    max_retries: int = 2
    default_headers: dict[str, str] | None = None
    default_query: dict[str, object] | None = None

class OpenAIMaker:
    api_key: str
    default_options: OpenAIOptions

    def __init__(self, api_key: Literal["from-env"] | str, default_options: OpenAIOptions = OpenAIOptions()) -> None:
        if api_key == "from-env":
            api_key = environ["OPENAI_API_KEY"]
        self.api_key = api_key
        self.default_options = default_options

    def client(self, options: OpenAIOptions = OpenAIOptions()) -> OpenAI:
        return OpenAI(api_key=self.api_key, **self.default_options.model_copy(update=options.model_dump(exclude_unset=True)).model_dump())

    def aclient(self, options: OpenAIOptions = OpenAIOptions()) -> AsyncOpenAI:
        return AsyncOpenAI(api_key=self.api_key, **self.default_options.model_copy(update=options.model_dump(exclude_unset=True)).model_dump())

def make_tools(model_classes: list[type[BaseModel]]) -> list[ChatCompletionToolParam]:
    return [
        {
            "type": "function",
            "function": {
                "name": model_cls.__name__,
                "description": model_cls.__doc__ or model_cls.__name__,
                "parameters": model_cls.model_json_schema(),
            },
        }
        for model_cls in model_classes
    ]

# endregion Boilerplate

# region demo - it's a little silly

class InvokeMagicSpell(BaseModel):
    """Use mana to invoke a spell"""
    spell_name: Literal["fireball", "cheeseball"] = Field(default="cheeseball", description="fireball deals 10 damage, cheeseball deals 10 humiliation")

class CowerInFear(BaseModel):
    """Demonstrate your cowardice to the world"""
    crying: bool = Field(default=False, description="If you cower in fear, your enemies may show you mercy")

openai_maker = OpenAIMaker("from-env", OpenAIOptions(timeout=20.0, max_retries=3))

async def demo() -> None:
    async with openai_maker.aclient(OpenAIOptions(max_retries=5)) as client:
        response = await client.chat.completions.create(
            model="gpt-3.5-turbo-1106",
            messages=[{"role": "user", "content": """You have just been surprise attacked by a blue slime!  What's your move?\n\nJSON:"""}],
            tools=make_tools([InvokeMagicSpell, CowerInFear]),
            response_format={"type": "json_object"},
        )
        print(response)

# endregion demo

if __name__ == "__main__":
    run(demo())

Output:

ChatCompletion(
    id='chatcmpl-8XQH13DO1K0JNmnchd00mRFRsrk50',
    choices=[
        Choice(
            finish_reason='tool_calls',
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content='{\n  "tool_uses": [\n    {\n      "recipient_name": "functions.InvokeMagicSpell",\n      "parameters": {\n        "spell_name": "fireball"\n      }\n    },\n    {\n      "recipient_name": "functions.CowerInFear",\n      "parameters": {\n        "crying": false\n      }\n    }\n  ]\n}',
                role='assistant',
                function_call=None,
                tool_calls=[
                    ChatCompletionMessageToolCall(
                        id='call_fGNLDHnHnshsJtks9KlwG3ms',
                        function=Function(arguments='{"spell_name": "fireball"}', name='InvokeMagicSpell'),
                        type='function'
                    ),
                    ChatCompletionMessageToolCall(id='call_AJJUWILBh7XTx2xsQZXpdYj4', function=Function(arguments='{"crying": false}', name='CowerInFear'), type='function')
                ]
            )
        )
    ],
    created=1702976511,
    model='gpt-3.5-turbo-1106',
    object='chat.completion',
    system_fingerprint='fp_772e8125bb',
    usage=CompletionUsage(completion_tokens=122, prompt_tokens=187, total_tokens=309)
)
npip99 commented 5 months ago

@rattrayalex Based on a read through the history, it seems like this issue should be open. Should we mark it as open?


I think a very simple solution would be to add another type to the ChatCompletionMessageParam union.

Currently, we have

ChatCompletionMessageParam = Union[
    ChatCompletionSystemMessageParam,
    ChatCompletionUserMessageParam,
    ChatCompletionAssistantMessageParam,
    ChatCompletionToolMessageParam,
    ChatCompletionFunctionMessageParam,
]

If instead we had,

ChatCompletionMessageParam = Union[
    ChatCompletionSystemMessageParam,
    ChatCompletionUserMessageParam,
    ChatCompletionAssistantMessageParam,
    ChatCompletionToolMessageParam,
    ChatCompletionFunctionMessageParam,
    ChatCompletionGenericMessageParam,
]

Where ChatCompletionGenericMessageParam was,

class ChatCompletionGenericMessageParam(TypedDict, total=False):
    content: Required[Optional[str]]
    role: Required[Literal["system", "user", "assistant"]]
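A self-contained sketch of this suggestion, with local stand-ins for the SDK types: because the generic member's role is the full Literal union, the dict produced by a comprehension over mixed roles matches one union member directly.

```python
from typing import List, Literal, Tuple, TypedDict, Union

class SystemMessageParam(TypedDict):
    role: Literal["system"]
    content: str

class UserMessageParam(TypedDict):
    role: Literal["user"]
    content: str

# The proposed extra member: role covers the whole Literal union.
class GenericMessageParam(TypedDict):
    role: Literal["system", "user", "assistant"]
    content: str

MessageParam = Union[SystemMessageParam, UserMessageParam, GenericMessageParam]

tuples: List[Tuple[Literal["user", "assistant", "system"], str]] = [("user", "hi")]
messages: List[MessageParam] = [{"role": r, "content": c} for r, c in tuples]
print(messages)
```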

rattrayalex commented 5 months ago

This issue was originally raised about essentially a bug in our types, which isn't there.

We could open a separate issue to track an enhancement to make them more convenient, but I don't think that'll be terribly useful. You're welcome to, though.

Your suggestion here does seem like a quick win, but I believe we'd rather make a change along the lines of https://github.com/openai/openai-python/issues/911#issuecomment-1857142928 - we're blocked on getting some unrelated work done first, but do hope to get to it before too long.

npip99 commented 5 months ago

Awesome, thanks for the update.

Yeah, I'll make one only so that something is tracking it, or I'll forget to keep up with this chat; I'm not saying it's important just because there's an issue open for it (it can be triaged low, for sure).

gl on what you're blocked on atm

proafxin commented 4 months ago

For what it's worth, this snippet should work for you and while it is a little verbose, I think it's much better than having to import lots of types or define your own helper functions while we come up with a better solution:

messages: list[ChatCompletionMessageParam] = []
for role, content in tuples:
    if role == 'user':
        messages.append({"role": 'user', 'content': content})
    elif role == 'system':
        messages.append({"role": 'system', 'content': content})

Thank you, this was the most important reply for me in this thread. I also stumbled upon this issue and at least this is a temporary solution.

Sorry if I am reviving this or the discussion moved elsewhere, but I would like to mention this. Now, according to https://peps.python.org/pep-0589/, the following is said: [image]

This is in compliance with what you wrote; however, I tried the following

messages: list[ChatCompletionMessageParam] = [
        {"role": role, "content": content} for (role, content) in zip(roles, texts)
    ]

and got the error "Type of TypedDict is ambiguous, none of ("ChatCompletionSystemMessageParam", "ChatCompletionUserMessageParam", "ChatCompletionAssistantMessageParam") matches cleanly". I couldn't get it fixed until I came across your reply. But that's what worries me a bit, because what I am doing is essentially the same as what you are doing. I see no valid reason why one would work and the other wouldn't.

I might be wrong, but I think it would be in the best interests of the API to not make the implementation this rigid. This may be just one example; who knows, there could be more. If typing doesn't work when we use a list comprehension but works when we declare the messages manually, some design decision at the very root probably wasn't the best. I wouldn't presume to tell you what that is, but it could be the usage of the union. I would request you to reconsider. If it doesn't introduce any breaking changes and you have the capacity, please consider modifying it so the typing isn't ambiguous.
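In the meantime, one workaround in the spirit of the factory functions shown earlier in this thread is to funnel the comprehension through a small helper whose parameter is the role Literal union, so the checker narrows each branch instead of matching a raw dict against the whole TypedDict union. This is a sketch with local stand-in types; with the SDK the return type would be ChatCompletionMessageParam.

```python
from typing import List, Literal, TypedDict, Union

class SystemMessageParam(TypedDict):
    role: Literal["system"]
    content: str

class UserMessageParam(TypedDict):
    role: Literal["user"]
    content: str

MessageParam = Union[SystemMessageParam, UserMessageParam]

def message(role: Literal["system", "user"], content: str) -> MessageParam:
    # Narrowing on the Literal lets the checker match each branch cleanly.
    if role == "system":
        return {"role": "system", "content": content}
    return {"role": "user", "content": content}

roles: List[Literal["system", "user"]] = ["system", "user"]
texts = ["You are a helpful assistant.", "Hello!"]
messages: List[MessageParam] = [message(r, t) for r, t in zip(roles, texts)]
print(messages)
```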