Closed thomasahle closed 1 month ago
It also seems that types of the form int | str are not supported, while Union[int, str] is. Note that this is not an issue with Pydantic or json_schema, since both of them support these types fine.
Also, if I use a recursive type, like
class Answer(BaseModel):
    text: str
    answers: list["Answer"]

Answer.update_forward_refs()
The openai_schema_helper function goes into an infinite recursion.
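For illustration, a schema walker can avoid this kind of loop by remembering which definitions it has already entered. The sketch below is not how openai_schema_helper is implemented; collect_refs and the hand-written schema (roughly what a recursive model's JSON Schema looks like) are assumptions made for the example.

```python
from typing import Any


def collect_refs(schema: dict[str, Any]) -> set[str]:
    """Return the names of all $defs entries reachable from the schema root."""
    prefix = "#/$defs/"
    defs = schema.get("$defs", {})
    seen: set[str] = set()

    def visit(node: Any) -> None:
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith(prefix):
                name = ref[len(prefix):]
                # The guard: only descend into a definition the first time.
                if name not in seen:
                    seen.add(name)
                    visit(defs.get(name, {}))
            for key, value in node.items():
                if key != "$defs":
                    visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(schema)
    return seen


# Hand-written schema for a self-referencing Answer type (illustrative only).
answer_schema = {
    "$defs": {
        "Answer": {
            "type": "object",
            "properties": {
                "text": {"type": "string"},
                "answers": {"type": "array", "items": {"$ref": "#/$defs/Answer"}},
            },
            "required": ["text", "answers"],
        }
    },
    "$ref": "#/$defs/Answer",
}

print(collect_refs(answer_schema))
```

Without the seen set, the walker would follow the Answer reference inside Answer forever.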
@thomasahle I just released instructor 1.4.0, which should fix this issue. Could you give it a try? Note that for dict[str|any], I tested it and it seems that only JSON mode is able to generate the right response.
I suspect that's an issue with the tool calling implementation itself (it might just not know how to match the type), but I'll look into it later in the week when I have more time.
I ran the original code above, but got InstructorRetryException: RetryError[<Future at 0x11c0c50d0 state=finished raised ValidationError>].
I tried changing response_model=UserInfo to dict[str, int], but got:
File /opt/homebrew/lib/python3.12/site-packages/instructor/process_response.py:227, in handle_response_model(response_model, mode, **kwargs)
225 iterable_element_class = get_args(response_model)[0]
226 response_model = IterableModel(iterable_element_class)
--> 227 if not issubclass(response_model, OpenAISchema):
228 response_model = openai_schema(response_model) # type: ignore
230 if new_kwargs.get("stream", False) and not issubclass(
231 response_model, (IterableBase, PartialBase)
232 ):
File <frozen abc>:123, in __subclasscheck__(cls, subclass)
TypeError: issubclass() arg 1 must be a class
Does it work for you?
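For reference, the TypeError in the traceback arises because dict[str, int] is a parameterized generic alias rather than a class, so issubclass() rejects it. A minimal sketch of a defensive check, using only the standard library; SchemaBase, UserInfo, and safe_issubclass are hypothetical stand-ins and not instructor's actual code:

```python
from typing import get_origin


class SchemaBase:
    """Hypothetical stand-in for a schema base class such as OpenAISchema."""


class UserInfo(SchemaBase):
    """Hypothetical model that derives from the stand-in base."""


def safe_issubclass(tp: object, parent: type) -> bool:
    """Like issubclass(), but returns False instead of raising for non-classes."""
    # Parameterized generics such as dict[str, int] have an origin (dict)
    # and must be filtered out before calling issubclass().
    if get_origin(tp) is not None or not isinstance(tp, type):
        return False
    return issubclass(tp, parent)


print(safe_issubclass(UserInfo, SchemaBase))
print(safe_issubclass(dict[str, int], SchemaBase))
```

The get_origin() test is checked first because some Python versions report isinstance(dict[str, int], type) as True.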
@thomasahle, if you use the original code, I found that it works nicely in JSON mode if you change the client as shown here. That should fix the RetryError, though I'm honestly not sure why.
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.JSON)
As for supporting dict[str, int], we can probably add a guard for it in the next release, but it's not a great way to prompt the model for a response. Tool calling tends to benefit from more structured output, so something like the iterable below, or a list of User objects, has consistently worked for me.
import instructor
from pydantic import BaseModel, Field
from openai import OpenAI
import dotenv


class User(BaseModel):
    name: str
    age: int


client = instructor.from_openai(OpenAI(), mode=instructor.Mode.TOOLS_STRICT)

# Extract structured data from natural language
users = client.chat.completions.create_iterable(
    model="gpt-4o-mini",
    response_model=User,
    messages=[
        {
            "role": "system",
            "content": "Please provide the name and ages of the users as a dictionary.",
        },
        {
            "role": "user",
            "content": "John Doe is 30 years old. Anne Smith is 25 years old.",
        },
    ],
)

for user in users:
    print(user)
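If a plain name-to-age mapping is still what you want at the end, the extracted objects can be folded into one afterwards. A minimal sketch; the dataclass here is only a stand-in for the Pydantic User model so the snippet runs without extra dependencies, and users_to_dict is a hypothetical helper name:

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class User:
    """Stand-in for the Pydantic User model (same two fields)."""
    name: str
    age: int


def users_to_dict(users: Iterable[User]) -> dict[str, int]:
    """Fold extracted User objects into a name -> age mapping."""
    return {user.name: user.age for user in users}


# Stand-in for the objects yielded by the extraction call.
extracted = [User("John Doe", 30), User("Anne Smith", 25)]
print(users_to_dict(extracted))  # {'John Doe': 30, 'Anne Smith': 25}
```

Note that duplicate names would overwrite each other in the resulting dictionary.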
See primitives we support at https://python.useinstructor.com/concepts/types
Closing this issue, since the original problem of an unsupported dictionary field in the Pydantic model was resolved.
What Model are you using?
Describe the bug
When I try to use types like dict inside a Pydantic model, I get errors from openai_schema_helper.
To Reproduce
Expected behavior
It should print
{"John Doe": 30, "Anne Smith": 25}
Screenshots
Notes:
The JSON Schema for the type in the example is:
I tried all of the following types, and they all fail with similar errors to the one above: