Closed keyboardAnt closed 1 year ago
I started to draft a solution. @minimaxir, please let me know if you have any feedback. Thanks!
This is one of the things I needed to test: it's possible that title and related fields are silently ignored on OpenAI's end, since they apply some undocumented post-processing to the schema.
The way to test whether that's the case is to check whether the prompt_tokens count changes before and after the fix.
The fix would then be to just prevent title from being returned in the schema. There are class-level hacks for that, which may make the code in the draft unnecessary.
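The before/after check described above could be sketched roughly as follows. This is only an illustration: call_api is a hypothetical callable (not part of simpleaichat or the OpenAI client) that sends one request with the given schema and returns a response dict, and the sketch assumes the response carries a usage.prompt_tokens field as in OpenAI's chat completion responses.

```python
from typing import Any, Callable, Dict

Schema = Dict[str, Any]
Response = Dict[str, Any]


def prompt_tokens_changed(
    call_api: Callable[[Schema], Response],  # hypothetical: sends one request per schema
    schema_before: Schema,
    schema_after: Schema,
) -> bool:
    """Return True if the two schemas produce different prompt token counts.

    Assumes each response looks like {"usage": {"prompt_tokens": int, ...}, ...}.
    """
    before = call_api(schema_before)["usage"]["prompt_tokens"]
    after = call_api(schema_after)["usage"]["prompt_tokens"]
    return before != after
```

If the counts differ, title is being counted toward the prompt rather than ignored; if they are identical, that is consistent with the field being dropped on OpenAI's end, though it does not prove it.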
Even if they implemented some parsing, it is not guaranteed what and how it is done, so I find it safer to follow the format introduced in their examples.
Would be interesting to follow https://github.com/hwchase17/langchain/issues/6933
I've added a fix that will ship with the Pydantic 2.0 fixes. The simple solution is to just remove the title field recursively from the schema, which doesn't add much client overhead, and I've tested that it works. The items field is necessary for nested models and does follow the spec.
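For illustration, recursive removal of title could look something like the sketch below. It is not the code from the actual fix; strip_titles is a hypothetical helper name, and the handling of "properties" and "definitions" (so that a model field that happens to be called title is kept) is an assumption about what one would want, not something discussed in this thread.

```python
from typing import Any

# Keys whose value maps user-chosen names to sub-schemas; a "title" key
# directly inside one of these is a field/model name, not the annotation.
_NAME_MAPS = ("properties", "definitions", "$defs")


def strip_titles(node: Any, _names: bool = False) -> Any:
    """Return a copy of a JSON Schema with every "title" annotation removed."""
    if isinstance(node, dict):
        cleaned = {}
        for key, value in node.items():
            if key == "title" and not _names:
                continue  # drop the annotation
            # Only a schema-level "properties"/"definitions" key opens a name map.
            child_is_names = key in _NAME_MAPS and not _names
            cleaned[key] = strip_titles(value, child_is_names)
        return cleaned
    if isinstance(node, list):
        return [strip_titles(item) for item in node]
    return node


schema = {
    "title": "write_ttrpg_story",
    "type": "object",
    "properties": {
        "events": {
            "title": "Events",
            "description": "All events in a TTRPG campaign.",
            "type": "array",
            "items": {"$ref": "#/definitions/Event"},
        }
    },
}
cleaned = strip_titles(schema)
```

The function builds a new dict instead of mutating its input, so the original Pydantic schema is left untouched and only the copy sent to the API loses its title entries.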
The LangChain issue you linked is more specific to LangChain.
None of the OpenAI examples use lists. items comes from the JSON Schema that OpenAI references in their documentation.
The TTRPG notebook demonstrates that items does indeed work.
The simple solution is to just remove the title field recursively from the schema which doesn't have much client overhead and I've tested it works.
Wouldn't replacing "title" with "name"—as in OpenAI's examples—be safer?
name is only used at the top level, likely for the function_call param, which is already the case with the current implementation in simpleaichat. name is otherwise not part of the JSON Schema spec.
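To make the distinction concrete, here is a rough sketch of where name sits in a function-calling request: once on the function entry, and again in function_call to select it, but never inside the JSON Schema under parameters. build_function_spec is a hypothetical helper, the description string is a placeholder, and items is simplified to plain strings instead of the $ref to Event.

```python
from typing import Any, Dict


def build_function_spec(
    name: str, description: str, parameters: Dict[str, Any]
) -> Dict[str, Any]:
    """Wrap a JSON Schema in a top-level function entry.

    "name" lives here, outside the schema; the schema itself only uses
    JSON Schema keywords such as "type", "properties" and "items".
    """
    return {"name": name, "description": description, "parameters": parameters}


parameters = {
    "type": "object",
    "properties": {
        "events": {
            "description": "All events in a TTRPG campaign.",
            "type": "array",
            "items": {"type": "string"},  # simplified; the notebook uses a $ref to Event
        }
    },
}

spec = build_function_spec("write_ttrpg_story", "Placeholder description.", parameters)

# Request arguments: the top-level name is what function_call refers to.
request_kwargs = {"functions": [spec], "function_call": {"name": spec["name"]}}
```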
name is only used at the top level, likely for the function_call param, which is already the case with the current implementation in simpleaichat.
It seems that write_ttrpg_story.schema()["properties"] is unchanged and still returns the same output mentioned in the first message, with a title entry:
{
    "events": {
        "title": "Events",
        "description": "All events in a TTRPG campaign.",
        "type": "array",
        "items": {
            "$ref": "#/definitions/Event"
        }
    }
}
It is fixed in https://github.com/minimaxir/simpleaichat/commit/4c891b28ba81ecafc3ebf35d52088f446feb843f. The title field is stripped when the schema is passed to the ChatGPT model, since it's not easy to change at the Pydantic level, and I've verified that the write_ttrpg_story schema does have the title field stripped recursively at that point.
I get that OpenAI has minimal documentation, but as long as complicated examples work, that's the best we can do.
Our current implementation for building schemas differs from the approach used in OpenAI's examples (see the "Basic concepts" section). Notably, our schema properties include a "title" field and potentially "items".
For example, in examples/notebooks/schema_ttrpg.ipynb, Event.schema()["properties"] yields:
and write_ttrpg_story.schema()["properties"] is: