guidance-ai / guidance

Do you support `json` when using `OpenAI` models? #1052

Open robbie-daniels opened 1 week ago

robbie-daniels commented 1 week ago

Is your feature request related to a problem? Please describe.
I would like to use OpenAI models to generate pydantic objects.

I tried replicating this example from OpenAI's documentation:

from guidance import models, assistant, user, system, json
from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

gpt4 = models.OpenAI("gpt-4o-2024-08-06")

with system():
    gpt4 += "Extract the event information"

with user():
    gpt4 += "Alice and Bob are going to a science fair on Friday"

with assistant():
    gpt4 += json(name="generated_object", schema=CalendarEvent)

and get this error:

The model attempted to generate b'Event: Science' after the prompt 
`b'... Friday<|im_end|>\n<|im_start|>assistant\n'`, but that does not 
match the given grammar constraints! Since your model is a remote 
API that does not support full guidance integration we cannot force 
the model to follow the grammar, only flag an error when it fails to 
match.  You can try to address this by improving the prompt, making 
your grammar more flexible, rerunning with a non-zero temperature, 
or using a model that supports full guidance grammar constraints.

It doesn't look like the model is even trying to generate JSON. I believe the model I used supports structured outputs. Is `json` just unsupported with OpenAI models for now?
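Going by the error text above, with a remote API the output can only be checked after it arrives, not constrained while it is generated. Here is a minimal sketch of that check-after-the-fact idea. It is not guidance's actual implementation: `GrammarMismatch`, `looks_like_json_object_start`, and `check_stream` are hypothetical names, and the "grammar" is a toy rule that the text must start with `{`.

```python
class GrammarMismatch(Exception):
    """Raised when streamed output stops matching the expected format."""


def looks_like_json_object_start(text: str) -> bool:
    # Toy stand-in for a grammar: a JSON object must open with "{".
    stripped = text.lstrip()
    return stripped == "" or stripped.startswith("{")


def check_stream(chunks):
    """Accumulate chunks from a remote model, checking after each one.

    The caller cannot steer the model's next token; it can only notice
    a mismatch once the text has arrived, and raise.
    """
    seen = ""
    for chunk in chunks:
        seen += chunk
        if not looks_like_json_object_start(seen):
            raise GrammarMismatch(f"model generated {seen!r}, which is not JSON")
    return seen


# Output that happens to follow the format passes through unchanged.
ok = check_stream(['{"b": ', 'true}'])

# Free-form text like the reply in the error message is rejected.
try:
    check_stream(["Event: ", "Science"])
    failed = False
except GrammarMismatch:
    failed = True
```

A locally hosted model can instead have non-matching tokens masked out before sampling, which is why the error message suggests a model with full grammar support.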

Describe the solution you'd like
The prompt returns: CalendarEvent(name='Science Fair', date='Friday', participants=['Alice', 'Bob'])
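For reference, a minimal sketch of the last step, turning JSON text into the pydantic object. It assumes pydantic v2 (`model_validate_json`). The hard-coded string is a placeholder standing in for whatever the model generates, not real model output.

```python
from pydantic import BaseModel


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


# Placeholder for the JSON text a model would produce (assumption, not real output).
raw = '{"name": "Science Fair", "date": "Friday", "participants": ["Alice", "Bob"]}'

# pydantic v2: parse and validate in one step; raises ValidationError on a mismatch.
event = CalendarEvent.model_validate_json(raw)
print(repr(event))
```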

Describe alternatives you've considered
I tried a simpler example, with clearer instructions in the user message:

from guidance import models, assistant, user, system, json
from pydantic import BaseModel

class Schema(BaseModel):
    b: bool

gpt4 = models.OpenAI("gpt-4o-2024-08-06")

with system():
    gpt4 += "You generate json files."

with user():
    gpt4 += "Respond with only the contents of the json file, with no whitespace or newlines.  Do not write 'json' at the beginning of your answer.  Your json file conforms to this schema:"

with assistant():
    gpt4 += json(name="generated_object", schema=Schema)

but I get similar errors.

If my examples are incorrect, has anybody else tried this and gotten it working?

hudson-ai commented 1 week ago

Hi @robbie-daniels, and thank you for the issue! Compatibility with the OpenAI API for JSON generation is definitely a planned feature, but admittedly our support for remote API models currently lags our support for locally hosted models.

If you can't run a local model, we do currently support constrained decoding with Azure-hosted Phi 3.5 using the AzureGuidance model class in our latest pre-release.

Hope to provide some updates in the near future!