firebase / genkit

An open source framework for building AI-powered apps with familiar code-centric patterns. Genkit makes it easy to develop, integrate, and test AI features with observability and evaluations. Genkit works with various models and platforms.
Apache License 2.0

[JS] Description hallucinations when using `describe` and `nullish` methods together #538

Open ariel-pettyjohn opened 4 months ago

ariel-pettyjohn commented 4 months ago

Describe the bug Models occasionally hallucinate a "description" field in the output when the Zod .describe() method is used. With the use case below, this happens about 30% of the time.

To Reproduce I don't know how general this behavior is, since I've only experienced it with this one use case, so I'll add a screenshot of my code below to help reproduce it. This particular application extracts colors from text by name, RGB array, or hex value. With the descriptions removed, I haven't encountered this behavior across repeated runs of the flow. The input in this case was "green, orange, [0, 46, 226], and #456". In other runs, it has also returned non-null descriptions such as "the color green".

Expected behavior Because the schema only provides fields for "name", "hex", and "rgb", I'd expect to only receive values for those fields from the model. Otherwise, the output is exactly as expected.
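The expectation above can be illustrated with a minimal, dependency-free sketch that flags any key the schema does not declare. The allowed keys ("name", "hex", "rgb") come from the issue text. The helper `findUnexpectedKeys`, the constant `ALLOWED_COLOR_KEYS`, and the sample item are hypothetical names made up for illustration; none of them are part of Genkit or Zod, and the sample item only approximates the output described in this report.

```typescript
// Keys the schema is described as declaring (taken from the issue text).
const ALLOWED_COLOR_KEYS: readonly string[] = ["name", "hex", "rgb"];

// Hypothetical helper: returns the keys of `item` that are not in `allowed`.
function findUnexpectedKeys(
  item: Record<string, unknown>,
  allowed: readonly string[] = ALLOWED_COLOR_KEYS
): string[] {
  return Object.keys(item).filter((key) => allowed.indexOf(key) === -1);
}

// Illustrative item shaped like the hallucinated output described above.
// This is an assumption, not a copy of the original screenshot.
const sampleItem: Record<string, unknown> = {
  name: "green",
  hex: null,
  rgb: null,
  description: "the color green",
};

const unexpected = findUnexpectedKeys(sampleItem);
console.log(unexpected);
```

A check like this only detects the extra field after the fact; it does not explain why the model adds it.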

Screenshots

Code: [screenshot "extract colors", image not included]

Output with hallucinated descriptions: [screenshot "hallucinated descriptions", image not included]

Runtime (please complete the following information):

Node version: (not provided)

Thanks, and hopefully it isn't just user error!

ariel-pettyjohn commented 4 months ago

Ah-ha, this seems like a misunderstanding on my part, looking at the Zod documentation.

Judging from the Genkit documentation, my expectation was that the .describe() method was being used to provide field-specific context to the model.

I'm closing accordingly, but I'm still not clear what the behavior should be in a case like this when there are multiple fields with descriptions.

Feel free to reopen if I've doubly-misunderstood and closed the issue prematurely.

ariel-pettyjohn commented 3 months ago

Alrighty, I'm going to reopen this issue because I've verified that it's still present in Gemini 1.5 Flash, and I was able to better isolate its cause thanks to the recently improved logging. Thanks again for that!

Describe the bug So, now I'm just trying to list the typical properties of an object (see first screenshot), like "users" in the example below. When I add both .nullish() and .describe(...) to the propertyNameSchema in this case, I consistently get a "no candidates matching" error, because the model returns an object with a description field instead of the string the schema specifies (see second screenshot).

Steps to Reproduce Use both describe and nullish on a schema. I haven't tested exhaustively, so it could be limited to certain schema types or situations.

Expected Behavior With .describe(...) but without .nullish(), the model correctly outputs the expected result of a string array (see third screenshot). Maybe .nullish() isn't valid on array elements, or we should otherwise avoid using it in this context, but some kind of error messaging would be helpful in that case 👍
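To make the mismatch concrete, here is a small sketch of the element check that the output would need to pass. In Zod, .nullish() on a string schema accepts a string, null, or undefined, so the sketch mirrors that shape in plain TypeScript rather than importing Zod, which keeps it runnable on its own. The names `PropertyName`, `isPropertyName`, and `findInvalidElements`, along with both sample arrays, are hypothetical; the real outputs were only in the screenshots and are not reproduced here.

```typescript
// Element shape accepted by a string schema with .nullish():
// a string, null, or undefined.
type PropertyName = string | null | undefined;

// Type guard for a single array element.
function isPropertyName(value: unknown): value is PropertyName {
  return typeof value === "string" || value === null || value === undefined;
}

// Hypothetical helper: returns the indices of elements that fail the check.
function findInvalidElements(values: unknown[]): number[] {
  const invalid: number[] = [];
  values.forEach((value, index) => {
    if (!isPropertyName(value)) {
      invalid.push(index);
    }
  });
  return invalid;
}

// Illustrative data only (assumed shapes, not copied from the screenshots).
const expectedOutput: unknown[] = ["id", "name", "email"];
const observedOutput: unknown[] = [
  { description: "id" },
  { description: "name" },
];

console.log(findInvalidElements(expectedOutput));
console.log(findInvalidElements(observedOutput));
```

Under these assumptions, every element of the second array fails the check, which is consistent with validation rejecting the whole response.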

Screenshots [screenshot "listobjects"] [screenshot "1 5flasherror"] [screenshot "correctoutput"] (images not included)

Thanks in advance!