firebase / genkit

An open source framework for building AI-powered apps with familiar code-centric patterns. Genkit makes it easy to develop, integrate, and test AI features with observability and evaluations. Genkit works with various models and platforms.
Apache License 2.0

[FR] Add prompt name to the trace fields #651

Closed (i2amsam closed 2 months ago)

i2amsam commented 4 months ago

When using prompt files or defined prompts, it would be useful to know which prompt was used for a generation.

Describe the solution you'd like

[] Add the name of the prompt to the trace data when the prompt is used for a generate call
[] Show the name of the prompt in the developer UI's traces
[] Allow the name of the prompt to be used in filtering in observability tools like Cloud Logging and Firebase AI Monitoring
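To make the first and third items concrete, here is a minimal sketch of tagging a trace span with a prompt name and then filtering on it. Everything here is hypothetical: the `SpanRecord` shape, the `promptName` attribute key, and both helper functions are made up for illustration and are not Genkit's actual trace schema or API.

```typescript
// Hypothetical span record; the field names are illustrative, not Genkit's trace schema.
interface SpanRecord {
  traceId: string;
  spanName: string;
  attributes: Record<string, string>;
}

// Assumed attribute key for the prompt name (not an official field name).
const PROMPT_NAME_KEY = "promptName";

// Return a copy of the span with the prompt name attached as an attribute.
function tagWithPrompt(span: SpanRecord, promptName: string): SpanRecord {
  return {
    ...span,
    attributes: { ...span.attributes, [PROMPT_NAME_KEY]: promptName },
  };
}

// Keep only the spans produced by the given prompt.
function filterByPrompt(spans: SpanRecord[], promptName: string): SpanRecord[] {
  return spans.filter((s) => s.attributes[PROMPT_NAME_KEY] === promptName);
}

const spans: SpanRecord[] = [
  tagWithPrompt({ traceId: "t1", spanName: "generate", attributes: {} }, "greeting"),
  tagWithPrompt({ traceId: "t2", spanName: "generate", attributes: {} }, "summary"),
  { traceId: "t3", spanName: "generate", attributes: {} }, // no prompt was used
];

const greetingSpans = filterByPrompt(spans, "greeting");
```

Once the name is stored as a plain attribute, an observability backend could apply the same kind of equality filter in its own query language.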


Additional context

User request: [image]

Discussed briefly with @pavelgj and @tonybaroneee

i2amsam commented 3 months ago

Adding to this, it would be great if the prompt step also showed the input and output for the prompt generate call.

maxl0rd commented 3 months ago

I believe the appropriate solution to this is to modify the dotprompt plugin so that it calls its defined prompt action when it renders the prompt. This should create a trace span for the prompt rendering step.

This is useful to indicate which prompt was used and also to locate any potential bugs in the rendering of the template.
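A rough sketch of the idea, assuming a toy `{{variable}}` template renderer and an in-memory span list. `RenderSpan`, `renderTemplate`, and `renderWithSpan` are hypothetical names for illustration, not the dotprompt plugin's real API. The span records the prompt name, the input, and either the rendered output or the rendering error:

```typescript
// Hypothetical span shape for a prompt rendering step.
interface RenderSpan {
  name: string;
  input: Record<string, string>;
  output?: string;
  error?: string;
}

// In-memory stand-in for a trace store.
const recordedSpans: RenderSpan[] = [];

// Toy renderer: replaces {{key}} placeholders and throws on a missing variable.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const value = vars[key];
    if (value === undefined) {
      throw new Error(`missing variable: ${key}`);
    }
    return value;
  });
}

// Render the prompt inside a span so the trace shows which prompt ran,
// what it received, and what it produced (or why it failed).
function renderWithSpan(
  promptName: string,
  template: string,
  vars: Record<string, string>,
): string {
  const span: RenderSpan = { name: promptName, input: vars };
  try {
    const output = renderTemplate(template, vars);
    span.output = output;
    return output;
  } catch (err) {
    span.error = err instanceof Error ? err.message : String(err);
    throw err;
  } finally {
    // The span is recorded on both the success and the failure path.
    recordedSpans.push(span);
  }
}

const rendered = renderWithSpan("greeting", "Hello, {{name}}!", { name: "Ada" });
```

A failed render still leaves a span behind with the error message, which is what would help locate template bugs.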

tonybaroneee commented 3 months ago

@maxl0rd I could be confusing things, but will the trace span for the prompt rendering step be different from the one we currently have when the user is typing into the prompt runner and we're firing off a bunch of generate calls to render the hydrated prompt in the Dev UI? (We're currently filtering those actions out of the list of traces.)

maxl0rd commented 3 months ago

It would be the same action. Are we not writing those traces or not retrieving them? That might be a problem.

tonybaroneee commented 3 months ago

Oh, I think we aren't writing them. Remember https://github.com/firebase/genkit/pull/60? :)

maxl0rd commented 3 months ago

Ok right, the filter only affects LocalFileTraceStore. This is more about production observability, so I think it's still the right approach.

tonybaroneee commented 3 months ago

Ah good call, forgot about the environment distinction there 👍

maxl0rd commented 3 months ago

Addressed in this PR: https://github.com/firebase/genkit/pull/785

Using the action proved problematic for a number of reasons. I think the most straightforward approach is for dotprompt to create its own trace spans when rendering prompts. This might seem slightly superfluous given the changes in next, where generate also opens a new span, but I think it's important for observability of prompt changes, which are a big deal for many users.
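To illustrate the nesting described here, a minimal sketch in which a generate span wraps a separate prompt rendering span. The `Tracer` class, `runInSpan`, and the `prompt:` span naming are hypothetical stand-ins written for this example, not the code in the PR or Genkit's tracing API:

```typescript
// Hypothetical finished-span record with a link to its parent span.
interface Span {
  name: string;
  parent: string | null;
}

// Minimal synchronous tracer that tracks the currently open spans on a stack.
class Tracer {
  readonly finished: Span[] = [];
  private stack: string[] = [];

  runInSpan<T>(name: string, fn: () => T): T {
    // The innermost open span, if any, becomes the parent.
    const parent = this.stack[this.stack.length - 1] ?? null;
    this.stack.push(name);
    try {
      return fn();
    } finally {
      this.stack.pop();
      this.finished.push({ name, parent });
    }
  }
}

const tracer = new Tracer();

// The prompt step opens its own span, named after the prompt.
function renderPrompt(promptName: string, text: string): string {
  return tracer.runInSpan(`prompt:${promptName}`, () => text.trim());
}

// The generate step opens the outer span and renders the prompt inside it.
function generate(promptName: string, text: string): string {
  return tracer.runInSpan("generate", () => {
    const renderedText = renderPrompt(promptName, text);
    return `model output for: ${renderedText}`; // stand-in for a real model call
  });
}

const result = generate("greeting", "  Say hello  ");
```

Even with generate opening its own span, the inner span is the one that names the prompt, so a change to a prompt shows up as a distinct step in the trace.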