Arize-ai / phoenix

AI Observability & Evaluation
https://docs.arize.com/phoenix

[ENHANCEMENT] build golden datasets or manual evals #3249

Open sraibagiwith100x opened 3 months ago

sraibagiwith100x commented 3 months ago

Is your feature request related to a problem? Please describe.

Annotate via the Phoenix app to build golden datasets or manual evals.

Describe the solution you'd like

I was wondering whether span or trace annotation for dataset creation and/or evaluation is on the Phoenix roadmap at all. We already use Phoenix for a lot of the heavy lifting with tracing and visualizing traces, but we still export these traces and manually convert them into datasets for eval, examples optimization, etc.

What would be awesome is a way for me to add manual annotations, rewrite the expected output, etc. when reviewing a span (see screenshot below). Our current plan is to load Phoenix traces offline and annotate them via doccano.

PS: is this available in managed Arize?

[Screenshot: CleanShot 2024-05-20 at 12 38 20]

Describe alternatives you've considered

Doccano


mikeldking commented 2 months ago

Hey @sraibagiwith100x, thanks for the request! We are definitely working on #2513 in the next few months. Overall, you can probably pull data from Phoenix using the query API and then add labels back.
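As a rough illustration of the "pull data, then add labels back" workflow, here is a minimal sketch in plain Python. It assumes spans have already been exported from Phoenix as flat records; the key names (`context.span_id`, `attributes.input.value`, `attributes.output.value`) and the helper functions `spans_to_examples`, `apply_annotations`, and `to_jsonl` are hypothetical and only for illustration, not part of the Phoenix API. The step that sends labels back to Phoenix is intentionally left out.

```python
import json

# Assumed flattened span keys (hypothetical; adjust to match your export).
SPAN_ID = "context.span_id"
INPUT = "attributes.input.value"
OUTPUT = "attributes.output.value"


def spans_to_examples(spans):
    """Turn exported span records into annotation-ready examples."""
    examples = []
    for span in spans:
        if span.get(INPUT) is None:
            continue  # nothing to annotate without an input
        examples.append({
            "span_id": span[SPAN_ID],
            "input": span[INPUT],
            "output": span.get(OUTPUT),
            "expected_output": None,  # to be filled in by a reviewer
            "label": None,            # to be filled in by a reviewer
        })
    return examples


def apply_annotations(examples, annotations):
    """Merge reviewer annotations (keyed by span_id) into the examples."""
    return [{**ex, **annotations.get(ex["span_id"], {})} for ex in examples]


def to_jsonl(examples):
    """Serialize examples as JSON Lines, one example per line."""
    return "\n".join(json.dumps(ex) for ex in examples)


# Usage with made-up records standing in for exported spans.
spans = [
    {SPAN_ID: "a1", INPUT: "What is Phoenix?", OUTPUT: "A tracing tool."},
    {SPAN_ID: "b2", INPUT: None, OUTPUT: None},  # skipped: no input
]
annotations = {
    "a1": {
        "label": "incorrect",
        "expected_output": "An AI observability and evaluation tool.",
    }
}
golden = apply_annotations(spans_to_examples(spans), annotations)
print(to_jsonl(golden))
```

Keying annotations by span ID keeps the reviewed labels joinable with the original traces, so the same records can feed a golden dataset or be attached to spans later.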

We are also currently working on #2017, which could serve as a sort of annotation queue. Happy to run you through a preview in the coming weeks if you're interested.