Open axiomofjoy opened 3 months ago
Because document evals and span evals are attached to different spans, this is essentially blocked unless we can apply a filter on the root span's evals and still include its sub-spans (i.e. where the doc evals are) when calculating document evaluation summaries.
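To make the workaround above concrete, here is a minimal sketch in plain Python. It is not Phoenix code: `Span`, `span_evals`, `doc_evals`, and `doc_eval_summary` are hypothetical names, and the data model (span evals on the root span, doc evals on a child span) is an assumption made for illustration. The idea is to evaluate the filter against each span's root, then aggregate document evals from every span in the traces that match.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Optional


@dataclass
class Span:
    # Hypothetical, simplified span record; not the Phoenix data model.
    span_id: str
    parent_id: Optional[str]
    span_evals: dict = field(default_factory=dict)  # eval name -> score
    doc_evals: list = field(default_factory=list)   # per-document scores


def doc_eval_summary(
    spans: list, root_predicate: Callable[[Span], bool]
) -> Optional[float]:
    """Mean document-eval score over all spans whose ROOT span passes the filter."""
    by_id = {s.span_id: s for s in spans}

    def root_of(span: Span) -> Span:
        # Walk parent links until we reach the span with no parent.
        while span.parent_id is not None:
            span = by_id[span.parent_id]
        return span

    scores = [
        score
        for s in spans
        if root_predicate(root_of(s))
        for score in s.doc_evals
    ]
    return mean(scores) if scores else None


# Two traces: the filter targets root spans, the doc evals sit on child spans.
spans = [
    Span("root-1", None, span_evals={"correctness": 0.0}),
    Span("retriever-1", "root-1", doc_evals=[1.0, 0.0]),
    Span("root-2", None, span_evals={"correctness": 1.0}),
    Span("retriever-2", "root-2", doc_evals=[1.0, 1.0]),
]

# Summarize document evals only for traces whose root span was judged incorrect.
summary = doc_eval_summary(
    spans, lambda root: root.span_evals.get("correctness", 1.0) < 0.5
)
print(summary)  # 0.5
```

In this sketch the filter never has to match the retriever span itself, which is the property the summary calculation would need.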
Users debugging a RAG application want to filter their spans to determine whether a fault lies with the retriever or the LLM. We need a way for users to filter on document evaluations using our query DSL, so they can write filters such as:
https://arize-ai.slack.com/archives/C04R3GXC8HK/p1722442515198979?thread_ts=1722436956.324489&cid=C04R3GXC8HK