-
**Is your feature request related to a problem? Please describe.**
Annotate via phoenix app to build golden datasets or manual evals
**Describe the solution you'd like**
Was wondering if span o…
-
**Describe the bug**
The Similarity Evaluator returns NaN because the first token of the model response is 'Text' and promptflow is not able to convert the text to an integer.
**How To Reproduce the bug**
St…
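The failure mode above (a numeric score prefixed by words, which breaks a naive `int(response)` conversion) can be worked around with a more tolerant parser. This is a generic sketch, not promptflow's actual implementation; the helper name `parse_score` is hypothetical.

```python
import math
import re


def parse_score(response: str) -> float:
    """Extract the first numeric token from an LLM evaluation response.

    Hypothetical helper: models sometimes prefix the score with words
    (e.g. "Text: 4"), which makes a direct int()/float() cast fail.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", response)
    if match is None:
        # No number anywhere in the response: surface NaN explicitly
        # rather than raising, so downstream aggregation can skip it.
        return float("nan")
    return float(match.group())


print(parse_score("Text: 4"))  # 4.0
```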
-
Each panel in the dashboard has a hard-coded eval to extract as human-readable time from epoch time. Build a macro that can accomplish this and update the dashboards to use it.
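The epoch-to-human-readable conversion the proposed macro would centralize can be sketched as follows. This is a generic Python illustration of the conversion itself, not the dashboard macro syntax; the function name `epoch_to_human` and the output format are assumptions.

```python
from datetime import datetime, timezone


def epoch_to_human(epoch: float) -> str:
    """Convert a Unix epoch timestamp to a human-readable UTC string."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S"
    )


print(epoch_to_human(0))  # 1970-01-01 00:00:00
```

Centralizing this in one macro (or helper) means a format change touches a single definition instead of every panel.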
-
Hi, thank you for the wonderful paper and codebase! I had one clarification question: it looks like there is an extra set of forward passes for the SigLIP ViT blocks - is this intentional for the sigl…
-
I suspect that I need to specify evaluator_config for evaluate in order to map the data from the target response, but there's no example of it in the docstring or in https://pypi.org/project/promptflo…
-
I have been trying to extract data (title, question answered, entities, summary) from document chunks.
I believed typed predictors would be good for this, but I keep running into "Too many retrie…
-
### Short description and motivation for the proposed feature
Idea from @Ricram2: doing `EVALUATE * FROM` would yield a table with all compatible accuracy metrics for the model being evaluated.
### …
-
get_qa_with_reference always returns None, whereas get_retrieved_documents works fine.
I always get "No spans found."
File "C:\anaconda3\Lib\site-packages\phoenix\evals\classify.py", line 354, …
-
In the config, if I've defined two prompts or two providers, I see the side-by-side results in the Web UI.
What about the situation where I or someone else has run an eval on a single prompt or prov…
-
Details here: https://github.com/openai/evals
Products that implement evals get priority access to GPT-4.