-
Description TBD -- Adding this to keep track
-
Related to #575. We discussed that it should be easy to populate the group_logs evals table row by row from different evaluation tasks.
-
I like to write my applications as modules, and I like to place my tests inside those modules so that I can easily import code (without needing the module on the path). Lots of test frameworks suppo…
-
# LLM Task-Specific Evals that Do & Don't Work
Evals for classification, summarization, translation, copyright regurgitation, and toxicity.
[https://eugeneyan.com/writing/evals/](https://eugeneyan.c…
-
Hello, I don't know what I'm doing wrong. I received the error indicated in the title.
My input was as shown on this website:
[Hugging Face - Ger-RAG-eval](https://huggingface.co/da…
-
@gmittal
Currently, the `save_predictions` flag saves query similarity predictions to a JSON file. However, I would like a separate flag to save the computed embeddings, say as a…
-
Can we add an Evals span type for OpenAI tracing when the Phoenix Evals library runs? We use it to track OpenAI calls for the Evals library, and it would be great to show the span types.
-
List of tasks. These were initially drawn up for Omega, but they can be adapted as functions for the purpose of this work:
* [x] Convert RGB image to gray scale with configurable weights [see](https://git…
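
As a rough illustration of the first task, a grayscale conversion with configurable weights might look like the sketch below. The function name `rgb_to_gray`, the nested-list image representation, and the Rec. 601 default weights are assumptions made for illustration; none of them come from the linked code.

```python
# Hypothetical sketch: RGB -> grayscale with configurable channel weights.
# Rec. 601 luma weights are used as an assumed default.
DEFAULT_WEIGHTS = (0.299, 0.587, 0.114)


def rgb_to_gray(image, weights=DEFAULT_WEIGHTS):
    """Convert rows of (R, G, B) tuples into rows of gray values.

    The weights are normalized to sum to 1, so the output stays in the
    same value range as the input channels.
    """
    if len(weights) != 3:
        raise ValueError("weights must have exactly three entries (R, G, B)")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    wr, wg, wb = (w / total for w in weights)
    return [[wr * r + wg * g + wb * b for (r, g, b) in row] for row in image]
```

Passing `weights=(1, 1, 1)` gives a plain channel average, which is one way the weights could be made configurable per call.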
-
### Problem Description
Hi, everyone! I checked the code of the function `sampling_estimate`.
Assume we have a data instance `x` with `M` features.
- We keep the 1st to j-th feature as original, replace …
-
OpenAI Evals is a very popular solution for working with evals, and it is growing every day (just look at the repo stars).
You should have a clear way to show how to integrate your product with the…