evals Search Results - Githubissues

AnswerDotAI/bert24 #43

EVAL: Long-context evals

Description TBD -- Adding this to keep track

bclavie updated 2 weeks ago

mozilla/firefox-translations-training #697

Publish group logs evals in online mode

Related to #575. We discussed that it should be easy to populate the group_logs evals table row by row from different evaluation tasks.

eu9ene updated 1 week ago

UKGovernmentBEIS/inspect_ai #48

Support for evals within a module

I like to write my applications as modules, and I like to place my tests inside my modules so that I can easily import code (and not have to have the module on the path). Lots of test frameworks suppo…

adrianlyjak updated 2 weeks ago

eugeneyan/eugeneyan-comments #82

https://eugeneyan.com/writing/evals/

# LLM Task-Specific Evals that Do & Don't Work Evals for classification, summarization, translation, copyright regurgitation, and toxicity. [https://eugeneyan.com/writing/evals/](https://eugeneyan.c…

utterances-bot updated 3 weeks ago

huggingface/lighteval #211

Dataset loading issue for german_rag_evals on Windows

Hello, I don't know what I'm doing wrong. I received the following error as indicated in the title. My input was as shown on this website: [Hugging Face - Ger-RAG-eval](https://huggingface.co/da…

Pommel4711 updated 2 days ago

embeddings-benchmark/mteb #824

Add support for saving embeddings in evals

@gmittal Currently, the `save_predictions` flag allows for the saving of query similarity predictions to a json file. However, I wish to have a separate flag to save the embeddings computed, say as a…

dhruvbpai updated 1 week ago

Arize-ai/phoenix #1739

[ENHANCEMENT] : Evals Span Type

Can we add an Evals span type for OpenAI tracing when the Phoenix Evals library runs? We are using it to track OpenAI calls for the Evals library, and it would be great to show the types of spans.

jlopatec updated 3 weeks ago

haesleinhuepf/human-eval-bia #1

Ideas for evals

List of tasks, these were initially drawn for Omega, but can be adapted as functions for the purpose of this work: * [x] Convert RGB image to gray scale with configurable weights [see](https://git…

royerloic updated 2 months ago

shap/shap #3654

Questions: question about SamplingExplainer

### Problem Description Hi, everyone! I checked the code of the function `sampling_estimate`. Assume we have a data instance `x` with `M` features. - We keep the 1st to j-th features as original, replace …

vectorsss updated 1 month ago

zeno-ml/zeno-hub #733

Integration with OpenAI Evals

OpenAI Evals is a very popular solution for working with evals, and it's growing each day (just look at the repo stars). You should have a clear way to show how to integrate your product with the…

JuanmaMenendez updated 2 months ago

1000+ results for evals
