-
Hi folks,
I am trying to run HF-Leaderboard (v2) evals locally, and according to the blog https://huggingface.co/spaces/open-llm-leaderboard/blog, the scores are normalized and random prediction acc…
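For context, my reading of the normalization that blog describes (worth verifying against the post itself, since the excerpt is cut off) is roughly:

$$
\text{normalized} = \max\!\left(0,\ \frac{\text{raw accuracy} - \text{random baseline}}{1 - \text{random baseline}}\right) \times 100
$$

so a model that guesses randomly scores 0 and a perfect model scores 100.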
-
Not 100% sure, but right now it seems that if I update a prompt fragment (or several), the changes get propagated to other producers. This is slightly problematic for evals, because ideally we'd have a …
-
I did some smaller benchmarks (more like tests, really) and would like to continue with this endeavor to evaluate capabilities and weak spots.
It would also be interesting to test on codegen tasks vs …
-
Form for frosh to list the social events they have attended and to provide other comments for reference during the six-week period
-
## Description
How evaluation results are delivered is crucially important. This spike covers what a "model card" would look like for evaluating a model against our framework. The "model card" sh…
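Purely as a hypothetical illustration of the kind of data such a card might aggregate (every name below is my own guess, not the spike's spec), a minimal sketch:

```julia
# Illustrative only: a bare-bones container for eval results on a "model card".
# All field names here are assumptions, not part of the framework's spec.
struct ModelCard
    model_name::String
    framework_version::String
    scores::Dict{String,Float64}  # benchmark name => (normalized) score
    notes::String                 # caveats, known weak spots, run conditions
end
```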
-
This is a major feature release.
Spec: https://github.com/MadcowD/ell/blob/cd64ab9bb0d3a09195fef7a32ef77ac5d7e6c912/docs/ramblings/evalspec.md
Ramblings: https://github.com/MadcowD/ell/blob/cd64ab9…
-
Hey, your work is excellent! But I have a question about your sample_pipline.py: you construct a sample_pipline object but never call it, and the path parameter is missing. Are you missing this part of the…
-
Hi there,
maybe I am missing something completely, but I do not understand the meaning of `samples` and `evals` for the benchmarks.
If I run this code in the REPL
```
using BenchmarkTools
BenchmarkTo…
```
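In case it helps, my understanding of these two BenchmarkTools.jl parameters (worth double-checking against the package docs):

```julia
using BenchmarkTools

# `evals`   = how many times the expression is executed inside ONE timing
#             measurement; the reported time is that measurement divided by
#             `evals`. Raising it helps when a single call is faster than the
#             clock's resolution.
# `samples` = how many such timing measurements are collected in total; the
#             reported statistics (min/median/mean) are taken over the samples.
b = @benchmark sin(x) setup=(x = rand()) samples=200 evals=50
# => 200 samples, each one timing 50 back-to-back evaluations of sin(x)
```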
-
**Describe the bug**
Unable to generate a share URL. The Share button keeps showing infinite processing.
**To Reproduce**
Steps to reproduce the behavior, including example Promptfoo configurations if …