evals Search Results - Githubissues

1000+ results for evals

Best match

huggingface/lighteval #198

The helm|piqa task is generative but has generation_size=-1.

The `helm|piqa` task listed in `tasks_table.jsonl` here: https://github.com/huggingface/lighteval/blob/a98210fd3a2d1e8bface1c32b72ebd5017173a4c/src/lighteval/tasks/tasks_table.jsonl#L797C1-L797C472. …

yonatano updated 4 days ago · 3 comments
promptfoo/promptfoo #979

How to calculate something like F1 score?

If we are running classification tasks with an LLM, how can we calculate overall precision, recall, and F1 score from the evals? It is not clear whether derived metrics allow us to do that. Any suggestions?…

manojlds updated 3 weeks ago · 1 comment
hashicorp/nomad-autoscaler #584

Autoscaler Nomad Plugin Doesn't Take Into Consideration CPU …

Hey everyone, this is an awesome project! However, in using this we found a small issue with the Nomad APM plugin. Nomad now exposes the below metrics: nomad.nomad.blocked_evals.cpu nomad.nomad.bl…

rorylshanks updated 3 weeks ago · 3 comments
HazyResearch/hyena-dna #62

Bugs when I try to access the embeddings

Hi, I hit a bug when accessing the embeddings from hyenaDNA, specifically in this code: /evals/hg38_inference.py Traceback (most recent call last): File "/gpfs/gibbs/pi/zhao/tl688/hyena-dna/evals/hg…

HelloWorldLTY updated 1 month ago · 3 comments
facebookresearch/jepa #43

KeyError when running evals.main

Thanks for your brilliant work! Having downloaded the K400 pretrained checkpoint file (k400-probe.pth.tar) and modified the config YAML file for the corresponding dataset (specifying datapath), I ran evals.…

JPerAsperaadAstra updated 3 months ago · 1 comment
Arize-ai/phoenix #2299

add evals annotations to DSPy notebook

DSPy provides its own set of evaluation methods for evaluating compiled DSPy modules on dev sets, e.g., exact answer match and relevance. We can add these evaluations as annotations via `log_evaluatio…

axiomofjoy updated 2 months ago · 1 comment
codestoryai/swe_bench_traces #1

Where are the rest of the runs, and how do you get your accu…

Hi, cool project :) I took a look at the evals and noticed that there are only 127 eval files. Further, only 107 of them seem to pass the tests. Would it be possible for you to post the rest of th…

Naqu6 updated 1 week ago · 2 comments
carolinehays/SNAP-UX-Thesis #3

SNAP Evals

- Guide for assessing Food stamp app forms 2003 http://www.fns.usda.gov/sites/default/files/assessment-guide.pdf
- Guide for assessing online apps http://www.fns.usda.gov/sites/default/files/snap/Be…

carolinehays updated 8 years ago · 2 comments
carolinehays/SNAP-UX-Thesis #5

Ux Evals

carolinehays updated 8 years ago · 12 comments
microsoft/promptflow #3382

[BUG] Function `evaluate(...)` from promptflow.evals.evaluat…

**Describe the bug** I followed the example in MSDocs [Evaluate on test dataset using `evaluate()`](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/develop/flow-evaluate-sdk#evaluate-on-test…

megel updated 5 days ago · 4 comments
