-
I'm using lm-eval v0.4.2 to evaluate Llama 7B on the Open LLM Leaderboard benchmarks.
I found accuracy gaps between single-GPU and multi-GPU runs, as shown below (I used data parallelism).
| |…
-
Since https://github.com/vllm-project/vllm/pull/3065 landed, the eval suite https://github.com/EleutherAI/lm-evaluation-harness has been broken.
Repro (this should be run on 2 A100s or H100s to make sure the Mi…
-
I have a question about evaluating LLMs on multiple-choice questions using token log-likelihoods. Based on existing implementations (e.g. for MMLU), the code snippet would look like this:
```
# Create the model
…
```
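For context, the usual multiple-choice scoring sums the model's log-probabilities over each answer choice's tokens and picks the choice with the highest total (harnesses also report a length-normalized variant to reduce bias toward short answers). A minimal sketch with made-up per-token log-probs standing in for real model outputs:

```python
# Hypothetical per-token log-probabilities for each answer choice,
# as a harness would obtain them by running model(prompt + choice).
# The numbers below are invented for illustration only.
choice_token_logprobs = {
    "A": [-0.2, -1.1],
    "B": [-2.3, -0.9, -1.5],
    "C": [-1.8, -2.2],
    "D": [-3.0, -0.4],
}

def score_choices(token_logprobs):
    """Return summed and length-normalized log-likelihood per choice."""
    scores = {}
    for label, lps in token_logprobs.items():
        total = sum(lps)
        scores[label] = {"sum": total, "per_token": total / len(lps)}
    return scores

scores = score_choices(choice_token_logprobs)
# Raw-sum argmax corresponds to `acc`; the per-token (length-normalized)
# argmax corresponds to `acc_norm` in lm-evaluation-harness-style reporting.
pred = max(scores, key=lambda k: scores[k]["sum"])
pred_norm = max(scores, key=lambda k: scores[k]["per_token"])
print(pred, pred_norm)
```

The two argmaxes can disagree when choices have very different token counts, which is why both metrics are commonly reported.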
-
Thank you to the team for the great work. I have a question: can you please help me use lighteval to evaluate a model on a single sample?
For example, if I have an input from mmlu I, my model gene…
-
### Prerequisites
- [X] I have searched the [issues](https://github.com/open-compass/opencompass/issues/) and [discussions](https://github.com/open-compass/opencompass/discussions) but did not get the help I expected.
- [X] The bug is present in the [latest version](https://github.com/open-…
-
Hi. I realized that accelerate launch works perfectly when I set batch_size = "auto" but gets stuck at the very end when I use batch_size = "auto:2". The problem persists whether I use evaluator.simpl…
-
Hi folks, thanks for creating the dataset.
In your paper and the dataset card, you claim that MMLU-Pro has 10 choices for each question, which appears to be incorrect.
By opening the Viewer tab, and select…
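One way to check this claim is to count the length of the options list across the split. A minimal sketch: the `rows` below are hypothetical records standing in for the real data, which one would normally load with `datasets.load_dataset("TIGER-Lab/MMLU-Pro", split="test")`:

```python
from collections import Counter

# Hypothetical records standing in for the real dataset rows; in practice:
#   from datasets import load_dataset
#   rows = load_dataset("TIGER-Lab/MMLU-Pro", split="test")
rows = [
    {"question": "q1", "options": ["opt"] * 10},
    {"question": "q2", "options": ["opt"] * 4},   # some questions have fewer options
    {"question": "q3", "options": ["opt"] * 10},
]

# Distribution of answer-choice counts across questions
dist = Counter(len(r["options"]) for r in rows)
print(dict(dist))  # {10: 2, 4: 1} -> not every question has 10 choices
```

If the real split yields anything other than a single key of 10, the "10 choices per question" claim does not hold uniformly.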
-
**Describe the bug**
I ran `ilab model train`, but `ilab model test` failed with
```
NOTE: Adapter file does not exist. Testing behavior before training only. - /Users/ahmedazraq/Library/Application…
```
-
### Prerequisites
- [X] I have searched the [issues](https://github.com/open-compass/opencompass/issues/) and [discussions](https://github.com/open-compass/opencompass/discussions) but did not get the help I expected.
- [X] The bug is present in the [latest version](https://github.com/open-com…
-
Hello,
I've been trying with different LLMs, but I haven't been able to make it work. Could you shed some light on this?
```shell
luispoveda93@LUIS-PC:~/mlmm-evaluation$ bash scripts/run.sh es micro…