mmlu Search Results - Githubissues

880 results
for mmlu

EleutherAI/lm-evaluation-harness #2296

Low GPU Utilization During Multi-GPU evaluation - Efficiency…

Hello, I want to express my gratitude for your outstanding work. The powerful lm-evaluation-harness and your continuous maintenance have made LLM-evaluation much more convenient. However, I hav…

yang3121099 updated 13 hours ago
1
huggingface/lighteval #61

Add single `mmlu` config for `lighteval` suite

Currently it seems that to run MMLU with the `lighteval` suite, one needs to specify all the subsets individually as is done for leaderboard task set [here](https://github.com/huggingface/lighteval/bl…

lewtun updated 6 months ago
1
unslothai/unsloth #493

Support for Octopus LLM

Hi guys! It would be nice to add support for Octopus LLMs, or are there any alternatives? The MMLU score of Octopus v4 is 74.8% under 5-shot, very impressive for such a small model! Octopus is based on P…

avcode-exe updated 3 months ago
4
open-compass/opencompass #1181

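[Bug] The mode parameter has no effect when evaluating via API; inputs exceeding max_seq_len are not split according to mode

### Prerequisites - [X] I have searched the [issues](https://github.com/open-compass/opencompass/issues/) and [discussions](https://github.com/open-compass/opencompass/discussions) but did not get the expected help. - [X] The bug is in the [latest version](https://github.com/open-com…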

wz0424 updated 2 months ago
3
microsoft/onnxruntime-genai #590

Memory leak during back-to-back inferences

I am experiencing a memory leak while running my application, which is to run an MMLU accuracy test on my Radeon 780M iGPU via DirectML. Each inference adds tens-hundreds of megabytes to the total …

jeremyfowers updated 5 days ago
19
homebrewltd/research #15

Epic: Test script

We need to set up the test script for our training pipeline. - Data generation: @hungphongtrn - [ ] Check the generated audio (audio matches the prompt) - [ ] Check the integrity of audio files wi…

hahuyhoang411 updated 1 month ago
1
EleutherAI/lm-evaluation-harness #2121

RecursionError: maximum recursion depth exceeded while calli…

When I run the code "CUDA_VISIBLE_DEVICES=3 TRANSFORMERS_OFFLINE=1 lm_eval --model hf --model_args pretrained=/public/MountData/yaolu/LLM_pretrained/LLAMA2_7B/,trust_remote_code=True --tasks mmlu,cm…

yaolu-zjut updated 1 month ago
2
artidoro/qlora #181

How were numbers in Table 5 generated?

Hi, reading the [QLoRA paper](https://arxiv.org/pdf/2305.14314.pdf), you folks are reporting the results on the MMLU test set in Table 5: ![image](https://github.com/artidoro/qlora/assets/44957968/cffd7c…

kogolobo updated 1 year ago
1
EleutherAI/lm-evaluation-harness #2255

What are `mmlu_continuation` and `mmlu_generative`?

What are `mmlu_continuation` and `mmlu_generative`? Where can I find their description? I am going to test `mmlu` in the `cloze` way. Like the following illustration: ![image](https://github.com/…

shizhediao updated 2 weeks ago
5
EleutherAI/lm-evaluation-harness #2069

Evaluate Gemma with Chat Template

Hi, I'm trying to evaluate `gemma-it` models from Hugging Face on MMLU. When I set `--apply_chat_template --fewshot_as_multiturn`, the tokenizer will raise an error below. This is because Gemma does n…

pyf98 updated 1 week ago
3
