mmlu Search Results - Githubissues

1000+ results
for mmlu

modelscope/evalscope #156

Error when evaluating a base model

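1. First, I want to evaluate the qwen2.5-3B base model. 2. For a base model, can the template simply be set to qwen, or does it need to be generation? 3. The model has already been downloaded and is loaded locally. 4. Without internet access, I need to evaluate on the local cmmlu, ceval, and mmlu datasets, which are already downloaded (the csv files under mmlu-test). 5. Evaluating a base model requires few_shot; does it need to be set here? Can it be set? If it cannot, how should I add it? 6.…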

yawzhe updated 1 week ago
2
jxiw/MambaInLlama #11

Why doesn’t kl_div ignore -100 in pseudo labels?

The original code looks like this: `kl_loss = F.kl_div(F.log_softmax(student_logits, dim=-1), targets, reduction='batchmean')` Although the relative loss curve is the…

yynil updated 1 month ago
7
AkihikoWatanabe/paper_notes #1076

Take a Step Back: Evoking Reasoning via Abstraction in Large…

# URL - https://arxiv.org/abs/2310.06117 # Affiliations - Huaixiu Steven Zheng, N/A - Swaroop Mishra, N/A - Xinyun Chen, N/A - Heng-Tze Cheng, N/A - Ed H. Chi, N/A - Quoc V Le, N/A - Den…

AkihikoWatanabe updated 7 months ago
2
EleutherAI/lm-evaluation-harness #1362

Hello, I would like to know if there is a method to use "gen…

noforit updated 9 months ago
1
rmusser01/tldw #316

ModuleNotFoundError: No module named 'datasets'

**Are You on the Latest version?** You did a git pull and are running the latest version/build? Yes! **Please describe the bug** Upon double-clicking Windows_Run_tldw.bat, an error message appea…

IvanIVGrozny updated 1 month ago
1
AkihikoWatanabe/paper_notes #1467

What Matters in Transformers? Not All Attention is Needed, S…

# URL - https://arxiv.org/abs/2406.15786 # Affiliations - Shwai He, N/A - Guoheng Sun, N/A - Zheyu Shen, N/A - Ang Li, N/A # Abstract - While scaling Transformer-based large language models …

AkihikoWatanabe updated 2 weeks ago
2
homebrewltd/ichigo #59

experiment: Segmented Training to Recover MMLU

## Problem From @tikikun After benchmarking the pretraining checkpoint on MMLU, we observed a significant degradation in the model's text capabilities. The introduction of new multilingual data c…

0xSage updated 1 month ago
12
instructlab/sdg #259

No feedback from ilab data generate

**Describe the bug** When I run `ilab data generate` there is no update or output like 0.17.1. ``` (venv-instructlab-3.11) ➜ instructlab ilab data generate INFO 2024-08-08 16:00:04,437 numexpr.utils…

jjasghar updated 2 months ago
7
microsoft/promptbase #5

Clarification needed in evaluation numbers

Hello, Thanks for the repo and awesome work. I am requesting clarification on the evaluation results shown in the repo. For humanEval Zero shot, GPT-4's score is reported here as 87.4 but in th…

saurabhkumar8112 updated 10 months ago
5
hiyouga/LLaMA-Factory #5881

How do I evaluate my own dataset offline?

### Reminder - [X] I have read the README and searched the existing issues. ### System Info There is inference code, but it performs online inference. So far I have only seen eval code for mmlu_test, ceval_validation, and cmmlu_test, so how can I run an offline eval on my own custom dat…

GasolSun36 updated 2 days ago
3
