-
### Describe the feature
Why does the dataset you provide have so many versions? Has the content been updated, or is there another reason? How do I decide which version to use when evaluating a model? Thanks
![image](https://github.com/user-attachments/assets/e1107a05-5add-4a63-8680-7e8e3496720d)
### Would you like to implement this feature yourself?
- [ ] I would like to implement this feature myself…
-
In the `compute_accuracy` function in eval_mmlu.py, line 86 reads `if pred_answer is None: return 1`. However, if `pred_answer` is None, shouldn't the function return 0 ins…
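The suggested fix can be sketched as follows. This is a minimal illustration, not the actual `compute_accuracy` implementation; the helper name and the `gold_answer` parameter are hypothetical.

```python
def score_prediction(pred_answer, gold_answer):
    """Score a single prediction: 1 for a match, 0 otherwise.

    A prediction that could not be parsed (pred_answer is None)
    cannot be correct, so it should count as 0 rather than 1.
    """
    if pred_answer is None:
        return 0  # unparseable prediction counts as wrong
    return 1 if pred_answer == gold_answer else 0
```

With this change, failed answer extraction lowers accuracy instead of inflating it.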
-
**❗BEFORE YOU BEGIN❗**
Are you on discord? 🤗 We'd love to have you asking questions on discord instead: https://discord.com/invite/a3K9c8GRGt
**Describe the bug**
I have followed the page of "htt…
-
command:
```
accelerate launch run_evals_accelerate.py --model_args="Llama-2-7b-chat-hf-8bit,quantization_config="load_in_8bit=True"" --tasks "helm|hellaswag|1|0" -- --output_dir ./evalscratch
```
Resul…
-
Hi guys, thanks for sharing this high-quality, hackable codebase!
I was just wondering how a 7B LLaMA trained on 1T DCLM tokens can achieve a 60+ MMLU score.
To my best knowledge, it should consume enough FLOPs (like …
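The compute budget the question alludes to can be roughed out with the common 6·N·D approximation (an assumption on my part; the thread does not state which rule it uses):

```python
# Rough training-compute estimate using the common 6 * N * D
# approximation (~6 FLOPs per parameter per training token).
# Numbers below are the ones from the question: a 7B-parameter
# model trained on 1T tokens.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs via the 6*N*D rule of thumb."""
    return 6.0 * n_params * n_tokens

flops = training_flops(7e9, 1e12)
print(f"{flops:.2e}")  # roughly 4.2e22 FLOPs
```

That puts the 7B/1T run at about 4.2e22 training FLOPs, which is the scale the question is asking about.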
-
Hi,
1.
When I use the command on 8 gpus:
```
python3 qalora.py --model_path $llama_7b_4bit_g32
```
it will show the error:
```
File "/home/shawn/anaconda3/envs/qalora/lib/python3.8/site-pa…
-
### System Info
GPUs: A100, 4 GPUs (40 GB memory)
Release: tensorrt-llm 0.9.0
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scrip…
-
Currently it seems that to run MMLU with the `lighteval` suite, one needs to specify all the subsets individually as is done for leaderboard task set [here](https://github.com/huggingface/lighteval/bl…
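Until the suite exposes an aggregate MMLU task, the subset list could be generated programmatically rather than typed by hand. This is a hypothetical sketch: the subset names shown and the `suite|task|fewshot|truncate` string format are assumptions based on how the leaderboard task set is written, not a documented lighteval API.

```python
# Hypothetical helper: build a comma-separated lighteval --tasks string
# covering every MMLU subset instead of listing them individually.
# Subset names and the "suite|task|fewshot|truncate" format are
# assumptions modeled on the leaderboard task set.

MMLU_SUBSETS = [
    "abstract_algebra",
    "anatomy",
    "astronomy",
    # ... remaining MMLU subsets ...
]

def mmlu_task_string(suite: str = "leaderboard", few_shot: int = 5) -> str:
    """Join one task spec per subset into a single --tasks argument."""
    return ",".join(
        f"{suite}|mmlu:{subset}|{few_shot}|0" for subset in MMLU_SUBSETS
    )

print(mmlu_task_string())
```

The resulting string can then be passed as the `--tasks` argument in one go.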
-
Hi, reading the [QLoRA paper](https://arxiv.org/pdf/2305.14314.pdf), you report results on the MMLU test set in Table 5:
![image](https://github.com/artidoro/qlora/assets/44957968/cffd7c…
-
### Describe the feature
Please share the code logic for configuring the mmlu_pro dataset~
### Would you like to implement this feature yourself?
- [ ] I would like to implement this feature and contribute the code to OpenCompass!