mmlu Search Results - Githubissues

1000+ results
for mmlu

hiyouga/LLaMA-Factory #5881

How do I run offline eval on my own dataset?

### Reminder - [X] I have read the README and searched the existing issues. ### System Info There is inference code, but that inference runs online. So far I have only seen eval code for mmlu_test, ceval_validation, and cmmlu_test, so how do I run offline eval on my own custom dat…

GasolSun36 updated 3 days ago
3
declare-lab/instruct-eval #4

Add zero-shot evaluation results

Hi all, I read the code and realized that the results were obtained from 3-shot demonstrations. However, some models were trained to follow instructions without demonstrations. These models may have b…

LeeShiyang updated 1 year ago
1
irthomasthomas/undecidability #903

[2310.02170] Dynamic LLM-Agent Network: An LLM-agent Collabo…

- [ ] [[2310.02170] Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization](https://arxiv.org/abs/2310.02170) # [2310.02170] Dynamic LLM-Agent Network: An LLM-…

ShellLM updated 2 months ago
1
TsinghuaC3I/Intuitive-Fine-Tuning #2

Some questions about the experiments

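Hello, thanks for sharing the IFT code. I ran some experiments and have a few questions. 1. First, I used only the embedding fusion part and found that gsm8k and truthfulqa improved, while the rest stayed about the same. 2. Then I added dynamic relation propagation and found that some metrics improved, but gsm8k and mmlu both did poorly. 3. I noticed that the paper uses a learning rate of 5e-7, whereas I had previously set 2e-5, so I adjusted it…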

peterjc123 updated 4 months ago
10
modelscope/evalscope #154

Why can't the dataset be loaded locally when running the eval code?

### examples/train_lora/llama3_lora_eval.yaml ### model model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct ### method finetuning_type: full ### dataset task: mmlu_test template: fewsho…

yawzhe updated 2 weeks ago
1
redhat-et/ilab-on-ocp #45

Add Final evaluation tasks, mt_bench_branch & mmlu_branch

Now that 2nd phase of training is complete, we can add the final evaluation tasks for the candidate model.

sallyom updated 4 weeks ago
2
modelscope/evalscope #147

Error when evaluating a base model (qwen2-7b-chat)

Why can't I use the humaneval evaluation set? **Script:** ASCEND_RT_VISIBLE_DEVICES=0 \ swift eval \ --model_type qwen-7b-chat \ --eval_dataset humaneval \ --infer_backend pt \ --eval_backend Native \ **…

ljh567 updated 6 hours ago
2
mlfoundations/dclm #86

Using Evaluation Prompts to Inform Data Selection

In the DataComp paper (original one for VLM's), some of the heuristics were based on features from the datasets that were used for evaluations. Is this permitted in the filtering track for DCLM? For e…

arnavmdas updated 1 week ago
1
EleutherAI/lm-evaluation-harness #1829

eval gsm8k from local dataset folder with the bug info "Valu…

I have the same problem as this issue ( https://github.com/EleutherAI/lm-evaluation-harness/issues/1347 ). I just want to eval gsm8k from a local dataset folder, as the web in China can't access h…

Jp-17 updated 1 month ago
4
open-compass/opencompass #1304

Evaluation run returns empty results [Bug]

### Prerequisites - [X] I have searched the [issues](https://github.com/open-compass/opencompass/issues/) and [discussions](https://github.com/open-compass/opencompass/discussions) but did not get the help I expected. - [X] The bug in the [latest version](https://github.com/open-com…

badmic updated 2 months ago
4
