-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
There is inference code, but it performs online inference. So far I have only seen eval code for mmlu_test, ceval_validation, and cmmlu_test, so how can I run offline eval on a self-defined dat…
-
Hi all, I read the code and realized that the results were obtained from 3-shot demonstrations. However, some models were trained to follow instructions without demonstrations. These models may have b…
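
To make the few-shot versus zero-shot distinction concrete, here is a minimal sketch of how a k-shot prompt might be assembled. The `build_prompt` helper, the `Question:`/`Answer:` layout, and the toy demonstrations are all hypothetical and are not taken from the repository under discussion; setting `k=0` yields the demonstration-free prompt that instruction-tuned models may expect.

```python
from typing import Sequence, Tuple


def build_prompt(question: str,
                 demos: Sequence[Tuple[str, str]] = (),
                 k: int = 0) -> str:
    """Prepend the first k (question, answer) demonstrations.

    k=0 produces a zero-shot prompt containing only the target question.
    The prompt layout is an illustrative assumption, not a library format.
    """
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in demos[:k]]
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


# Toy demonstrations, purely for illustration.
demos = [("2 + 2 = ?", "4"), ("3 * 3 = ?", "9"), ("10 - 7 = ?", "3")]
zero_shot = build_prompt("5 + 6 = ?", demos, k=0)
three_shot = build_prompt("5 + 6 = ?", demos, k=3)
```

Running the same evaluation set with both `zero_shot`-style and `three_shot`-style prompts is one way to check whether the demonstrations help or hurt a given model.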
-
- [ ] [[2310.02170] Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization](https://arxiv.org/abs/2310.02170)
# [2310.02170] Dynamic LLM-Agent Network: An LLM-…
-
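Hello, thanks for sharing the IFT code. I ran some experiments and have a few questions.
1. First, I used only the embedding fusion part and found that gsm8k and truthfulqa improved, while the other benchmarks stayed roughly the same.
2. Then I added dynamic relation propagation and found that some metrics improved, but gsm8k and mmlu both did poorly.
3. I noticed that the learning rate in the paper is 5e-7, whereas I had previously set it to 2e-5; after adjusting it…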
-
### examples/train_lora/llama3_lora_eval.yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
### method
finetuning_type: full
### dataset
task: mmlu_test
template: fewsho…
-
Now that 2nd phase of training is complete, we can add the final evaluation tasks for the candidate model.
-
Why can't the humaneval evaluation set be used?
**Script:**
ASCEND_RT_VISIBLE_DEVICES=0 \
swift eval \
--model_type qwen-7b-chat \
--eval_dataset humaneval \
--infer_backend pt \
--eval_backend Native \
**…
-
In the DataComp paper (the original one, for VLMs), some of the heuristics were based on features from the datasets that were used for evaluations. Is this permitted in the filtering track for DCLM? For e…
-
I have the same problem as this issue (https://github.com/EleutherAI/lm-evaluation-harness/issues/1347).
I just want to eval gsm8k from a local dataset folder, as the web in China can't access h…
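
As a rough illustration of working from a local copy, the sketch below reads a GSM8K-style JSONL file with the standard library only and extracts the final numeric answer that follows the `####` marker in each `answer` field. It does not use or configure lm-evaluation-harness; the helper names `load_jsonl` and `final_answer` are hypothetical, and the file path is whatever local file you have downloaded.

```python
import json
import re


def load_jsonl(path):
    """Read one JSON object per non-empty line from a local file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def final_answer(answer_text):
    """Return the number after '####' with thousands separators removed.

    Returns None when no '####' marker followed by a number is found.
    """
    match = re.search(r"####\s*(-?\d[\d,]*(?:\.\d+)?)", answer_text)
    return match.group(1).replace(",", "") if match else None
```

Given records loaded with `load_jsonl`, calling `final_answer(record["answer"])` yields the gold answer string to compare against a model's prediction.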
-
### Prerequisites
- [X] I have searched the [issues](https://github.com/open-compass/opencompass/issues/) and [discussions](https://github.com/open-compass/opencompass/discussions) but did not get the expected help.
- [X] The bug in the [latest version](https://github.com/open-com…