-
### Prerequisites
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but did not get the expected help.
- [X] The bug is present in the [latest version](https://github.com/open-com…
-
Copied from [zimfarm/719](../../zimfarm/issues/719)
Some of our zim files have fairly long descriptions and we end up with a block of text. It would be convenient if in the Description field of…
-
The following sentences contain tokens that don't have multi-word token range annotations:
```
ERROR: Sentence GUM_reddit_macroeconomics-7 token 14 -- multi-word continuation without a multi-word to…
-
I tested Llama 2 13B and 70B on MMLU with version 4.0. My 5-shot result for 70B is 0.632, which is lower than the result reported in the paper (0.68).
13b 0-shot
hf (pretrained=/nas/lili/models_hf/13B-ch…
-
Hi, team. I tried to use your implementation to compute MMLU scores for some models, but I found that for some models the results are weird.
For example, for llama2-13b, the script I use to test its few sh…
-
These are instances of nouns (NN) and proper nouns (NNPS) marked as plurals (Number=Plur) where the lemma is the plural form. Each of these (on a case by case basis) should either:
1. use the si…
-
### Prerequisites
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but did not get the expected help.
- [X] The bug is present in the [latest version](https://github.com/open-com…
-
### Operating System
Windows
### TeX Distribution
TeXLive 2021 or later
### TeX Compiler
XeTeX
### zjuthesis Version
9.1.0
### MajorFormat
cs
### Degree
graduate
### Type
thesis
### Period
final
### BlindReview
…
-
Currently I am using hendrycksTest-* to measure performance on MMLU. However, scores are reported only for each individual task. It would be much more convenient if there were scores for each subcategor…
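
As a rough illustration of the aggregation being requested, the sketch below averages per-task accuracies into subcategory scores. It is not the harness's own implementation: the `TASK_TO_SUBCATEGORY` mapping, the `aggregate_by_subcategory` helper, and the example scores are all hypothetical, and a real mapping would need to cover every hendrycksTest-* task.

```python
from collections import defaultdict

# Hypothetical mapping from task name to subcategory (assumption, incomplete on purpose).
TASK_TO_SUBCATEGORY = {
    "hendrycksTest-college_physics": "physics",
    "hendrycksTest-high_school_physics": "physics",
    "hendrycksTest-college_mathematics": "math",
}


def aggregate_by_subcategory(task_scores, task_sizes=None):
    """Average per-task accuracy within each subcategory.

    task_scores: dict of task name -> accuracy in [0, 1].
    task_sizes:  optional dict of task name -> number of examples; when given,
                 each task is weighted by its size, otherwise tasks count equally.
    Tasks missing from TASK_TO_SUBCATEGORY are skipped.
    """
    totals = defaultdict(float)
    weights = defaultdict(float)
    for task, accuracy in task_scores.items():
        subcategory = TASK_TO_SUBCATEGORY.get(task)
        if subcategory is None:
            continue
        weight = task_sizes.get(task, 1) if task_sizes else 1
        totals[subcategory] += accuracy * weight
        weights[subcategory] += weight
    return {sub: totals[sub] / weights[sub] for sub in totals}


# Made-up accuracies, for illustration only.
example_scores = {
    "hendrycksTest-college_physics": 0.4,
    "hendrycksTest-high_school_physics": 0.6,
    "hendrycksTest-college_mathematics": 0.5,
}
print(aggregate_by_subcategory(example_scores))
```

Passing `task_sizes` switches from an unweighted mean over tasks to a mean weighted by example count; which of the two a reported subcategory score should use is a design choice left open here.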
-
# Checklist
- [x] I have used the search function to see if someone else has already submitted the same feature request.
- [x] I will only create one feature request per issue.
- [x] I will …