hkust-nlp/ceval
Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
https://cevalbenchmark.com/
MIT License
1.64k stars
78 forks
issues
How long does it take to get results after "Submit your results to C-Eval"?
#88
zellelin
closed
1 week ago
1
C-Eval leaderboard: the questionnaire has been filled out; when will the leaderboard be refreshed?
#87
Waneila
closed
3 weeks ago
1
Example code fails to load the dataset
#86
ningmenghongcha
closed
2 months ago
1
Error when uploading model evaluation results to the website: Any subject must contain at least 50 questions to be calculated
#85
mmarzl17
closed
2 months ago
1
Found several questions whose labeled answers seem incorrect
#84
zky001
opened
2 months ago
1
Why does the test set not include the correct answers?
#83
nanzhaogang
closed
3 months ago
1
Hello, when will the leaderboard be updated? We submitted three days ago; could you please update it?
#82
Zheng-Jay
closed
3 months ago
1
Request to be listed on the public leaderboard
#81
lingbaishun
closed
4 months ago
1
C-Eval leaderboard submission: when will the leaderboard be refreshed?
#80
Waneila
closed
5 months ago
1
Error when submitting test results; clicking "process" shows: Any subject must contain at least 50 questions to be calculated
#79
hailing2024
closed
5 months ago
1
Failed to send verification code. Please try again.
#78
Wumengyao
closed
8 months ago
18
Leaderboard Update
#77
better-one
closed
7 months ago
1
The leaderboard in GitHub is out of sync with the latest version on the official website
#76
zhimin-z
closed
8 months ago
1
Leaderboard Update
#75
13416157913
closed
10 months ago
0
HOW TO EVALUATE STEM???
#74
AkKari808
closed
11 months ago
1
Failed to send verification code. Please try again.
#73
Lukikay
closed
11 months ago
2
Leaderboard Update
#72
lijiahuan-01
closed
11 months ago
0
Fail to send verification code. Please try again
#71
Abigail61
closed
11 months ago
3
Roughly how long does it take from submitting an evaluation until the score appears on the C-Eval leaderboard?
#70
blueseasky
closed
11 months ago
0
When will the leaderboard be updated?
#69
matrix-yang
closed
11 months ago
1
Has anyone measured the test-set score of gpt-4-1106-preview?
#68
theblackcat102
opened
12 months ago
3
Leaderboard Update
#67
chn91127
closed
11 months ago
1
Leaderboard Update
#66
chn91127
closed
11 months ago
0
public display
#65
jiahui098
closed
1 year ago
1
Where is chatglm3-6b-base released?
#64
yayaQAQ
closed
1 year ago
1
Differences between evaluating llama and other models
#63
Chandler-Bing
closed
1 year ago
1
##
#62
Kevin-KWH
closed
1 year ago
0
Has the model truly mastered the relevant knowledge, or is it just guessing answers?
#61
yucc-leon
closed
1 year ago
3
Atom-13B is not a publicly accessible model
#60
TerraceCN
closed
1 year ago
2
Are natural language processing tasks knowledge-based or reasoning-based?
#59
liumingzhu6060
closed
1 year ago
1
Official example raises an error when loading the dataset
#58
JensenDong
opened
1 year ago
8
You guys posted a hilarious Leaderboard on your official website
#57
stephonye
closed
1 year ago
1
Is only a single answer allowed? Can multiple answers be selected?
#56
xxm1668
closed
1 year ago
1
Request for public display
#55
huayicong23
closed
1 year ago
4
public display
#54
ZHangZHengEric
closed
1 year ago
2
Why is the zero-shot score so low when I test chatglm2-6B with C-Eval?
#53
EdisonWujr
opened
1 year ago
5
Some errors in the test set.
#52
hanjr92
closed
1 year ago
4
What happens if a prompt line ends with a space? Why are trailing spaces not allowed?
#51
cangyi071
closed
1 year ago
1
What steps are required to make a model's results public?
#50
xyzhou-puck
closed
1 year ago
1
I can't figure out how to use this. Is eval_llama.py meant for llama-based models? It raises many errors and I don't know how to fix them
#49
starevelyn
opened
1 year ago
3
How to evaluate models trained from bloom series base models?
#48
Modas-Li
closed
1 year ago
1
C-Eval submission rule restrictions
#47
suolyer
closed
1 year ago
3
Zero-shot results of chatglm2-6b on the valid set seem problematic
#46
ylwangy
closed
1 year ago
4
Cannot log in to the official website; cannot submit answers
#45
wuliaoren05
closed
1 year ago
0
Does lm-evaluation-harness evaluate on the test set?
#44
ChangyuanWu
closed
1 year ago
1
Problem with submitting results
#43
18811449050
closed
1 year ago
7
Problematic question in test set
#42
wgb14
closed
1 year ago
1
Plans after confirming that CEval can be hacked
#41
yucc-leon
opened
1 year ago
3
Incorrect questions in middle_school_history_test.csv
#40
AiLMe-AI
closed
1 year ago
2
Questions about result submission
#39
renmengjie7
closed
1 year ago
2