mtbench101/mt-bench-101
[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
Apache License 2.0
25 stars, 0 forks
issues
#8 [Bug] During evaluation, {prediction} in origin_prompt is not replaced with the model's response? (liuyaox, closed, 2 weeks ago, 3 comments)
#7 The prompt format in the OpenCompass implementation is not human-friendly (Leymore, closed, 3 weeks ago, 2 comments)
#6 Mtbench101 2oc (sefira, closed, 4 weeks ago, 0 comments)
#5 Mtbench101 (sefira, closed, 4 weeks ago, 0 comments)
#4 Introducing the MT-Bench-101 Beta Version! (sefira, closed, 1 month ago, 0 comments)
#3 Call for code and data! (zemerov, closed, 1 month ago, 1 comment)
#2 The paper is already published and points to this repository, but the GitHub repo is empty (victorjiax, closed, 1 month ago, 1 comment)