modelscope/evalscope
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
https://evalscope.readthedocs.io/en/latest/
Apache License 2.0
263 stars
33 forks
issues
Testing with an OpenAI server produces inference results, but the eval output is empty
#208
HaoWuSR
opened
19 hours ago
1
Why does increasing concurrency and the number of requests make it faster? Is this normal?
#207
z1054136399
opened
1 day ago
3
Add timeout for download punkt.zip
#206
Yunnglin
closed
1 day ago
0
compact ragas v0.2.5 and update readme
#205
Yunnglin
closed
15 hours ago
0
Problem with the performance statistics reported by perf
#204
weibingo
opened
1 week ago
3
Error when testing ragas: no module named 'evalscope.backend.rag_eval_utils'
#203
goudemaoningsir
opened
1 week ago
2
The bbh test set used to work but can no longer be evaluated
#202
charliedream1
opened
1 week ago
2
How can an offline data directory be specified globally from Python code?
#201
charliedream1
opened
1 week ago
0
Performance test hangs and never finishes when streaming is enabled with --stream
#200
simonqian
opened
1 week ago
18
update oc docs
#199
Yunnglin
closed
1 week ago
0
Add cmmlu
#198
wangxingjun778
closed
1 week ago
0
Support for the datasets in the safety section of the opencompass datasets
#197
A-Zi-cong-xiao-jiu-hen-ke-ai
opened
1 week ago
2
fix #192 and compact mteb v1.19
#196
Yunnglin
closed
1 week ago
0
Evaluation prompt template
#195
lizhen-lizhen
opened
1 week ago
1
generate-ragas: Aborted request
#194
jackqdldd
closed
1 week ago
16
ragas evaluation: httpx.ConnectError: All connection attempts failed
#193
jackqdldd
closed
1 week ago
1
Error when running the mteb two-stage retrieval evaluation
#192
A-cracker
closed
1 week ago
7
The dataset list printed by OpenCompass does not match the one given on the website, so some datasets (e.g. cmmlu) cannot be tested
#191
charliedream1
opened
1 week ago
11
Where can the test set used in the RAG test example be downloaded? Could it be provided?
#190
charliedream1
closed
1 week ago
4
Could you provide an example of testing longwriter and toolbench through the openai API?
#189
charliedream1
opened
1 week ago
0
module 'evaluate' has no attribute 'load'
#188
charliedream1
opened
1 week ago
3
Set pyarrow version
#187
wangxingjun778
closed
2 weeks ago
0
Add publish workflow
#186
wangxingjun778
closed
2 weeks ago
0
update version
#185
wangxingjun778
closed
2 weeks ago
0
Set datasets version
#184
wangxingjun778
closed
2 weeks ago
0
evalscope 5.5 stream: logging does not work
#183
jinweida
opened
2 weeks ago
1
Does evalscope support performance stress testing of multimodal large models?
#182
ouyongqi
opened
2 weeks ago
4
Output results contain no scores
#181
Leo20100307
opened
2 weeks ago
6
remove leaderboard in readme
#180
wangxingjun778
closed
2 weeks ago
0
Can these wandb metrics be retrieved?
#179
ouyongqi
closed
2 weeks ago
4
Refactor the `perf` module
#178
Yunnglin
opened
2 weeks ago
0
Model performance stress test (evalscope perf) fails with a timeout error
#177
NiYueLiuFeng
opened
2 weeks ago
2
Running the cmtb evaluation example code example_eval_mteb.py fails with pydantic_core._pydantic_core.ValidationError: 2 validation errors for TaskMetadata
#176
A-cracker
closed
2 weeks ago
1
feat: save knowledge graph
#175
Yunnglin
closed
2 weeks ago
0
Remove BBH temporarily due to import issue and update readme
#174
wangxingjun778
closed
3 weeks ago
0
fix generation config None support
#173
Yunnglin
closed
3 weeks ago
0
add blog for multimodal RAG evaluation
#172
Yunnglin
closed
3 weeks ago
0
fix generation config args and compact ragas 0.2.3
#171
Yunnglin
closed
3 weeks ago
0
RAGAS evaluation: No module named 'evalscope.backend.rag_eval.utils
#170
jackqdldd
closed
2 weeks ago
38
Execution fails on Windows 11
#169
Devliang24
opened
3 weeks ago
1
The result statistics look wrong: why are the average time to first token, the average per-packet latency, and the average latency all the same?
#168
Devliang24
closed
2 weeks ago
1
When a maximum number of output tokens is set, how can the model be made to produce that maximum number of output tokens, or close to it, on every request?
#167
Devliang24
closed
2 weeks ago
7
update readme and vlmevalkit datasets
#166
Yunnglin
closed
3 weeks ago
0
update rag template
#165
Yunnglin
closed
3 weeks ago
0
update clip embedding text
#164
Yunnglin
closed
3 weeks ago
0
add clip encode text truncation
#163
Yunnglin
closed
4 weeks ago
0
Output log shows garbled characters where Chinese characters are expected
#162
Devliang24
closed
2 weeks ago
2
Fix #159, change request format
#161
lxline
closed
3 weeks ago
0
Using a template with parameters: is there anything wrong with this approach?
#160
Devliang24
closed
3 weeks ago
4
Error when stress testing the vllm API
#159
ZTurboX
closed
3 weeks ago
2