modelscope/evalscope
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
Apache License 2.0
164 stars
23 forks
issues
update blog for openai o1 model evaluation
#131
Yunnglin
opened
11 hours ago
0
Custom VLM dataset: build_prompt(self, line) is not executed
#130
jackqdldd
opened
14 hours ago
1
swift eval fails with error: cannot import name 'ftp_head' from 'datasets.utils.file_utils'
#129
jackqdldd
opened
20 hours ago
1
evalscope perf produces no output when testing an OpenAI API server deployed with sglang
#128
hetian127
opened
1 day ago
0
Add RAGEval backend with `MTEB` benchmark
#127
Yunnglin
opened
1 day ago
0
Add blog: rag
#126
wangxingjun778
closed
1 day ago
0
update readme usage
#125
Yunnglin
closed
4 days ago
0
fix #122 and update docs
#124
Yunnglin
closed
4 days ago
0
Does evaluation support the Ascend 910 NPU?
#123
yiyayieryo
opened
4 days ago
3
AttributeError: can't set attribute 'split'
#122
Jeremy-J-J
closed
4 days ago
4
update `supported datasets` docs and readme
#121
Yunnglin
closed
1 week ago
0
No results
#120
lucheng07082221
closed
1 week ago
4
Release/0.2
#119
Chen9154
closed
2 weeks ago
0
Incorrect results in baseline model comparison mode
#118
stay-leave
opened
2 weeks ago
2
Add cmb
#117
wangxingjun778
closed
2 weeks ago
0
add custom dataset evaluation support and docs
#116
Yunnglin
closed
2 weeks ago
0
How can I use an OpenAI-compatible large model as the judge model to evaluate two models on a custom dataset?
#115
EvilCalf
opened
2 weeks ago
3
add some best practices and support ollama, vllm, lmdeploy model serving
#114
Yunnglin
closed
2 weeks ago
0
Add LongWriter evaluation
#113
wangxingjun778
opened
3 weeks ago
0
Model inference stress test: evalscope perf does not return for a long time
#112
undyingfame
closed
2 weeks ago
0
update readme and docs
#111
Yunnglin
closed
3 weeks ago
0
Refactor readme
#110
Yunnglin
closed
3 weeks ago
0
Update news
#109
wangxingjun778
closed
3 weeks ago
0
add metrics desc for perf
#108
wangxingjun778
closed
3 weeks ago
0
Are there plans to support performance/metric evaluation of embedding/reranker models?
#107
shell-nlp
opened
4 weeks ago
1
What if I set `enable=false` in `evalscope/registry/config/cfg_single.yaml`?
#106
zhimin-z
opened
4 weeks ago
1
Error when calling OpenCompassBackendManager.list_datasets()
#105
lyc0930
opened
4 weeks ago
6
Meaning of the "aAcc", "fAcc", and "qAcc" metrics for the HallusionBench dataset
#104
stay-leave
closed
2 weeks ago
1
Could you describe what each metric means? There are a few I don't understand
#103
shell-nlp
closed
2 weeks ago
2
update unit test, configs, docs
#102
Yunnglin
closed
1 month ago
0
perf sends only one request
#101
shell-nlp
closed
4 weeks ago
1
Fix custom generation_config in arena mode and report gen
#100
wangxingjun778
closed
1 month ago
0
Add registry data
#99
wangxingjun778
closed
1 month ago
0
How do I specify request parameters when evaluating models with OpenCompass or VLMEvalKit?
#98
jackqdldd
opened
1 month ago
3
How do I create a custom evaluation set for VLMs?
#97
stay-leave
opened
1 month ago
2
Where is the toolkit name?
#96
zhimin-z
closed
1 month ago
1
Does eval_swift_openai support concurrent testing, and how is it configured?
#95
charliedream1
opened
1 month ago
5
evalscope perf wandb
#94
zll0000
opened
1 month ago
2
evalscope perf --url 'our_url/v1/completions' --parallel 128 --model 'Qwen2-72B-Instruct' --log-every-n-query 10 --read-timeout=120 --dataset-path './data/open_qa.jsonl' -n 1 --max-prompt-length 128000 --api openai --stream --stop '<|im_end|>' --dataset openqa --debug
#93
zll0000
opened
1 month ago
7
infor_vqa and doc_vqa datasets: some samples have no answer when metrics are computed
#92
stay-leave
closed
1 month ago
2
Dev/refactor 0.5
#91
wangxingjun778
closed
1 month ago
0
Update UTs and logs
#90
wangxingjun778
closed
1 month ago
0
Is it possible to evaluate models served through an online API?
#89
MatheMatrix
closed
1 month ago
5
Update readme
#88
wangxingjun778
closed
1 month ago
0
Change package name: from llmuses to evalscope
#87
wangxingjun778
closed
1 month ago
0
Update example for vlmeval
#86
wangxingjun778
closed
1 month ago
0
Refactor setup and add UTs
#85
wangxingjun778
closed
1 month ago
0
bugfix: update openai_model_api get_logits.
#84
Chen9154
closed
1 month ago
0
Release/0.2
#83
Chen9154
closed
1 month ago
0
Support VLM evaluation
#82
Yunnglin
closed
1 month ago
0