open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
https://opencompass.org.cn/
Apache License 2.0
4.22k stars 449 forks

[Feature] How can custom datasets be supported when evaluating via API? #1718

Open Jimmy-L99 opened 4 days ago

Jimmy-L99 commented 4 days ago

Describe the feature

command:

(opencompass) root@node1:~/user/OpenCompass/opencompass# python run.py configs/vllm-glm4-9b-chat-custom.py

vllm-glm4-9b-chat-custom.py


from opencompass.models import OpenAISDK
from mmengine.config import read_base

with read_base():
    from .datasets.cmmlu.cmmlu_gen import cmmlu_datasets

datasets = []
datasets = cmmlu_datasets

api_meta_template = dict(
    round=[
        dict(role='HUMAN', api_role='HUMAN'),
        dict(role='BOT', api_role='BOT', generate=True),
    ],
    reserved_roles=[dict(role='SYSTEM', api_role='SYSTEM')],
)

models = [
    dict(
        abbr='glm-4-9b-chat-vllm-API',
        type=OpenAISDK,
        key='EMPTY',
        openai_api_base='http://localhost:port/v1',
        path='glm-4-9b-chat',
        tokenizer_path='/root/user/models/glm-4-9b-chat',
        rpm_verbose=True,
        meta_template=api_meta_template,
        query_per_second=10,
        max_out_len=1024,
        max_seq_len=4096,
        temperature=0.01,
        batch_size=8,
        retry=3,
    )
]

I would like to do this without modifying the OpenCompass code, but the documentation only gives the following example:

python run.py \
    --models hf_llama2_7b \
    --custom-dataset-path xxx/test_qa.jsonl \
    --custom-dataset-data-type qa \
    --custom-dataset-infer-method gen
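For illustration, here is roughly what the config-file equivalent of the command above might look like. This is an untested sketch, not a confirmed OpenCompass feature: it assumes a `datasets` entry may be a plain dict whose keys (`path`, `data_type`, `infer_method`) mirror the `--custom-dataset-*` flags. The variable name `custom_qa_dataset` is hypothetical.

```python
# Untested sketch: assumes OpenCompass accepts plain dicts in `datasets`
# with keys mirroring the --custom-dataset-* CLI flags (an assumption).
custom_qa_dataset = dict(
    path='xxx/test_qa.jsonl',   # path left elided, as in the command above
    data_type='qa',             # mirrors --custom-dataset-data-type
    infer_method='gen',         # mirrors --custom-dataset-infer-method
)

# In vllm-glm4-9b-chat-custom.py this list would take the place of
# (or be appended to) cmmlu_datasets; `models` would stay unchanged.
datasets = [custom_qa_dataset]
```

If this form is accepted, the run command would stay `python run.py configs/vllm-glm4-9b-chat-custom.py` with no extra flags.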



### Will you implement it?

- [ ] I would like to implement this feature and create a PR!