open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Apache License 2.0

Deployed a local LLM as the judge LLM with vLLM, but it was not used during evaluation #438

Closed · BrenchCC closed this 2 months ago

BrenchCC commented 2 months ago

vLLM CLI:

```
CUDA_VISIBLE_DEVICES=6,7 python -m vllm.entrypoints.openai.api_server \
    --served-model-name Qwen2-7B-Instruct \
    --port 8001 \
    --model ../models/Qwen2-7B-Instruct \
    --tensor-parallel-size 2 \
    --api-key sk-123456
```
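As a quick sanity check (a sketch, not part of the original report), one can query the server's `/v1/models` route to confirm which model name the server actually registered. The helper below assumes the standard OpenAI-compatible routes that `vllm.entrypoints.openai.api_server` exposes; the URL rewriting is an assumption about the endpoint layout, not VLMEvalKit code:

```python
# Sketch: derive the /v1/models URL from an OPENAI_API_BASE value so the
# running vLLM server can be asked for its served model names.
# Assumes the OpenAI-compatible routes of vllm.entrypoints.openai.api_server.
import json
import urllib.request


def models_url(api_base: str) -> str:
    """Turn .../v1/chat/completions (or .../v1) into the .../v1/models route."""
    base = api_base.rstrip("/")
    suffix = "/chat/completions"
    if base.endswith(suffix):
        base = base[: -len(suffix)]
    return base + "/models"


def list_served_models(api_base: str, api_key: str) -> list:
    """Query the server and return the served model ids (makes a network call)."""
    req = urllib.request.Request(
        models_url(api_base),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return [m["id"] for m in payload.get("data", [])]


# Usage against the server started above (requires the server to be running):
#   list_served_models("http://0.0.0.0:8001/v1/chat/completions", "sk-123456")
```

If the returned id differs from what the evaluator is configured to call, the judge request will fail and the toolkit may silently fall back to non-LLM matching.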

This is for evaluating the mmstar and mathvista_mini datasets.

The .env settings are as follows:

```
LMUData=./data
OPENAI_API_KEY=sk-123456
OPENAI_API_BASE=http://0.0.0.0:8001/v1/chat/completions
LOCAL_LLM=Qwen2-7B-Instruct
```

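A hedged sketch (an assumption about the setup, not a confirmed diagnosis): the judge endpoint is addressed by the model name, so `LOCAL_LLM` should normally match the `--served-model-name` passed to vLLM, and `OPENAI_API_BASE` should point at the full chat-completions route. A minimal consistency check over values like the ones above could look like this; `check_judge_config` and its rules are illustrative, not VLMEvalKit internals:

```python
# Sketch: lint the .env judge settings for mismatches that could cause
# the evaluator to skip the local judge LLM. The rules below are
# assumptions about the setup, not VLMEvalKit's actual resolution logic.

def check_judge_config(env: dict, served_model_name: str) -> list:
    """Return a list of human-readable warnings (empty means nothing found)."""
    warnings = []
    base = env.get("OPENAI_API_BASE", "")
    if not base.endswith("/v1/chat/completions"):
        warnings.append(
            "OPENAI_API_BASE does not end with /v1/chat/completions: " + base)
    if env.get("LOCAL_LLM") != served_model_name:
        warnings.append(
            "LOCAL_LLM (%r) differs from --served-model-name (%r)"
            % (env.get("LOCAL_LLM"), served_model_name))
    if not env.get("OPENAI_API_KEY"):
        warnings.append("OPENAI_API_KEY is empty")
    return warnings


# The .env values from this report: no warnings expected.
env = {
    "OPENAI_API_KEY": "sk-123456",
    "OPENAI_API_BASE": "http://0.0.0.0:8001/v1/chat/completions",
    "LOCAL_LLM": "Qwen2-7B-Instruct",
}
print(check_judge_config(env, "Qwen2-7B-Instruct"))  # []
```

Setting `LOCAL_LLM` to a filesystem path while the server registered a different `--served-model-name` is exactly the kind of mismatch this would flag.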
lzd19981105 commented 2 months ago

+1! I deployed internlm2-1.8B as the judge using LMDeploy, but it was not used during evaluation! My settings:

```
OPENAI_API_KEY=sk-123456
OPENAI_API_BASE=http://0.0.0.0:23333/v1/chat/completions
LOCAL_LLM=./VLM/ckpt/internlm2-1.8/
```