Evaluation of custom models and datasets.

open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks

https://huggingface.co/spaces/opencompass/open_vlm_leaderboard

Apache License 2.0

Evaluation of custom models and datasets. #108

Open juxingyiwan opened 3 months ago

juxingyiwan commented 3 months ago

VLMEvalKit is a very convenient evaluation tool for MLLMs. I hope the authors can add a framework to VLMEvalKit that supports evaluating custom models and custom datasets. Such a framework could define a unified MLLM input-output interface and a conversion format for datasets.
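
To make the request concrete, a unified input-output interface of this kind might look roughly like the sketch below. Everything in it is an assumption made for illustration: `Sample`, `VisionLanguageModel`, `EchoModel`, and `evaluate` are hypothetical names, not VLMEvalKit's actual API, and exact-match accuracy stands in for whatever metric a real benchmark would use.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Sample:
    """One evaluation item (hypothetical schema): image path, prompt, reference answer."""
    image_path: str
    question: str
    answer: str


class VisionLanguageModel(Protocol):
    """Hypothetical unified interface: any object with this `generate` method can be evaluated."""

    def generate(self, image_path: str, prompt: str) -> str: ...


class EchoModel:
    """Toy stand-in for a real model; it ignores its inputs and returns a fixed reply."""

    def __init__(self, reply: str) -> None:
        self.reply = reply

    def generate(self, image_path: str, prompt: str) -> str:
        return self.reply


def evaluate(model: VisionLanguageModel, samples: List[Sample]) -> float:
    """Return case-insensitive exact-match accuracy over the samples (0.0 if empty)."""
    if not samples:
        return 0.0
    correct = sum(
        model.generate(s.image_path, s.question).strip().lower() == s.answer.strip().lower()
        for s in samples
    )
    return correct / len(samples)


samples = [
    Sample("cat.jpg", "What animal is shown?", "cat"),
    Sample("dog.jpg", "What animal is shown?", "dog"),
]
accuracy = evaluate(EchoModel("Cat"), samples)
print(f"accuracy = {accuracy:.2f}")
```

With an interface like this, a custom model only needs to implement `generate`, and a custom dataset only needs to be converted into a list of `Sample` records.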

kennymckormick commented 3 months ago

Hi @juxingyiwan, we are going to implement support for custom multiple-choice datasets first, and will create a tutorial for it. Stay tuned!
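
Until that tutorial exists, the following is only a rough sketch of what converting a custom multiple-choice dataset into a flat, tab-separated file could involve. The column names in `FIELDS`, the input record keys, and the helpers `to_tsv` and `extract_choice` are all assumptions made for illustration; the format the toolkit eventually adopts may differ.

```python
import csv
import io
import re
from typing import Dict, List, Optional

# Hypothetical column layout for a multiple-choice TSV file.
FIELDS = ["index", "image_path", "question", "A", "B", "C", "D", "answer"]


def to_tsv(records: List[Dict]) -> str:
    """Convert records with `image`, `question`, `options`, `answer` keys to TSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for i, rec in enumerate(records):
        row = {
            "index": i,
            "image_path": rec["image"],
            "question": rec["question"],
            "answer": rec["answer"],
        }
        # Map up to four options onto the letter columns; missing ones stay empty.
        for letter, text in zip("ABCD", rec["options"]):
            row[letter] = text
        writer.writerow(row)
    return buf.getvalue()


def extract_choice(response: str) -> Optional[str]:
    """Naively pick the first standalone letter A-D from a model response, if any."""
    match = re.search(r"\b([A-D])\b", response)
    return match.group(1) if match else None


records = [
    {
        "image": "cat.jpg",
        "question": "What animal is shown?",
        "options": ["Cat", "Dog", "Bird", "Fish"],
        "answer": "A",
    }
]
tsv_text = to_tsv(records)
print(tsv_text)
print(extract_choice("The answer is B."))
```

The letter extraction here is deliberately simplistic (it would misread a response that begins with the word "A", for example); a real evaluator would need more robust answer matching.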