open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Apache License 2.0
1.08k stars, 154 forks

MMAlaya2 Added #399

Closed bingwork closed 1 month ago

bingwork commented 1 month ago
  1. Added an MMAlaya2 implementation based on vlmeval/vlm/internvl_chat.py.
  2. Verified that the code downloads the model weights from https://huggingface.co/DataCanvas/MMAlaya2 and runs inference on mmbench_test_cn_20231003.tsv, achieving an A_Overall (test) score of 0.8211883408071748.
  3. Added some packages to requirements.txt.
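
For anyone who wants to repeat the check in step 2, the sketch below assembles a VLMEvalKit command line without running it. It assumes the model is registered under the name `MMAlaya2` and that mmbench_test_cn_20231003.tsv corresponds to the dataset name `MMBench_TEST_CN`; neither name is stated in this PR, and `build_eval_command` is a hypothetical helper, not part of the toolkit.

```python
import shlex


def build_eval_command(model="MMAlaya2", dataset="MMBench_TEST_CN"):
    """Return the argv list for a VLMEvalKit run; nothing is executed here.

    Both default names are assumptions: check the toolkit's model and
    dataset registries for the exact keys before running the command.
    """
    return ["python", "run.py", "--data", dataset, "--model", model, "--verbose"]


# Print the command so it can be copied into a shell at the repo root.
command = build_eval_command()
print(shlex.join(command))
```

The printed line is meant to be run from the repository root, where `run.py` lives; the model weights are fetched on first use.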

MMAlaya2 (LoRA-merged) is still published under the DataCanvas company name and reuses its other components from InternVL-Chat-V1.5. The model has 26 billion parameters: InternLM2-20B as the language model and InternViT-6B as the vision model.
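
As background on what "LoRA-merging" can mean, here is a minimal pure-Python sketch of the textbook single-adapter merge, W' = W + (alpha / r) * B A. It is only an illustration under that assumption: the PR does not describe how the MMAlaya2 adapters were actually merged, and `matmul` and `merge_lora` are hypothetical helpers operating on toy matrices.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold a LoRA update into a base weight: W + (alpha / rank) * (B @ A).

    weight: out x in, lora_b: out x rank, lora_a: rank x in.
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as weight
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2x2 base weight with a rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]    # rank x in  = 1 x 2
lora_b = [[0.5], [1.0]]  # out x rank = 2 x 1
merged = merge_lora(base, lora_a, lora_b, alpha=2, rank=1)
print(merged)  # [[2.0, 2.0], [2.0, 5.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why a merged checkpoint can be loaded like an ordinary model.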