open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Apache License 2.0
1.08k stars, 154 forks

slime #450

Closed (yfzhang114 closed 2 weeks ago)

yfzhang114 commented 3 weeks ago

This PR adds SliME, the high-resolution MLLM from "Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models".

The SliME strategy also extends to video analysis (see Slime_video.md). Although the model was never trained on video data, it can process up to 8 frames. On the Video-MME benchmark, it outperforms many 7B/8B baselines that were trained on video datasets.
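
To make the 8-frame limit concrete, here is a minimal sketch of one common way to reduce a video to a fixed frame budget: pick evenly spaced frame indices. This is an assumption for illustration only; the thread does not say how SliME selects frames, and `sample_frame_indices` is a hypothetical helper, not part of VLMEvalKit or SliME.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_frames: int = 8) -> List[int]:
    """Return up to `num_frames` evenly spaced frame indices.

    Hypothetical helper: uniform sampling is an assumed strategy,
    not necessarily what SliME does.
    """
    if total_frames <= 0:
        return []
    # Never ask for more frames than the video contains.
    n = min(num_frames, total_frames)
    # Split the video into n equal segments and take the frame
    # at the centre of each segment.
    return [int((i + 0.5) * total_frames / n) for i in range(n)]


if __name__ == "__main__":
    print(sample_frame_indices(80))
    print(sample_frame_indices(5))
```

The selected frames would then be passed to the model as ordinary images, which is consistent with the claim that no video-specific training is involved.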

PhoenixZ810 commented 3 weeks ago

Hi, please fix the formatting problems reported by flake8:

flake8...................................................................Failed
- hook id: flake8
- exit code: 1

vlmeval/vlm/slime.py:11:1: E302 expected 2 blank lines, found 1
vlmeval/vlm/__init__.py:44:25: W292 no newline at end of file

Thank you!
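
For reference, E302 asks for two blank lines before a top-level `def` or `class`, and W292 asks for the file to end with a newline. The sketch below shows what each fix amounts to on a made-up source string. The helper names and the sample module are hypothetical (the real contents of `slime.py` and `__init__.py` are not shown in this thread), and the E302 helper is deliberately naive: it ignores decorators, comments attached to a definition, and multi-line strings.

```python
# Line prefixes treated as top-level definitions (naive; see caveats above).
TOP_LEVEL = ("def ", "class ", "async def ")


def fix_w292(source: str) -> str:
    """Make the source end with exactly one newline (flake8 W292)."""
    return source.rstrip("\n") + "\n"


def fix_e302(source: str) -> str:
    """Put two blank lines before each top-level def/class (flake8 E302)."""
    out = []
    for line in source.split("\n"):
        if line.startswith(TOP_LEVEL):
            # Drop whatever blank lines currently precede the definition.
            while out and out[-1].strip() == "":
                out.pop()
            # E302 does not apply to a definition at the very top of the file.
            if out:
                out.extend(["", ""])
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Hypothetical module: one blank line before the class, no final newline.
    before = "import torch\n\nclass SliME:\n    pass"
    after = fix_w292(fix_e302(before))
    print(repr(after))
```

In practice the two edits are usually made by hand or by an auto-formatter, then verified by re-running the pre-commit hooks before pushing.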

yfzhang114 commented 2 weeks ago

The errors have been fixed.