EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval
https://lmms-lab.github.io/

Fix the potential risk introduced by PR #117 #118

Closed (teowu closed 1 week ago)

teowu commented 1 week ago

In the previous commit, I commented out L408 (the stopping_criteria for generation) in https://github.com/EvolvingLMMs-Lab/lmms-eval/blob/main/lmms_eval/models/llava_vid.py#L408C22-L408C39 so that this model works with transformers >4.40.2. (The stopping_criteria still works with transformers==4.40.0.)

Though this does not affect MCQ accuracy, generating without a stopping criterion can be risky for open-ended answers. Therefore, the more responsible fix is to restore that line (i.e., remove the comment) in this PR.
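For context, the risk is that keyword-based stopping halts generation once the output ends with a stop sequence; without it, open-ended answers can run on until max_new_tokens. A minimal sketch of that check (plain Python; `should_stop` is a hypothetical helper for illustration, not the actual class used in llava_vid.py):

```python
def should_stop(generated_ids, stop_sequences):
    """Return True if generated_ids ends with any stop sequence.

    generated_ids: list of token ids produced so far.
    stop_sequences: list of token-id lists that mark the end of an answer.
    """
    for stop in stop_sequences:
        if len(generated_ids) >= len(stop) and generated_ids[-len(stop):] == list(stop):
            return True
    return False

# Without such a check, generation only stops at max_new_tokens,
# so open-ended answers may trail off into unrelated text.
```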

I also include changes to lmms_eval/api/task.py to allow extracting videos from tars (which makes automatic evaluation of LongVideoBench actually work from a brand-new machine).