JUNJIE99 / MLVU

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

Leaderboard mismatch #5

Closed ssantos97 closed 1 month ago

ssantos97 commented 1 month ago

Hi,

Why do the results for the same methods differ between the mini-leaderboard, the full leaderboard, and the table in the paper?

JUNJIE99 commented 1 month ago

Hello. The two leaderboards differ because they are evaluated on different splits: the mini-leaderboard on the GitHub page reports results on the MLVU-dev set, while the full leaderboard reports results on the MLVU-Test set. The current version of our paper reports results on the MLVU-dev set only.