BradyFU / Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Validation set #27

Closed: sanjayss34 closed this issue 1 week ago

sanjayss34 commented 1 week ago

Hello, do you provide a validation/development set (separate from the test set)? We would like to evaluate our model while developing it, since tuning model design or hyperparameters on the test set is undesirable.

BradyFU commented 1 week ago

Hi, thanks for your suggestion! There is no validation set yet. We hope that Video-MME can be used directly to evaluate a model's generalization ability.