showlab / all-in-one

[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
https://arxiv.org/abs/2203.07303

About setting unfilterd=False when computing accuracy #8

Closed: liyz15 closed this issue 1 year ago

liyz15 commented 2 years ago

unfiltered is set to False in https://github.com/showlab/all-in-one/blob/a9c1d4576261ff01009f3fa7810cfaf58e32dc3b/AllInOne/modules/objectives.py#L759-L761

It looks like unseen classes are skipped rather than counted as wrong as mentioned in https://github.com/showlab/all-in-one/issues/7#issuecomment-1193501941
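To make the distinction concrete, here is a minimal sketch of the two ways of scoring samples whose ground-truth answer never appears in the training labels. This is not the repository's code: the function `accuracy`, the flag `count_unseen_as_wrong`, and the toy data are all hypothetical and only illustrate how the denominator changes.

```python
from typing import Sequence, Set


def accuracy(
    preds: Sequence[str],
    answers: Sequence[str],
    train_labels: Set[str],
    count_unseen_as_wrong: bool,
) -> float:
    """Toy open-ended QA accuracy (hypothetical helper, not from the repo)."""
    correct = 0
    total = 0
    for pred, ans in zip(preds, answers):
        if ans not in train_labels:
            # The model can never predict an answer outside its label set.
            if count_unseen_as_wrong:
                total += 1  # keep the sample in the denominator as a miss
            continue  # otherwise the sample is dropped entirely
        total += 1
        correct += int(pred == ans)
    return correct / total if total else 0.0


# Toy data: the third sample has an answer unseen during training.
train_labels = {"cat", "dog"}
preds = ["cat", "dog", "cat", "dog"]
answers = ["cat", "cat", "zebra", "dog"]

skipped = accuracy(preds, answers, train_labels, count_unseen_as_wrong=False)
counted = accuracy(preds, answers, train_labels, count_unseen_as_wrong=True)
print(skipped, counted)
```

With skipping, the toy accuracy is 2 correct out of 3 scored samples; counting the unseen sample as wrong gives 2 out of 4.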

FingerRec commented 2 years ago

Hi, the unfiltered operation has almost no effect on the final result.

liyz15 commented 2 years ago

Thanks for your reply. I've checked the msrvttqa test set and found that, among 72821 samples, 5051 have answers that do not appear in the training labels. If that's the case, an accuracy of 46.8% computed while ignoring unseen classes would become 46.8% * (72821 - 5051) / 72821 ≈ 43.5% once unseen classes are counted as wrong. Is there anything I missed?
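The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the rescaling, using the counts quoted in this comment and assuming every unseen-answer sample is scored as wrong; the variable names are illustrative.

```python
# Counts and accuracy as quoted in the comment above.
total_samples = 72821   # msrvttqa test samples
unseen_samples = 5051   # test samples whose answer is not a training label
acc_seen_only = 0.468   # accuracy when unseen samples are skipped

# Samples that were actually scored when unseen ones are skipped.
seen_samples = total_samples - unseen_samples

# Same number of correct predictions, but divided by the full test set.
acc_all = acc_seen_only * seen_samples / total_samples
print(f"seen samples: {seen_samples}, rescaled accuracy: {acc_all:.3%}")
```

The number of correct predictions does not change; only the denominator grows from the seen subset to the full test set.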

FingerRec commented 2 years ago

Hi,

Thanks a lot for checking so carefully :)

I remember previously uncommenting line 43 in msrvttqa.py, but I'm not sure about that. Could you please set unfilterd=True, run the code once more, and post the result here?