JonnyS1226 / ego4d_asl

code for Ego4D Workshop@CVPR 2023 - 1st in MQ & 2nd in NLQ challenge

Reproducing the MQ results #2

Closed ardarslan closed 12 months ago

ardarslan commented 1 year ago

Hello,

In your paper, you mentioned that your final submission for the MQ Task on eval.ai is produced by ensembling predictions from three models. Could you please provide the config files for these three models? I am also confused about the combination of the features I should use. Is it EgoVLP+InternVideo or EgoVLP+InternVideo+SlowFast+Omnivore?

Edit: When I followed the instructions under "Train on MQ (Train + val)" and "Submission (to Leaderboard test-set server)" in the README file, I got the following scores on eval.ai for the test split:

{"Recall@1x,tIoU=0.5": 43.15983417779825, "average_mAP": 26.092527253173575}

ardarslan commented 1 year ago

Hi, I need to reproduce your results to continue my thesis, I would really appreciate any reply. Thank you!

JonnyS1226 commented 1 year ago

Hi, sorry for the late reply; I was busy with something else. For the final submission we use EgoVLP+InternVideo+SlowFast+Omnivore features.
I have three comments:

  1. Ideally, using only one model achieves 27-28 average mAP and 46-47 Recall@1x,tIoU=0.5 on the val set. Have you reproduced similar results on the val set?
  2. When training on the train set only, our best results on the val set occur around the 11th epoch (out of 15 epochs total). So when training on train+val, we also pick the 11th or 12th epoch checkpoint as the best one.
  3. We ensemble models, i.e., we ensemble the 11th or 12th epoch checkpoints trained with different configs. The configs differ only in "learning rate", "embed_dim", "regression range", or just the random seed, etc. I will upload them tomorrow together with "infer_ensemble.py".

Hope it helps.
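For readers following along, the checkpoint-ensembling idea above can be sketched as prediction-level fusion: pool the proposals emitted by each model, sort by confidence, and suppress near-duplicate segments. This is only an illustrative sketch under assumed data shapes (dicts with `start`, `end`, `label`, `score`); the repo's actual `infer_ensemble.py` may differ in format and fusion strategy.

```python
# Illustrative prediction-level ensembling for temporal action detection.
# Assumes each model outputs a list of proposal dicts per video:
#   {"start": float, "end": float, "label": int, "score": float}
# These names are assumptions, not the repo's actual schema.

def tiou(a, b):
    """Temporal IoU between two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def ensemble_nms(pred_lists, iou_thr=0.75, top_k=200):
    """Pool proposals from several models, then greedy per-label NMS."""
    pooled = [p for preds in pred_lists for p in preds]
    pooled.sort(key=lambda p: p["score"], reverse=True)
    kept = []
    for p in pooled:
        seg = (p["start"], p["end"])
        # Keep a proposal only if it does not heavily overlap an
        # already-kept proposal of the same label.
        if all(p["label"] != q["label"]
               or tiou(seg, (q["start"], q["end"])) < iou_thr
               for q in kept):
            kept.append(p)
        if len(kept) >= top_k:
            break
    return kept
```

In practice the proposals from each config's 11th/12th-epoch checkpoint would be loaded from their submission JSONs and passed in as `pred_lists`; averaging scores of matched segments before NMS is a common variant.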

ardarslan commented 1 year ago

Hi, thank you for your reply, and no problem! The scores I reported above were from the checkpoint of the 15th epoch. I also tried the checkpoints from the 11th and 12th epochs:

15th: {"Recall@1x,tIoU=0.5": 43.15983417779825, "average_mAP": 26.092527253173575}
11th: {"Recall@1x,tIoU=0.5": 45.50898203592814, "average_mAP": 27.106595524719427}
12th: {"Recall@1x,tIoU=0.5": 44.58774758175956, "average_mAP": 26.669235981716433}

I used all the features in these three submissions. I would be very glad if you could share the other configs and infer_ensemble.py.

ardarslan commented 12 months ago

Hi again, I was wondering if there is any update on this issue. Thanks!

JonnyS1226 commented 12 months ago

Sorry for the late reply. I have updated the repo. You can use infer_ensemble.py to do the ensembling. The expected average_mAP after ensembling is around 29.

ardarslan commented 12 months ago

No problem, thanks a lot for all your help!