antoyang / just-ask

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
https://arxiv.org/abs/2012.00451
Apache License 2.0

Some results of the VQA-T model are not reproducible #12

Closed. fake-warrior8 closed this issue 1 year ago

fake-warrior8 commented 1 year ago

Hi, I ran the VQA-T model from scratch using the command:

python main_videoqa.py --checkpoint_dir=ft<dataset> --dataset=<dataset> --lr=0.00001 \
    --pretrain_path=<CKPT_PATH>

On MSRVTT-QA, MSVD-QA, Anet-QA, How2QA and iVQA, I got 40.2, 41.5, 33.8, 71.4 and 15.7 respectively, while the paper reports 39.6, 41.2, 36.8, 80.8 and 23.0. The first two match, but the last three are clearly lower. Do Anet-QA, How2QA and iVQA use different hyperparameters?
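
To make the size of each gap explicit, here is a small sketch that subtracts the paper's numbers from the reproduced ones. The values are copied from the message above; the variable names are only illustrative.

```python
# Results quoted in this thread, keyed by dataset name.
paper = {"MSRVTT-QA": 39.6, "MSVD-QA": 41.2, "Anet-QA": 36.8, "How2QA": 80.8, "iVQA": 23.0}
reproduced = {"MSRVTT-QA": 40.2, "MSVD-QA": 41.5, "Anet-QA": 33.8, "How2QA": 71.4, "iVQA": 15.7}

# Gap = reproduced - paper, rounded to one decimal to hide float noise.
gaps = {name: round(reproduced[name] - paper[name], 1) for name in paper}

for name, gap in gaps.items():
    print(f"{name:10s} {gap:+.1f}")
```

MSRVTT-QA and MSVD-QA come out slightly above the paper, while Anet-QA, How2QA and iVQA fall short by 3.0, 9.4 and 7.3 points.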

antoyang commented 1 year ago

Hi, all the hyperparameters I can remember are provided in the paper. You may try neighboring learning rates to see if that helps. To reproduce the main results of the paper, you may use the pretrained checkpoints.
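
A learning-rate sweep along these lines could be scripted as below. This is only a sketch: it uses the flags that appear in the command earlier in this thread, the list of learning rates is an assumed set of neighbors of 0.00001 and not values from the paper, and the `_lr...` suffix on the checkpoint directory is a hypothetical naming choice to keep runs separate. The `<dataset>` and `<CKPT_PATH>` placeholders are left as they are.

```python
def build_sweep_commands(dataset, pretrain_path, learning_rates):
    """Return one main_videoqa.py command line per learning rate."""
    commands = []
    for lr in learning_rates:
        args = [
            "python", "main_videoqa.py",
            # Hypothetical per-run directory so runs do not overwrite each other.
            f"--checkpoint_dir=ft{dataset}_lr{lr:g}",
            f"--dataset={dataset}",
            f"--lr={lr:g}",
            f"--pretrain_path={pretrain_path}",
        ]
        commands.append(" ".join(args))
    return commands


# Assumed neighbors of the original 1e-5; adjust as needed.
candidate_lrs = [5e-6, 1e-5, 2e-5, 5e-5]
commands = build_sweep_commands("<dataset>", "<CKPT_PATH>", candidate_lrs)

for command in commands:
    print(command)
```

The script only prints the commands, so they can be reviewed and the placeholders filled in before anything is launched.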

fake-warrior8 commented 1 year ago

It works, thank you!