Closed antoyang closed 4 years ago
Hi,
Thank you for your interest in my work, and sorry for the issue you ran into. My guess is that the gradients are exploding. I could not reproduce the issue on my PC, so I'm not sure whether the following will help: can you try setting max_norm in the gradient clipping (`nn.utils.clip_grad_norm_` in train.py) to a lower value, say 8 or even lower? Also, did you use my extracted features, or did you extract them yourself?
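For reference, clipping by global norm rescales all gradients so their combined L2 norm never exceeds max_norm, which keeps a single exploding batch from blowing up the weights. A minimal pure-Python sketch of what PyTorch's `nn.utils.clip_grad_norm_` does (the function here is illustrative, not code from this repo):

```python
import math

def clip_grad_norm(grads, max_norm):
    # Compute the global L2 norm over all gradient values.
    total_norm = math.sqrt(sum(g * g for g in grads))
    # If the norm exceeds max_norm, scale every gradient down uniformly.
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm

# Example: a gradient vector with norm 5.0 clipped to max_norm=3.0
clipped, norm = clip_grad_norm([3.0, 4.0], max_norm=3.0)
```

Lowering max_norm makes this rescaling kick in earlier, at the cost of slower effective learning when gradients are legitimately large.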
I tried reducing the gradient-clipping max_norm (down to 3), but it keeps happening. I do use your extracted visual features and processed the linguistic features as indicated.
Hi,
Thank you for reporting the issue. It looks like I uploaded the wrong files, which caused the problem. I've just re-uploaded new files. Can you please download them and try again?
Thank you!
Thanks for the quick fix, it is now indeed working :)
Hi, I have another question. What accuracy did you get on the TGIF-QA dataset when training on your machine? The accuracy I get from training on my machine is very different from the paper.
Hi,
The accuracy is pretty much the same as reported in the paper. There are some small differences (some better, some worse) between my public code and my local code, but they should not be large.
I would recommend extracting the visual features yourself using the commands in the README to reproduce the performance reported in the paper. If you find any issue, please feel free to open an issue on GitHub.
Thanks.
Okay, Thanks.
Hi,
I downloaded the code as well as the provided extracted features and ran all the TGIF-QA experiments again. Here are the results I got: Action: 75.8; Transition: 82.1; Count: 3.83; FrameQA: 55.8.
Cheers.
Hi,
Thank you for your answer. I asked because I trained three times and got different results each time. This is probably my mistake; I will re-download all the files and retrain.
Hi,
Thanks for your great work. I have no problem using the code for MSVD-QA / MSRVTT-QA / the 3 other TGIF-QA tasks, but when I train on the FrameQA subtask of TGIF-QA, the loss quickly becomes NaN (after about 80% of the first epoch) and the accuracy is 0. Do you have an idea why this happens?
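When debugging this kind of failure, it can help to fail fast on the first non-finite loss instead of letting NaNs silently corrupt the weights for the rest of the epoch. A hypothetical guard one could drop into a training loop (the function name and call site are illustrative, not from this repo):

```python
import math

def check_finite(loss_value, step):
    # Raise as soon as the loss turns NaN/inf, so the offending batch can be inspected.
    if not math.isfinite(loss_value):
        raise ValueError(f"non-finite loss {loss_value} at step {step}")
    return loss_value

# A NaN loss is caught immediately instead of propagating through the optimizer:
try:
    check_finite(float("nan"), step=1200)
    caught = False
except ValueError:
    caught = True
```

Logging the batch indices at the failing step would then show whether a specific FrameQA sample (or the wrongly uploaded feature file mentioned above) triggers the NaN.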