linjieli222 / HERO

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
https://arxiv.org/abs/2005.00200
MIT License

Error while pretraining with TV data #33

Closed · michaelmyc closed this 3 years ago

michaelmyc commented 3 years ago

I think I found a bug when running the pretrain.py script. I downloaded the pretrain-tv dataset with bash scripts/download_tv_pretrain.sh /path/to/storage and ran the pretrain.py script in Docker. The only change to the config file is smaller batch sizes. After 500 steps, the program runs validation and hits a "len() of a 0-d tensor" error inside validate_vsm.
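For context, this failure mode is easy to reproduce with plain PyTorch, independent of the HERO code: len() is only defined for tensors with at least one dimension and raises exactly this TypeError on a 0-d tensor.

```python
import torch

# len() works on tensors with at least one dimension...
batch_losses = torch.tensor([0.3, 0.7])
print(len(batch_losses))  # 2

# ...but a 0-d (scalar) tensor raises:
# TypeError: len() of a 0-d tensor
scalar_loss = torch.tensor(0.5)
len(scalar_loss)
```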

The error is caused by one of the input features having shape torch.Size([1, 60, 768]), which makes loss_neg_ctx a scalar (0-d) tensor, so len() fails on it. The simple fix that works for me is an inline conditional that detects the 0-dimensional case and returns 1. An alternative might be to reshape the scalar into a 1-element vector, but I have not tested that.
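A minimal sketch of both options, outside the HERO codebase (safe_len is a hypothetical helper name; in validate_vsm the guard would just be an inline conditional on loss_neg_ctx):

```python
import torch

def safe_len(t: torch.Tensor) -> int:
    # Hypothetical helper: treat a 0-d (scalar) tensor as length 1,
    # otherwise defer to the normal len().
    return len(t) if t.dim() > 0 else 1

# Scalar loss, as produced when the input feature batch has shape
# [1, 60, 768] and the negative-context loss reduces to a 0-d tensor:
loss_neg_ctx = torch.tensor(0.5)
print(safe_len(loss_neg_ctx))  # 1

# Untested alternative: promote the scalar to a 1-element vector, so
# downstream code can keep calling len() unconditionally.
loss_neg_ctx = loss_neg_ctx.reshape(-1) if loss_neg_ctx.dim() == 0 else loss_neg_ctx
print(len(loss_neg_ctx))  # 1
```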

Happy to open a PR if needed.

linjieli222 commented 3 years ago

@michaelmyc

Thanks for reporting and fixing the bug. Feel free to submit a PR.

michaelmyc commented 3 years ago

PR submitted. Closing issue. :smiley: