airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".

Unable to overfit on tiny dataset (pre-training) #81

hila-chefer opened this issue 4 years ago (status: Open)

hila-chefer commented 4 years ago

Hi!

I followed the second tip in https://github.com/airsplay/lxmert/blob/master/experience_in_pretraining.md and tried to overfit the model on a small dataset (I assumed that running run/lxmert_pretrain.bash with the --tiny flag would do the trick), but the model does not seem to overfit even after finishing its 20 epochs. Is this by design? Could you please provide a log of the pre-training on the tiny sets? The command I ran is sketched below.
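For reference, this is the kind of invocation I mean (a sketch: the GPU argument is a placeholder following the pattern of the repo's other run scripts; only the --tiny flag itself is the point in question):

```bash
# Pre-train on the tiny subset of the data (placeholder GPU id "0";
# --tiny is the small debug split mentioned in experience_in_pretraining.md).
bash run/lxmert_pretrain.bash 0 --tiny
```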

Thanks!

airsplay commented 4 years ago

I think that you might need to increase the number of epochs for these tiny datasets.
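For example (a sketch, assuming the run script forwards extra flags to src/pretrain/lxmert_pretrain.py and that its argument parser reads an --epochs value; if it does not, you can edit the epoch count inside run/lxmert_pretrain.bash directly):

```bash
# Re-run the tiny-subset pre-training with a much larger epoch budget;
# 200 is an arbitrary large value chosen to let the loss approach zero,
# not a tuned hyperparameter.
bash run/lxmert_pretrain.bash 0 --tiny --epochs 200
```

A tiny dataset gives very few parameter updates per epoch, so reaching near-zero loss typically takes many more epochs than a full-size run.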