e-bug closed this issue 3 years ago
Hi! Could you share the size of the pre-training data? I saw that you extended the training set with part of the validation set.
I think it was 100k images with 5 captions each. I used MSCOCO train+val data, which is conventional.