zxzhou9 closed this issue 1 year ago
@zxzhou9 Can you please confirm the input feature size for your MSCOCO features? For MSCOCO, we extracted ResNet-101 features of the same size and used those to pretrain our model.
@aurooj Got it. I changed two files: (1) in modeling_capsbert.py I changed hw from 7 to 6; (2) in lxmert_pretrain.py, in the forward method of class LXMERT, I changed h from 7 to 6. However, if I change 6 back to 7, I get the error "all inputs arrays must have the same shape". I suspect this error is caused by my .tsv (.hdf5) file, which I downloaded following Hao Tan's instructions. Could you please upload your train2014_obj36.tsv, or the same features in another format, for reference?
The MSCOCO features used in the earlier two-stage training have shape [36, 2048], while the features downloaded from the GQA link you pointed to have shape [7, 7, 2048], which is effectively [49, 2048] once flattened. As a result, if you take the model trained on the MSCOCO features and fine-tune it on GQA, the input dimensions will not match.
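For anyone hitting the same mismatch, here is a minimal NumPy sketch of the shape arithmetic described above. It uses zero-filled placeholder arrays rather than real features, and the helper names `flatten_grid` and `grid_side` are hypothetical, not functions from this repository. It only illustrates why 36 positions correspond to a side of 6 and 49 positions to a side of 7, and that arrays of these two shapes cannot be stacked together.

```python
import math

import numpy as np

FEAT_DIM = 2048  # channel size quoted in this thread


def flatten_grid(feats):
    """Flatten grid features of shape [H, W, C] to [H*W, C]."""
    h, w, c = feats.shape
    return feats.reshape(h * w, c)


def grid_side(num_positions):
    """Return the side length s such that s * s == num_positions."""
    side = math.isqrt(num_positions)
    if side * side != num_positions:
        raise ValueError(f"{num_positions} positions do not form a square grid")
    return side


# Placeholder arrays with the shapes mentioned above (not real features).
coco_feats = np.zeros((36, FEAT_DIM), dtype=np.float32)
gqa_feats = np.zeros((7, 7, FEAT_DIM), dtype=np.float32)

gqa_flat = flatten_grid(gqa_feats)
print("MSCOCO:", coco_feats.shape, "side =", grid_side(coco_feats.shape[0]))
print("GQA:   ", gqa_flat.shape, "side =", grid_side(gqa_flat.shape[0]))

# Stacking arrays with different shapes fails, so the two feature sets
# cannot be mixed in one batch without resizing one of them.
try:
    np.stack([coco_feats, gqa_flat])
except ValueError as err:
    print("stack failed:", err)
```

Whether the right fix is to set the grid size to 6 for the MSCOCO features or to re-extract features so both datasets share one shape depends on the setup; the sketch only makes the mismatch explicit.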