runzeer opened this issue 4 years ago
Could you try the settings I used to submit the leaderboard entry?
Batch size 64,
LR 1e-4,
Epochs 20 (VizWiz is super small:
one epoch takes around 10 minutes while VQA takes 1.5 hours,
so we increased the number of epochs).
OK! I will try it soon! Thanks a lot! But I still have two questions about the training. Looking forward to your reply.
Thanks. I have uploaded the materials here: http://nlp.cs.unc.edu/data/lxmert_data/vizwiz/vizwiz.zip. Please take a look.
For the loss function, I just used the same CrossEntropy loss as for VQA/GQA.
Sorry to trouble you again. When I use the materials above, I get a KeyError at `target[self.raw_dataset.ans2label[ans]] = score`: KeyError: '1 package stouffer signature classics fettuccini alfredo'. I cannot find the cause, because I thought the key should be in the dict. Could you help me with this?
I think I just removed the answer if it was not in the dict.
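A minimal sketch of that workaround, assuming a VQA-style target builder where `ans2label` maps answer strings to indices (the function and variable names here are illustrative, not the exact LXMERT code):

```python
import torch

def build_target(answers_with_scores, ans2label):
    """Build a soft-label target vector, skipping answers that are
    not in the answer vocabulary (instead of raising KeyError)."""
    target = torch.zeros(len(ans2label))
    for ans, score in answers_with_scores.items():
        if ans not in ans2label:  # rare/long answers fall outside the top-3000 vocab
            continue
        target[ans2label[ans]] = score
    return target
```

With this change, out-of-vocabulary answers simply contribute nothing to the target, which matches the "remove the answer if it is not in the dict" fix above.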
OK! I found it! Thanks a lot!!
I checked the test file and found that the test files have changed. I wanted to use your Docker image, but the pretrained-model link below is out of date: https://www.dropbox.com/s/nu6jwhc88ujbw1v/resnet101_faster_rcnn_final_iter_320000.caffemodel?dl=1
So could you use your model to generate the new test data? Thanks a lot!
The new Dropbox link for the model has been updated on the bottom-up-attention repo, under its "alternative pretrained model" link.
OK! Thanks a lot!! I also wonder how you converted the answers to labels, especially how you assigned the label confidences.
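The thread does not spell out the exact rule, but a common convention (used by the official VQA evaluation) maps the number of annotators who gave an answer to a confidence via min(count/3, 1). A hedged sketch of that conversion:

```python
from collections import Counter

def answers_to_labels(annotator_answers):
    """Convert a list of annotator answers (typically 10 per question)
    into {answer: confidence} using the VQA-style min(count/3, 1) rule."""
    counts = Counter(annotator_answers)
    return {ans: min(n / 3.0, 1.0) for ans, n in counts.items()}
```

Under this rule an answer given by 3 or more annotators gets full confidence 1.0, and less frequent answers get fractional scores.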
Dear Prof.: I read about the VizWiz leaderboard for ECCV 2018. The reported result is 55.40 without model ensembling, but when I trained on the VizWiz dataset my result was only 51.96, so I would like to know why the results differ. The answer vocabulary for the VizWiz dataset was chosen from the 3000 most common answers. My initial learning rate is 5e-5, epochs are 4, and batch size is 32. The pretrained model I used is Epoch20_LXRT.pth. So, if convenient, could you share your hyperparameters for the VizWiz dataset?
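The top-3000 answer vocabulary mentioned above could be built like this (a generic sketch, not the repo's exact preprocessing script):

```python
from collections import Counter

def build_answer_vocab(all_answers, top_k=3000):
    """Keep the top_k most frequent answers and map each to a label index."""
    counts = Counter(all_answers)
    label2ans = [ans for ans, _ in counts.most_common(top_k)]
    ans2label = {ans: i for i, ans in enumerate(label2ans)}
    return ans2label, label2ans
```

Answers outside this vocabulary are exactly the ones that trigger the KeyError discussed earlier in the thread, so they need to be skipped when building targets.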