jnhwkim / cbp

Multimodal Compact Bilinear Pooling for Torch7

Could you share your torch code with cbp for VQA #12

Closed: cloud-waiting-for-wind closed this 7 years ago

cloud-waiting-for-wind commented 7 years ago

hi, I came here because of your answer in vqa-mcb. I want to use the cbp layer in Torch for VQA, but the result is very poor, and I am not sure whether it is due to a programming error on my part. Is there anything to pay attention to when using the cbp layer for VQA in Torch? Could you share your Torch code with the cbp layer for VQA?
thanks!

jnhwkim commented 7 years ago

The major issue with the cbp layer is that we have to fix both h and s after initialization, for both training and test. To do this, we can save the cbp layers we used and later load the saved layers and swap them back in, or just save the whole model using torch.save(). For instance, here is the way I used: after one forward pass, h and s are initialized, and I saved the cbp layers using torch.save().

Notice that the current MCB-VQA in Torch7 produces 62.11% (degraded by around 2% compared with the Caffe implementation). The difference from the original Caffe version may be attributable to preprocessing, the initialization of h and s, optimization, etc. (This MCB-VQA intentionally keeps other factors constrained to allow comparison with our model, MLB, so it is possible the hyperparameters are tuned for MLB.) Help me improve this MCB-VQA code!
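A minimal sketch of that save-after-first-forward workflow in Torch7, assuming this repo's nn.CompactBilinearPooling module; the require name, output dimension, and feature shapes below are illustrative assumptions, not taken verbatim from the repo:

```lua
-- Sketch: fix h and s by saving the layer after one forward pass.
-- Module/require names and shapes are assumptions for illustration.
require 'nn'
require 'cbp'  -- assumed require name for this repo's module

local dim = 16000                        -- count-sketch output dimension (assumed)
local mcb = nn.CompactBilinearPooling(dim)

-- One forward pass lazily initializes the random hash parameters h and s.
local v = torch.rand(1, 2048)            -- e.g. image feature
local q = torch.rand(1, 2048)            -- e.g. question feature
mcb:forward({v, q})

-- Persist the layer so training and test share the same h and s.
torch.save('cbp_layer.t7', mcb)

-- Later, or at test time: load and reuse the saved layer instead of
-- constructing a fresh one (which would resample h and s).
local mcbFixed = torch.load('cbp_layer.t7')
local out = mcbFixed:forward({v, q})
```

Alternatively, as noted above, saving the whole model with torch.save() captures the initialized cbp layers along with everything else.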