gaopeng-eugene closed this issue 7 years ago
Here is the command line I use.
python train.py --vqa_trainsplit train --path_opt options/vqa/mutan_att_train.yaml
To summarize the result: I am training on the train set and evaluating on the val set. MUTAN+Att reaches 53, MUTAN with no attention reaches 50.
You're looking at the val accuracy, not the open-ended val accuracy. The latter can be obtained using eval_res.py. This file is automatically executed after each training epoch: https://github.com/Cadene/vqa.pytorch/blob/master/train.py#L287
eval_res.py generates the open-ended accuracy in a json file in the exp directory (logs). The open-ended accuracy can be viewed using plotly: https://github.com/Cadene/vqa.pytorch#monitor-training
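For quick inspection without plotly, a minimal sketch of reading such a json is below. The schema is an assumption for illustration (epoch numbers mapped to a dict with a hypothetical 'acc_overall' field); check what eval_res.py actually writes in your exp directory before relying on it:

```python
import json

def best_open_ended(path):
    """Return (best_epoch, best_acc) from a per-epoch accuracy json.

    Assumed (hypothetical) schema: {"0": {"acc_overall": 50.1}, ...}.
    Adjust the key names to match the file eval_res.py produces.
    """
    with open(path) as f:
        logs = json.load(f)
    # Pick the epoch whose overall open-ended accuracy is highest.
    return max(
        ((int(epoch), vals["acc_overall"]) for epoch, vals in logs.items()),
        key=lambda t: t[1],
    )
```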
Thank you so much for your quick reply. I will try your suggestion. Another question from reading your ICCV paper: in the paper, you compare with other methods in the No Attention and ensemble settings. Why not compare in the Single Model Attention setting?
A small question: what is the difference between val accuracy and open-ended val accuracy? As far as I know, there are two metrics in VQA, open-ended accuracy and MC?
Why not compare in the Single Model Attention setting?
It would have been a good idea, but we were really running out of time and space in the paper, so we focused on what we thought were the most important comparisons.
A small question: what is the difference between val accuracy and open-ended val accuracy?
Look at equation (13) in the paper: "If the predicted answer appears at least 3 times in the ground truth answers, the accuracy for this example is considered to be 1. Intuitively, this metric takes into account the consensus between annotators."
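The rule above can be sketched in a few lines. This is an illustrative version of the consensus formula min(#matching annotators / 3, 1); the official VQA evaluation script additionally normalizes answers and averages over subsets of annotators, which this sketch omits:

```python
def vqa_accuracy(predicted, ground_truth_answers):
    """Consensus accuracy for one example.

    predicted: the model's answer string.
    ground_truth_answers: the (typically 10) annotator answers.
    Returns min(matches / 3, 1): 3+ agreeing annotators give full credit.
    """
    matches = sum(a == predicted for a in ground_truth_answers)
    return min(matches / 3.0, 1.0)
```

So an answer given by 4 of 10 annotators scores 1.0, while one given by only 2 annotators scores 2/3.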
As far as I know, there are two metrics in VQA, open-ended accuracy and MC?
VQA OpenEnded and VQA MC are two different problems. MC stands for Multiple Choice (the candidate answers are given as inputs to the model).
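The difference at prediction time can be sketched as follows. This is an illustration, not the repo's code: OpenEnded takes the argmax over the full answer vocabulary, while MC restricts the argmax to the provided candidate answers:

```python
def predict_open_ended(scores, vocab):
    """OpenEnded: pick the highest-scoring answer over the whole vocabulary."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return vocab[best]

def predict_multiple_choice(scores, vocab, choices):
    """MC: pick the highest-scoring answer among the given candidates only."""
    index = {answer: i for i, answer in enumerate(vocab)}
    return max(choices, key=lambda answer: scores[index[answer]])
```

With the same score vector, the two settings can therefore return different answers, which is why their accuracies are not comparable.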
Thank you so much for your reply.
Hi, thank you so much for your code. Right now, I am trying to replicate your ICCV results with the pytorch implementation. Here is the setting:
{'batch_size': None, 'dir_logs': None, 'epochs': None, 'evaluate': False, 'help_opt': False, 'learning_rate': None, 'path_opt': 'options/vqa/mutan_att_trainval.yaml', 'print_freq': 10, 'resume': '', 'save_all_from': None, 'save_model': True, 'st_dropout': None, 'st_fixed_emb': None, 'st_type': None, 'start_epoch': 0, 'vqa_trainsplit': 'train', 'workers': 16}
options
{'coco': {'arch': 'fbresnet152torch', 'dir': 'data/coco', 'mode': 'att'}, 'logs': {'dir_logs': 'logs/vqa/mutan_att_trainval'}, 'model': {'arch': 'MutanAtt', 'attention': {'R': 5, 'activation_q': 'tanh', 'activation_v': 'tanh', 'dim_hq': 310, 'dim_hv': 310, 'dim_mm': 510, 'dropout_hv': 0, 'dropout_mm': 0.5, 'dropout_q': 0.5, 'dropout_v': 0.5, 'nb_glimpses': 2}, 'classif': {'dropout': 0.5}, 'dim_q': 2400, 'dim_v': 2048, 'fusion': {'R': 5, 'activation_q': 'tanh', 'activation_v': 'tanh', 'dim_hq': 310, 'dim_hv': 620, 'dim_mm': 510, 'dropout_hq': 0, 'dropout_hv': 0, 'dropout_q': 0.5, 'dropout_v': 0.5}, 'seq2vec': {'arch': 'skipthoughts', 'dir_st': 'data/skip-thoughts', 'dropout': 0.25, 'fixed_emb': False, 'type': 'BayesianUniSkip'}}, 'optim': {'batch_size': 128, 'epochs': 100, 'lr': 0.0001}, 'vqa': {'dataset': 'VQA', 'dir': 'data/vqa', 'maxlength': 26, 'minwcount': 0, 'nans': 2000, 'nlp': 'mcb', 'pad': 'right', 'samplingans': True, 'trainsplit': 'train'}}
Warning: 399/930911 words are not in dictionary, thus set UNK
Warning fusion.py: no visual embedding before fusion
Warning fusion.py: no question embedding before fusion
Warning fusion.py: no visual embedding before fusion
Warning fusion.py: no question embedding before fusion
Model has 37840812 parameters
Here is the result after 100 epochs:
Epoch: [99][1740/1760] Time 0.403 (0.412) Data 0.000 (0.007) Loss 0.8993 (0.9064) Acc@1 71.094 (73.912) Acc@5 94.531 (94.830)
Epoch: [99][1750/1760] Time 0.387 (0.412) Data 0.000 (0.007) Loss 0.8277 (0.9061) Acc@1 71.875 (73.915) Acc@5 95.312 (94.833)
Val: [900/950] Time 0.138 (0.188) Loss 3.1201 (2.8397) Acc@1 49.219 (52.236) Acc@5 75.000 (78.115)
Val: [910/950] Time 0.189 (0.187) Loss 2.4805 (2.8372) Acc@1 58.594 (52.240) Acc@5 80.469 (78.139)
Val: [920/950] Time 0.210 (0.187) Loss 2.8639 (2.8388) Acc@1 53.125 (52.226) Acc@5 77.344 (78.137)
Val: [930/950] Time 0.179 (0.187) Loss 2.1427 (2.8388) Acc@1 59.375 (52.227) Acc@5 82.031 (78.137)
Val: [940/950] Time 0.151 (0.187) Loss 3.1772 (2.8367) Acc@1 50.781 (52.263) Acc@5 72.656 (78.163)