airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
MIT License

Can you provide a log file of lxmert pretraining #22


ParadoxZW commented 5 years ago

I am trying to reproduce the LXMERT pretraining experiment. Could you provide a log file, so that I can spot problems in the early epochs?

airsplay commented 5 years ago

Sure. Here is the log for the first 10 epochs. It was run with exactly this GitHub version but with slightly different hyperparameters (i.e., 20 epochs + BERT initialization + repeating VQA&MSCOCO to balance the amount of data). As a result, the mask_lm loss in this log might be lower in the first few epochs, but the trend should be almost the same.

I also observed that the obj_predict and attr_predict losses on the val set would go up (i.e., overfit). There is more discussion here.

The training loss for Epoch 1 is 7.3978
The losses are Mask_LM: 1.7000 Matched: 0.3337 Obj: 1.5459 Attr: 0.5252 Feat: 0.2981 QA: 2.9949 
Overall Accu 0.2104, gqa Accu 0.2569, visual7w Accu 0.1355, vqa Accu 0.3005, 
The valid loss is 5.3834
The losses are Mask_LM: 1.5037 Matched: 0.2868 Obj: 0.6320 Attr: 0.2603 Feat: 0.2448 QA: 2.4559 
Overall Accu 0.2418, gqa Accu 0.2821, visual7w Accu 0.1593, vqa Accu 0.3072, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:42:16<00:00,  1.52s/it]
The training loss for Epoch 2 is 4.8495
The losses are Mask_LM: 1.4634 Matched: 0.2475 Obj: 0.4201 Attr: 0.2156 Feat: 0.2419 QA: 2.2610 
Overall Accu 0.2501, gqa Accu 0.3072, visual7w Accu 0.1709, vqa Accu 0.3320, 
The valid loss is 5.0167
The losses are Mask_LM: 1.4236 Matched: 0.2662 Obj: 0.6116 Attr: 0.2488 Feat: 0.2316 QA: 2.2350 
Overall Accu 0.2680, gqa Accu 0.3215, visual7w Accu 0.1744, vqa Accu 0.3357, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:36:37<00:00,  1.51s/it]
The training loss for Epoch 3 is 4.4845
The losses are Mask_LM: 1.3955 Matched: 0.2310 Obj: 0.3448 Attr: 0.1872 Feat: 0.2336 QA: 2.0924 
Overall Accu 0.2659, gqa Accu 0.3314, visual7w Accu 0.1826, vqa Accu 0.3433, 
The valid loss is 4.8306
The losses are Mask_LM: 1.3758 Matched: 0.2535 Obj: 0.6184 Attr: 0.2410 Feat: 0.2254 QA: 2.1164 
Overall Accu 0.2723, gqa Accu 0.3235, visual7w Accu 0.1826, vqa Accu 0.3371, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:37:16<00:00,  1.51s/it]
The training loss for Epoch 4 is 4.3071
The losses are Mask_LM: 1.3568 Matched: 0.2220 Obj: 0.3090 Attr: 0.1727 Feat: 0.2286 QA: 2.0181 
Overall Accu 0.2721, gqa Accu 0.3397, visual7w Accu 0.1889, vqa Accu 0.3459, 
The valid loss is 4.7712
The losses are Mask_LM: 1.3555 Matched: 0.2502 Obj: 0.6144 Attr: 0.2392 Feat: 0.2209 QA: 2.0910 
Overall Accu 0.2703, gqa Accu 0.3283, visual7w Accu 0.1859, vqa Accu 0.3230, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:39:52<00:00,  1.51s/it]
The training loss for Epoch 5 is 4.1769
The losses are Mask_LM: 1.3261 Matched: 0.2149 Obj: 0.2839 Attr: 0.1621 Feat: 0.2252 QA: 1.9646 
Overall Accu 0.2761, gqa Accu 0.3448, visual7w Accu 0.1925, vqa Accu 0.3489, 
The valid loss is 4.7557
The losses are Mask_LM: 1.3381 Matched: 0.2457 Obj: 0.6339 Attr: 0.2433 Feat: 0.2185 QA: 2.0763 
Overall Accu 0.2757, gqa Accu 0.3340, visual7w Accu 0.1867, vqa Accu 0.3338, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:39:06<00:00,  1.51s/it]
The training loss for Epoch 6 is 4.0650
The losses are Mask_LM: 1.2986 Matched: 0.2087 Obj: 0.2629 Attr: 0.1532 Feat: 0.2226 QA: 1.9189 
Overall Accu 0.2865, gqa Accu 0.3597, visual7w Accu 0.1946, vqa Accu 0.3698, 
The valid loss is 4.6940
The losses are Mask_LM: 1.2990 Matched: 0.2422 Obj: 0.6410 Attr: 0.2432 Feat: 0.2166 QA: 2.0521 
Overall Accu 0.2896, gqa Accu 0.3504, visual7w Accu 0.1890, vqa Accu 0.3594, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:31:15<00:00,  1.50s/it]
The training loss for Epoch 7 is 3.9538
The losses are Mask_LM: 1.2719 Matched: 0.2038 Obj: 0.2450 Attr: 0.1455 Feat: 0.2205 QA: 1.8671 
Overall Accu 0.2957, gqa Accu 0.3744, visual7w Accu 0.1985, vqa Accu 0.3821, 
The valid loss is 4.6838
The losses are Mask_LM: 1.2941 Matched: 0.2368 Obj: 0.6598 Attr: 0.2422 Feat: 0.2148 QA: 2.0361 
Overall Accu 0.2903, gqa Accu 0.3524, visual7w Accu 0.1906, vqa Accu 0.3581, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:30:50<00:00,  1.50s/it]
The training loss for Epoch 8 is 3.8464
The losses are Mask_LM: 1.2512 Matched: 0.1990 Obj: 0.2291 Attr: 0.1385 Feat: 0.2188 QA: 1.8097 
Overall Accu 0.2987, gqa Accu 0.3780, visual7w Accu 0.2022, vqa Accu 0.3827, 
The valid loss is 4.6147
The losses are Mask_LM: 1.2709 Matched: 0.2323 Obj: 0.6559 Attr: 0.2426 Feat: 0.2127 QA: 2.0003 
Overall Accu 0.2912, gqa Accu 0.3571, visual7w Accu 0.1896, vqa Accu 0.3582, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:31:25<00:00,  1.50s/it]
The training loss for Epoch 9 is 3.7406
The losses are Mask_LM: 1.2282 Matched: 0.1949 Obj: 0.2152 Attr: 0.1324 Feat: 0.2173 QA: 1.7527 
Overall Accu 0.3021, gqa Accu 0.3818, visual7w Accu 0.2068, vqa Accu 0.3823, 
The valid loss is 4.5979
The losses are Mask_LM: 1.2550 Matched: 0.2317 Obj: 0.6618 Attr: 0.2406 Feat: 0.2114 QA: 1.9975 
Overall Accu 0.2882, gqa Accu 0.3482, visual7w Accu 0.1963, vqa Accu 0.3484, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:32:17<00:00,  1.50s/it]
The training loss for Epoch 10 is 3.6358
The losses are Mask_LM: 1.2092 Matched: 0.1915 Obj: 0.2028 Attr: 0.1267 Feat: 0.2159 QA: 1.6897 
Overall Accu 0.3050, gqa Accu 0.3845, visual7w Accu 0.2117, vqa Accu 0.3810, 
The valid loss is 4.6232
The losses are Mask_LM: 1.2474 Matched: 0.2315 Obj: 0.6770 Attr: 0.2461 Feat: 0.2104 QA: 2.0108 
Overall Accu 0.2840, gqa Accu 0.3445, visual7w Accu 0.1910, vqa Accu 0.3451,
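
If you want to check your own run against these numbers, below is a minimal parsing sketch that pulls the per-epoch train/valid losses out of a log in exactly this format (the file name `pretrain.log` is just a placeholder):

```python
import re

# Pull (epoch, train_loss, valid_loss) triples out of a log in the format above.
# Assumes each "The training loss for Epoch N is X" line is eventually followed
# by "The valid loss is Y" for the same epoch.
train_re = re.compile(r"The training loss for Epoch (\d+) is ([\d.]+)")
valid_re = re.compile(r"The valid loss is ([\d.]+)")

epochs = []
with open("pretrain.log") as f:  # placeholder path
    cur = None
    for line in f:
        m = train_re.search(line)
        if m:
            cur = [int(m.group(1)), float(m.group(2)), None]
            continue
        m = valid_re.search(line)
        if m and cur is not None:
            cur[2] = float(m.group(1))
            epochs.append(tuple(cur))
            cur = None

for epoch, train_loss, valid_loss in epochs:
    print(f"Epoch {epoch:2d}: train {train_loss:.4f}  valid {valid_loss:.4f}")
```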
j-min commented 5 years ago

Can you also upload the logs of 1) the last 10 epochs of the above training and 2) the new 12-epoch training?

airsplay commented 5 years ago

Sure. Logs for the last 10 epochs are attached; only epochs 11-17 are shown since the experiment is still running.

BTW, after 16 epochs of pre-training, the fine-tuning results (i.e., 69.86 on VQA and 74.49 on NLVR2) match the numbers from my released pre-training snapshot. ~~I will release the 20 weights (one for each epoch) and the full logs when it finishes.~~ The full log was lost when the server logs were cleaned... The 20 per-epoch snapshots are available here: https://nlp.cs.unc.edu/data/github_pretrain/lxmert20/EpochXX_LXRT.pth, XX from 01 to 20. Fine-tuning from the last epoch (i.e., XX=20) reaches the same results as the released pre-trained model.
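
If you want to grab all 20 snapshots at once, here is a minimal download loop (the destination directory `snap/pretrained` is just an example):

```python
import os
import urllib.request

# Fetch the per-epoch snapshots Epoch01_LXRT.pth ... Epoch20_LXRT.pth.
base = "https://nlp.cs.unc.edu/data/github_pretrain/lxmert20/Epoch%02d_LXRT.pth"
out_dir = "snap/pretrained"  # example destination, adjust as needed
os.makedirs(out_dir, exist_ok=True)

for epoch in range(1, 21):
    url = base % epoch
    dest = os.path.join(out_dir, os.path.basename(url))
    if not os.path.exists(dest):  # skip files that are already downloaded
        print("downloading", url)
        urllib.request.urlretrieve(url, dest)
```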

The logs for the new 12-epoch pre-training are unfortunately lost... I forgot to append `| tee` to the running commands. However, I have uploaded the weights after each epoch here: https://nlp.cs.unc.edu/data/github_pretrain/lxmert/EpochXX_LXRT.pth, XX from 01 to 12. The validation losses/accuracies can be computed by calling lxmert_pretrain::LXMERT::evaluate_epoch.
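
For reference, a rough sketch of how that evaluation over the 12 snapshots might look. The constructor arguments and the exact load/evaluate_epoch signatures below are assumptions, not copied from the repo; check src/pretrain/lxmert_pretrain.py for the real API.

```python
# Rough sketch only: constructor arguments and method signatures are assumed;
# see src/pretrain/lxmert_pretrain.py for the actual API.
from pretrain.lxmert_pretrain import LXMERT

lxmert = LXMERT(max_seq_length=20)  # assumption: matches the repo default
for epoch in range(1, 13):          # the 12 uploaded snapshots
    path = "snap/pretrained/Epoch%02d" % epoch
    lxmert.load(path)               # assumption: load() appends "_LXRT.pth"
    lxmert.evaluate_epoch()         # prints the validation losses/accuracies
```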

The training loss for Epoch 11 is 3.5359
The losses are Mask_LM: 1.1908 Matched: 0.1880 Obj: 0.1911 Attr: 0.1214 Feat: 0.2147 QA: 1.6299 
Overall Accu 0.3092, gqa Accu 0.3889, visual7w Accu 0.2171, vqa Accu 0.3828, 
The valid loss is 4.5564
The losses are Mask_LM: 1.2261 Matched: 0.2304 Obj: 0.6772 Attr: 0.2478 Feat: 0.2092 QA: 1.9657 
Overall Accu 0.2876, gqa Accu 0.3433, visual7w Accu 0.1969, vqa Accu 0.3498, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:23:10<00:00,  1.49s/it]
The training loss for Epoch 12 is 3.4354
The losses are Mask_LM: 1.1721 Matched: 0.1845 Obj: 0.1801 Attr: 0.1162 Feat: 0.2135 QA: 1.5689 
Overall Accu 0.3148, gqa Accu 0.3965, visual7w Accu 0.2220, vqa Accu 0.3861, 
The valid loss is 4.5573
The losses are Mask_LM: 1.2168 Matched: 0.2303 Obj: 0.6866 Attr: 0.2459 Feat: 0.2084 QA: 1.9693 
Overall Accu 0.2912, gqa Accu 0.3531, visual7w Accu 0.1959, vqa Accu 0.3537, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:27:58<00:00,  1.50s/it]
The training loss for Epoch 13 is 3.3357
The losses are Mask_LM: 1.1552 Matched: 0.1816 Obj: 0.1696 Attr: 0.1113 Feat: 0.2124 QA: 1.5057 
Overall Accu 0.3227, gqa Accu 0.4049, visual7w Accu 0.2283, vqa Accu 0.3970, 
The valid loss is 4.5651
The losses are Mask_LM: 1.2140 Matched: 0.2301 Obj: 0.6980 Attr: 0.2469 Feat: 0.2072 QA: 1.9689 
Overall Accu 0.2956, gqa Accu 0.3603, visual7w Accu 0.1970, vqa Accu 0.3599, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:30:56<00:00,  1.50s/it]
The training loss for Epoch 14 is 3.2383
The losses are Mask_LM: 1.1386 Matched: 0.1784 Obj: 0.1598 Attr: 0.1065 Feat: 0.2114 QA: 1.4436 
Overall Accu 0.3296, gqa Accu 0.4136, visual7w Accu 0.2341, vqa Accu 0.4036, 
The valid loss is 4.5303
The losses are Mask_LM: 1.1917 Matched: 0.2248 Obj: 0.7044 Attr: 0.2507 Feat: 0.2065 QA: 1.9522 
Overall Accu 0.2913, gqa Accu 0.3502, visual7w Accu 0.2023, vqa Accu 0.3487, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:30:31<00:00,  1.50s/it]
The training loss for Epoch 15 is 3.1418
The losses are Mask_LM: 1.1240 Matched: 0.1756 Obj: 0.1500 Attr: 0.1018 Feat: 0.2104 QA: 1.3800 
Overall Accu 0.3384, gqa Accu 0.4245, visual7w Accu 0.2410, vqa Accu 0.4129, 
The valid loss is 4.5739
The losses are Mask_LM: 1.1880 Matched: 0.2245 Obj: 0.7120 Attr: 0.2539 Feat: 0.2058 QA: 1.9898 
Overall Accu 0.2931, gqa Accu 0.3530, visual7w Accu 0.1995, vqa Accu 0.3551, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:32:03<00:00,  1.50s/it]
The training loss for Epoch 16 is 3.0503
The losses are Mask_LM: 1.1083 Matched: 0.1727 Obj: 0.1410 Attr: 0.0973 Feat: 0.2095 QA: 1.3215 
Overall Accu 0.3466, gqa Accu 0.4330, visual7w Accu 0.2473, vqa Accu 0.4248, 
The valid loss is 4.5356
The losses are Mask_LM: 1.1832 Matched: 0.2250 Obj: 0.7143 Attr: 0.2524 Feat: 0.2045 QA: 1.9562 
Overall Accu 0.3006, gqa Accu 0.3605, visual7w Accu 0.1998, vqa Accu 0.3714, 
100%|████████████████████████████████████████████████████████████████| 46742/46742 [19:24:57<00:00,  1.50s/it]
The training loss for Epoch 17 is 2.9553
The losses are Mask_LM: 1.0930 Matched: 0.1697 Obj: 0.1323 Attr: 0.0929 Feat: 0.2087 QA: 1.2587 
Overall Accu 0.3558, gqa Accu 0.4454, visual7w Accu 0.2548, vqa Accu 0.4324, 
The valid loss is 4.5562
The losses are Mask_LM: 1.1740 Matched: 0.2241 Obj: 0.7294 Attr: 0.2575 Feat: 0.2045 QA: 1.9668 
Overall Accu 0.2980, gqa Accu 0.3561, visual7w Accu 0.2016, vqa Accu 0.3650, 
 94%|██████████████████████████████████████████████████████████▏   | 43899/46742 [18:11:36<1:11:55,  1.52s/it]
j-min commented 5 years ago

Thanks!

ParadoxZW commented 4 years ago

Was the pretrained model you provided (at http://nlp.cs.unc.edu/data/model_LXRT.pth) trained for 20 epochs or 12 epochs? Can the 12-epoch pretrained model achieve 79% accuracy on the VQA dataset?

airsplay commented 4 years ago

This default model is trained in two stages: 10 epochs without the QA loss and 10 epochs with the QA loss. The current default 20-epoch single-stage pre-training with the QA loss reaches 69.86% on VQA, almost the same as the two-stage approach.

The result of 12-epoch single-stage pre-training on VQA is around 69.5, slightly lower than 20 epochs.

By the way, I am not sure whether 12-epoch pre-training with a higher learning rate / smaller batch size (and thus more updates) could reach a higher number. The pre-training hyperparameters are currently under-tuned.

kritiagg commented 4 years ago

Was anyone able to reproduce the results by pretraining from scratch? Although I see better accuracy numbers on all three test sets during pretraining, my fine-tuned results are still 1 point lower on VQA and GQA, and 3.3 points lower on NLVR2.

violetteshev commented 4 years ago

Hi @kritiagg, I tried to pretrain from scratch (12-epoch single-stage), and my result on VQA is 68.5. Did you manage to reproduce the results eventually?