facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Chinese image captioning: multiple words of the same type appear in the result #226

Closed: cylvzj closed this issue 4 years ago

cylvzj commented 4 years ago

❓ Questions and Help

Hello, I am using the COCO dataset with the BUTD model. I segmented the captions with jieba and used every word that occurs more than 3 times in the image descriptions as the vocabulary file, 14,226 words in total:

words = [w for w in word_freq.keys() if word_freq[w] > 3]
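The frequency filter described in the question can be sketched roughly as follows. This is only an illustration under assumptions: `build_vocab` and the toy captions are hypothetical, and the captions are taken as already segmented so the snippet runs without jieba (with jieba installed, each caption string would first be segmented, for example with `jieba.lcut`).

```python
from collections import Counter
from typing import Iterable, List


def build_vocab(tokenized_captions: Iterable[List[str]], min_count: int = 3) -> List[str]:
    """Return the words that occur more than `min_count` times, most frequent first."""
    word_freq = Counter()
    for tokens in tokenized_captions:
        word_freq.update(tokens)
    # Same rule as in the question: keep a word only if its count is > min_count.
    return [w for w, c in word_freq.most_common() if c > min_count]


# Toy, already-segmented captions (hypothetical data, not from the thread).
captions = (
    [["小", "女孩", "站", "在", "一起"]] * 4
    + [["笔记本", "电脑", "在", "床", "上"]] * 2
)
vocab = build_vocab(captions, min_count=3)
print(vocab)
```

In this toy data, words that appear only twice (such as 笔记本) are dropped. How the resulting list is written to the vocabulary file is not shown in the thread, so that step is left out here.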

After training the model, when using it, multiple words of the same type appear in the result, such as:

Note notebook laptop computer on bed
A little girl little girl girl standing together

How can I solve this problem? Should I reduce the weight of the concept reward? Any help is appreciated.

logs:
2020-02-20T00:47:10 INFO: coco:, 49700/50000, train/total_loss: 1.8166 (1.9215), train/caption_cross_entropy: 1.8166 (1.9215), train/caption_bleu4: 0.3368 (0.3277), val/total_loss: 3.2571, val/caption_cross_entropy: 3.2571, val/caption_bleu4: 0.5815, max mem: 14701.0, lr: 0., time: 04m 55s 025ms, eta: 15m 01s 892ms
2020-02-20T00:52:06 INFO: coco:, 49800/50000, train/total_loss: 1.8229 (1.9213), train/caption_cross_entropy: 1.8229 (1.9213), train/caption_bleu4: 0.3342 (0.3278), val/total_loss: 3.2468, val/caption_cross_entropy: 3.2468, val/caption_bleu4: 0.5905, max mem: 14701.0, lr: 0., time: 04m 56s 468ms, eta: 10m 04s 203ms
2020-02-20T00:57:07 INFO: coco:, 49900/50000, train/total_loss: 1.8390 (1.9212), train/caption_cross_entropy: 1.8390 (1.9212), train/caption_bleu4: 0.3342 (0.3278), val/total_loss: 3.3309, val/caption_cross_entropy: 3.3309, val/caption_bleu4: 0.5878, max mem: 14701.0, lr: 0., time: 05m 457ms, eta: 05m 06s 166ms
2020-02-20T01:02:07 INFO: coco:, 50000/50000, train/total_loss: 1.8454 (1.9210), train/caption_cross_entropy: 1.8454 (1.9210), train/caption_bleu4: 0.3346 (0.3278), val/total_loss: 3.3318, val/caption_cross_entropy: 3.3318, val/caption_bleu4: 0.5638, max mem: 14701.0, lr: 0., time: 04m 59s 208ms, eta:
2020-02-20T01:02:07 INFO: Evaluation time. Running on full validation set...
2020-02-20T01:02:32 INFO: coco: full val:, 50000/50000, val/total_loss: 3.3134, val/caption_cross_entropy: 3.3134, val/caption_bleu4: 0.5726, validation time: 50m 24s 331ms, best iteration: 18000, best val/caption_bleu4: 0.593812
2020-02-20T01:02:33 INFO: Stepping into final validation check
2020-02-20T01:02:33 INFO: Evaluation time. Running on full validation set...
2020-02-20T01:02:59 INFO: coco: full val:, 50001/50000, val/total_loss: 3.3128, val/caption_cross_entropy: 3.3128, val/caption_bleu4: 0.5726, validation time: 26s 637ms, best iteration: 18000, best val/caption_bleu4: 0.593812

apsdehal commented 4 years ago

This problem generally appears early in training and, with enough data, usually goes away as you train longer. If it persists, a simple solution is to set the pre-softmax scores (logits) of already predicted words to -inf, so that those words get zero probability and cannot be predicted again.
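The masking idea above can be sketched as a greedy decoding loop. This is a minimal, framework-free illustration and not MMF's actual decoder: `greedy_decode_no_repeat`, `step_fn`, and `toy_step` are hypothetical names, and `step_fn` stands in for whatever returns the per-token scores at each step.

```python
import math
from typing import Callable, List, Sequence


def greedy_decode_no_repeat(
    step_fn: Callable[[List[int]], Sequence[float]],
    eos_id: int,
    max_len: int = 20,
) -> List[int]:
    """Greedy decoding that never emits the same token id twice."""
    generated: List[int] = []
    for _ in range(max_len):
        logits = list(step_fn(generated))  # copy so the caller's scores are untouched
        for token_id in set(generated):
            # exp(-inf) == 0, so a softmax gives these tokens zero probability.
            logits[token_id] = -math.inf
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:
            break
        generated.append(next_id)
    return generated


# Toy scorer (hypothetical): prefers token 2 for three steps, then EOS (id 0).
def toy_step(prefix: List[int]) -> List[float]:
    return [0.1, 1.0, 3.0, 0.5] if len(prefix) < 3 else [5.0, 1.0, 3.0, 0.5]


print(greedy_decode_no_repeat(toy_step, eos_id=0))  # [2, 1, 3]
```

Without the mask, the toy scorer would emit token 2 three times in a row. One caveat with this design: masking every previously emitted word also blocks legitimate repeats of function words, so in practice it may be worth exempting such words or masking only within a short window.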

cylvzj commented 4 years ago

Thank you for the reply. The problem still exists. The logs are as follows:

nohup: ignoring input Logging to: ./save/captioning_coco_butd/logs/captioning_coco_butd_2020-03-05T23:43:09.log 2020-03-05T23:43:09 INFO: ===== Training Parameters ===== 2020-03-05T23:43:09 INFO: { "batch_size": 256, "clip_gradients": true, "clip_norm_mode": "all", "data_parallel": false, "device": "cuda", "distributed": false, "evalai_inference": false, "experiment_name": "run", "load_pretrained": false, "local_rank": null, "log_dir": "./logs", "log_interval": 100, "logger_level": "info", "lr_ratio": 0.1, "lr_scheduler": true, "lr_steps": [ 15000, 25000, 35000, 45000 ], "max_epochs": null, "max_grad_l2_norm": 0.25, "max_iterations": 50000, "metric_minimize": false, "monitored_metric": "caption_bleu4", "num_workers": 7, "patience": 4000, "pin_memory": false, "pretrained_mapping": {}, "resume": null, "resume_file": "./save/captioning_coco_butd/best.ckpt", "run_type": "train+inference", "save_dir": "./save", "seed": null, "should_early_stop": false, "should_not_log": false, "snapshot_interval": 1000, "task_size_proportional_sampling": true, "trainer": "base_trainer", "use_warmup": true, "verbose_dump": false, "warmup_factor": 0.2, "warmup_iterations": 1000 } 2020-03-05T23:43:09 INFO: ====== Task Attributes ====== 2020-03-05T23:43:09 INFO: ======== captioning/coco ======= 2020-03-05T23:43:09 INFO: { "data_root_dir": "../data", "fast_read": false, "features_max_len": 100, "image_depth_first": false, "image_features": { "test": [ "coco/detectron_fix_100/fc6/train_val_2014" ], "train": [ "coco/detectron_fix_100/fc6/train_val_2014" ], "val": [ "coco/detectron_fix_100/fc6/train_val_2014" ] }, "imdb_files": { "test": [ "imdb/coco_captions/imdb_karpathy_test.npy" ], "train": [ "imdb/coco_captions/imdb_karpathy_train.npy" ], "val": [ "imdb/coco_captions/imdb_karpathy_val.npy" ] }, "min_captions_per_img": 5, "processors": { "caption_processor": { "params": { "vocab": { "embedding_name": "glove.6B.300d", "type": "intersected", "vocab_file": 
"vocabs/vocabulary_captioning_thresh5.txt" } }, "type": "caption" }, "text_processor": { "params": { "max_length": 52, "preprocessor": { "params": {}, "type": "simple_sentence" }, "vocab": { "embedding_name": "glove.6B.300d", "type": "intersected", "vocab_file": "vocabs/vocabulary_captioning_thresh5.txt" } }, "type": "vocab" } }, "return_info": false, "use_ocr": false, "use_ocr_info": false } 2020-03-05T23:43:09 INFO: ====== Optimizer Attributes ====== 2020-03-05T23:43:09 INFO: { "params": { "eps": 1e-08, "lr": 0.01, "weight_decay": 0 }, "type": "Adamax" } 2020-03-05T23:43:09 INFO: ====== Model (butd) Attributes ====== 2020-03-05T23:43:09 INFO: { "classifier": { "params": { "dropout": 0.5, "fc_bias_init": 0, "feature_dim": 2048, "hidden_dim": 1024 }, "type": "language_decoder" }, "embedding_dim": 300, "image_feature_dim": 2048, "image_feature_embeddings": [ { "modal_combine": { "params": { "attention_dim": 1024, "dropout": 0.5, "hidden_dim": 1024 }, "type": "top_down_attention_lstm" }, "normalization": "softmax", "transform": { "params": { "out_dim": 1 }, "type": "linear" } } ], "image_feature_encodings": [ { "params": { "bias_file": "detectron/fc6/fc7_b.pkl", "weights_file": "detectron/fc6/fc7_w.pkl" }, "type": "finetune_faster_rcnn_fpn_fc7" } ], "inference": { "type": "greedy" }, "losses": [ { "type": "caption_cross_entropy" } ], "metrics": [ { "type": "caption_bleu4" } ], "model_data_dir": "../data/" } 2020-03-05T23:43:09 INFO: Loading tasks and data 2020-03-05T23:43:20 INFO: CUDA Device 0 is: Tesla P40 2020-03-05T23:43:30 INFO: Torch version is: 1.1.0 2020-03-05T23:43:30 INFO: Loading checkpoint 2020-03-05T23:43:31 INFO: Checkpoint loaded 2020-03-05T23:43:31 INFO: ===== Model ===== 2020-03-05T23:43:31 INFO: BUTD( (word_embedding): Embedding(11010, 300) (image_feature_encoders): ModuleList( (0): ImageEncoder( (module): FinetuneFasterRcnnFpnFc7( (lc): Linear(in_features=2048, out_features=2048, bias=True) ) ) ) (image_feature_embeddings_list): ModuleList( (0): 
ModuleList( (0): ImageEmbedding( (image_attention_model): AttentionLayer( (module): TopDownAttention( (combination_layer): ModalCombineLayer( (module): TopDownAttentionLSTM( (fa_image): Linear(in_features=2048, out_features=1024, bias=True) (fa_hidden): Linear(in_features=1024, out_features=1024, bias=True) (top_down_lstm): LSTMCell(3372, 1024) (relu): ReLU() (dropout): Dropout(p=0.5) ) ) (transform): TransformLayer( (module): LinearTransform( (lc): Linear(in_features=1024, out_features=1, bias=True) ) ) ) ) ) ) ) (classifier): ClassifierLayer( (module): LanguageDecoder( (language_lstm): LSTMCell(3072, 1024) (fc): Linear(in_features=1024, out_features=11010, bias=True) (dropout): Dropout(p=0.5) ) ) (losses): Losses() ) 2020-03-05T23:43:31 INFO: Starting training... 2020-03-05T23:48:09 INFO: coco:, 24100/50000, train/total_loss: 2.0885 (2.0849), train/caption_cross_entropy: 2.0885 (2.0849), train/caption_bleu4: 0.2596 (0.2642), val/total_loss: 2.4352, val/caption_cross_entropy: 2.4352, val/caption_bleu4: 0.2125, max mem: 12422.0, lr: 0.001, time: 04m 33s 884ms, eta: 20h 04m 43s 769ms 2020-03-05T23:52:38 INFO: coco:, 24200/50000, train/total_loss: 2.0759 (2.0808), train/caption_cross_entropy: 2.0759 (2.0808), train/caption_bleu4: 0.2592 (0.2634), val/total_loss: 2.3085, val/caption_cross_entropy: 2.3085, val/caption_bleu4: 0.2319, max mem: 12539.0, lr: 0.001, time: 04m 27s 367ms, eta: 19h 31m 31s 579ms 2020-03-05T23:57:12 INFO: coco:, 24300/50000, train/total_loss: 2.0651 (2.0779), train/caption_cross_entropy: 2.0651 (2.0779), train/caption_bleu4: 0.2620 (0.2636), val/total_loss: 2.2955, val/caption_cross_entropy: 2.2955, val/caption_bleu4: 0.2307, max mem: 12539.0, lr: 0.001, time: 04m 29s 797ms, eta: 19h 37m 35s 425ms 2020-03-06T00:01:44 INFO: coco:, 24400/50000, train/total_loss: 2.0647 (2.0761), train/caption_cross_entropy: 2.0647 (2.0761), train/caption_bleu4: 0.2607 (0.2636), val/total_loss: 2.3735, val/caption_cross_entropy: 2.3735, val/caption_bleu4: 0.1807, 
max mem: 12539.0, lr: 0.001, time: 04m 34s 159ms, eta: 19h 51m 58s 239ms 2020-03-06T00:06:21 INFO: coco:, 24500/50000, train/total_loss: 2.0833 (2.0774), train/caption_cross_entropy: 2.0833 (2.0774), train/caption_bleu4: 0.2611 (0.2636), val/total_loss: 2.2738, val/caption_cross_entropy: 2.2738, val/caption_bleu4: 0.2530, max mem: 12539.0, lr: 0.001, time: 04m 32s 347ms, eta: 19h 39m 28s 082ms 2020-03-06T00:10:54 INFO: coco:, 24600/50000, train/total_loss: 2.0526 (2.0782), train/caption_cross_entropy: 2.0526 (2.0782), train/caption_bleu4: 0.2669 (0.2637), val/total_loss: 2.4184, val/caption_cross_entropy: 2.4184, val/caption_bleu4: 0.2201, max mem: 12539.0, lr: 0.001, time: 04m 34s 793ms, eta: 19h 45m 23s 773ms 2020-03-06T00:15:30 INFO: coco:, 24700/50000, train/total_loss: 2.0822 (2.0778), train/caption_cross_entropy: 2.0822 (2.0778), train/caption_bleu4: 0.2639 (0.2634), val/total_loss: 2.2936, val/caption_cross_entropy: 2.2936, val/caption_bleu4: 0.2359, max mem: 12539.0, lr: 0.001, time: 04m 36s 422ms, eta: 19h 47m 43s 644ms 2020-03-06T00:20:12 INFO: coco:, 24800/50000, train/total_loss: 2.0814 (2.0786), train/caption_cross_entropy: 2.0814 (2.0786), train/caption_bleu4: 0.2632 (0.2634), val/total_loss: 2.2328, val/caption_cross_entropy: 2.2328, val/caption_bleu4: 0.2358, max mem: 12554.0, lr: 0.001, time: 04m 40s 851ms, eta: 20h 01m 59s 213ms 2020-03-06T00:24:50 INFO: coco:, 24900/50000, train/total_loss: 2.0744 (2.0790), train/caption_cross_entropy: 2.0744 (2.0790), train/caption_bleu4: 0.2594 (0.2632), val/total_loss: 2.4092, val/caption_cross_entropy: 2.4092, val/caption_bleu4: 0.2172, max mem: 12554.0, lr: 0.001, time: 04m 39s 071ms, eta: 19h 49m 37s 942ms 2020-03-06T00:29:30 INFO: coco:, 25000/50000, train/total_loss: 2.1028 (2.0802), train/caption_cross_entropy: 2.1028 (2.0802), train/caption_bleu4: 0.2631 (0.2631), val/total_loss: 2.3125, val/caption_cross_entropy: 2.3125, val/caption_bleu4: 0.2358, max mem: 12554.0, lr: 0.0001, time: 04m 38s 956ms, eta: 
19h 44m 24s 166ms 2020-03-06T00:29:30 INFO: Evaluation time. Running on full validation set... 2020-03-06T00:30:02 INFO: coco: full val:, 25000/50000, val/total_loss: 2.3102, val/caption_cross_entropy: 2.3102, val/caption_bleu4: 0.2295, validation time: 46m 22s 713ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T00:34:30 INFO: coco:, 25100/50000, train/total_loss: 2.0830 (2.0804), train/caption_cross_entropy: 2.0830 (2.0804), train/caption_bleu4: 0.2648 (0.2630), val/total_loss: 2.3814, val/caption_cross_entropy: 2.3814, val/caption_bleu4: 0.2246, max mem: 12554.0, lr: 0.0001, time: 05m 06s 735ms, eta: 21h 37m 08s 297ms 2020-03-06T00:39:02 INFO: coco:, 25200/50000, train/total_loss: 2.0779 (2.0807), train/caption_cross_entropy: 2.0779 (2.0807), train/caption_bleu4: 0.2603 (0.2631), val/total_loss: 2.2814, val/caption_cross_entropy: 2.2814, val/caption_bleu4: 0.2179, max mem: 12554.0, lr: 0.0001, time: 04m 31s 074ms, eta: 19h 01m 43s 847ms 2020-03-06T00:43:47 INFO: coco:, 25300/50000, train/total_loss: 2.0531 (2.0808), train/caption_cross_entropy: 2.0531 (2.0808), train/caption_bleu4: 0.2634 (0.2631), val/total_loss: 2.3531, val/caption_cross_entropy: 2.3531, val/caption_bleu4: 0.2391, max mem: 12591.0, lr: 0.0001, time: 04m 39s 416ms, eta: 19h 32m 07s 304ms 2020-03-06T00:48:24 INFO: coco:, 25400/50000, train/total_loss: 2.0908 (2.0807), train/caption_cross_entropy: 2.0908 (2.0807), train/caption_bleu4: 0.2633 (0.2632), val/total_loss: 2.4159, val/caption_cross_entropy: 2.4159, val/caption_bleu4: 0.2077, max mem: 12591.0, lr: 0.0001, time: 04m 38s 348ms, eta: 19h 22m 54s 642ms 2020-03-06T00:52:59 INFO: coco:, 25500/50000, train/total_loss: 2.0947 (2.0807), train/caption_cross_entropy: 2.0947 (2.0807), train/caption_bleu4: 0.2608 (0.2633), val/total_loss: 2.2805, val/caption_cross_entropy: 2.2805, val/caption_bleu4: 0.2434, max mem: 12591.0, lr: 0.0001, time: 04m 34s 260ms, eta: 19h 01m 10s 572ms 2020-03-06T00:57:33 INFO: coco:, 25600/50000, 
train/total_loss: 2.0676 (2.0808), train/caption_cross_entropy: 2.0676 (2.0808), train/caption_bleu4: 0.2615 (0.2634), val/total_loss: 2.2104, val/caption_cross_entropy: 2.2104, val/caption_bleu4: 0.2552, max mem: 12591.0, lr: 0.0001, time: 04m 32s 343ms, eta: 18h 48m 34s 466ms 2020-03-06T01:02:18 INFO: coco:, 25700/50000, train/total_loss: 2.0701 (2.0803), train/caption_cross_entropy: 2.0701 (2.0803), train/caption_bleu4: 0.2682 (0.2635), val/total_loss: 2.3795, val/caption_cross_entropy: 2.3795, val/caption_bleu4: 0.2232, max mem: 12591.0, lr: 0.0001, time: 04m 48s 711ms, eta: 19h 51m 29s 961ms 2020-03-06T01:06:52 INFO: coco:, 25800/50000, train/total_loss: 2.0457 (2.0794), train/caption_cross_entropy: 2.0457 (2.0794), train/caption_bleu4: 0.2646 (0.2637), val/total_loss: 2.3254, val/caption_cross_entropy: 2.3254, val/caption_bleu4: 0.2136, max mem: 12591.0, lr: 0.0001, time: 04m 33s 800ms, eta: 18h 45m 18s 542ms 2020-03-06T01:11:31 INFO: coco:, 25900/50000, train/total_loss: 2.0568 (2.0779), train/caption_cross_entropy: 2.0568 (2.0779), train/caption_bleu4: 0.2687 (0.2639), val/total_loss: 2.3195, val/caption_cross_entropy: 2.3195, val/caption_bleu4: 0.2034, max mem: 12591.0, lr: 0.0001, time: 04m 36s 966ms, eta: 18h 53m 37s 191ms 2020-03-06T01:15:55 INFO: coco:, 26000/50000, train/total_loss: 2.0770 (2.0773), train/caption_cross_entropy: 2.0770 (2.0773), train/caption_bleu4: 0.2607 (0.2639), val/total_loss: 2.3670, val/caption_cross_entropy: 2.3670, val/caption_bleu4: 0.2265, max mem: 12591.0, lr: 0.0001, time: 04m 24s 971ms, eta: 18h 01s 520ms 2020-03-06T01:15:55 INFO: Evaluation time. Running on full validation set... 
2020-03-06T01:16:23 INFO: coco: full val:, 26000/50000, val/total_loss: 2.3091, val/caption_cross_entropy: 2.3091, val/caption_bleu4: 0.2294, validation time: 46m 14s 448ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T01:20:48 INFO: coco:, 26100/50000, train/total_loss: 2.0290 (2.0763), train/caption_cross_entropy: 2.0290 (2.0763), train/caption_bleu4: 0.2667 (0.2643), val/total_loss: 2.4316, val/caption_cross_entropy: 2.4316, val/caption_bleu4: 0.2163, max mem: 12591.0, lr: 0.0001, time: 04m 58s 520ms, eta: 20h 11m 41s 958ms 2020-03-06T01:25:11 INFO: coco:, 26200/50000, train/total_loss: 2.0493 (2.0759), train/caption_cross_entropy: 2.0493 (2.0759), train/caption_bleu4: 0.2611 (0.2644), val/total_loss: 2.1272, val/caption_cross_entropy: 2.1272, val/caption_bleu4: 0.2323, max mem: 12591.0, lr: 0.0001, time: 04m 22s 348ms, eta: 17h 40m 25s 300ms 2020-03-06T01:29:41 INFO: coco:, 26300/50000, train/total_loss: 2.0561 (2.0756), train/caption_cross_entropy: 2.0561 (2.0756), train/caption_bleu4: 0.2583 (0.2643), val/total_loss: 2.4057, val/caption_cross_entropy: 2.4057, val/caption_bleu4: 0.2359, max mem: 12591.0, lr: 0.0001, time: 04m 23s 222ms, eta: 17h 39m 29s 052ms 2020-03-06T01:34:10 INFO: coco:, 26400/50000, train/total_loss: 2.0851 (2.0751), train/caption_cross_entropy: 2.0851 (2.0751), train/caption_bleu4: 0.2653 (0.2644), val/total_loss: 2.3319, val/caption_cross_entropy: 2.3319, val/caption_bleu4: 0.2309, max mem: 12591.0, lr: 0.0001, time: 04m 31s 964ms, eta: 18h 10m 03s 215ms 2020-03-06T01:38:36 INFO: coco:, 26500/50000, train/total_loss: 2.0534 (2.0749), train/caption_cross_entropy: 2.0534 (2.0749), train/caption_bleu4: 0.2669 (0.2644), val/total_loss: 2.3424, val/caption_cross_entropy: 2.3424, val/caption_bleu4: 0.2264, max mem: 12591.0, lr: 0.0001, time: 04m 24s 486ms, eta: 17h 35m 35s 363ms 2020-03-06T01:43:02 INFO: coco:, 26600/50000, train/total_loss: 2.0745 (2.0744), train/caption_cross_entropy: 2.0745 (2.0744), 
train/caption_bleu4: 0.2625 (0.2644), val/total_loss: 2.2971, val/caption_cross_entropy: 2.2971, val/caption_bleu4: 0.2290, max mem: 12591.0, lr: 0.0001, time: 04m 24s 727ms, eta: 17h 32m 03s 175ms 2020-03-06T01:47:34 INFO: coco:, 26700/50000, train/total_loss: 2.0316 (2.0736), train/caption_cross_entropy: 2.0316 (2.0736), train/caption_bleu4: 0.2712 (0.2646), val/total_loss: 2.2555, val/caption_cross_entropy: 2.2555, val/caption_bleu4: 0.2320, max mem: 12591.0, lr: 0.0001, time: 04m 33s 434ms, eta: 18h 02m 832ms 2020-03-06T01:52:02 INFO: coco:, 26800/50000, train/total_loss: 2.0379 (2.0730), train/caption_cross_entropy: 2.0379 (2.0730), train/caption_bleu4: 0.2652 (0.2647), val/total_loss: 2.2059, val/caption_cross_entropy: 2.2059, val/caption_bleu4: 0.2352, max mem: 12591.0, lr: 0.0001, time: 04m 26s 492ms, eta: 17h 30m 865ms 2020-03-06T01:56:36 INFO: coco:, 26900/50000, train/total_loss: 2.0427 (2.0727), train/caption_cross_entropy: 2.0427 (2.0727), train/caption_bleu4: 0.2637 (0.2649), val/total_loss: 2.3496, val/caption_cross_entropy: 2.3496, val/caption_bleu4: 0.2065, max mem: 12591.0, lr: 0.0001, time: 04m 33s 359ms, eta: 17h 52m 25s 760ms 2020-03-06T02:01:01 INFO: coco:, 27000/50000, train/total_loss: 2.0555 (2.0720), train/caption_cross_entropy: 2.0555 (2.0720), train/caption_bleu4: 0.2672 (0.2650), val/total_loss: 2.2115, val/caption_cross_entropy: 2.2115, val/caption_bleu4: 0.2334, max mem: 12591.0, lr: 0.0001, time: 04m 26s 298ms, eta: 17h 20m 12s 330ms 2020-03-06T02:01:01 INFO: Evaluation time. Running on full validation set... 
2020-03-06T02:01:29 INFO: coco: full val:, 27000/50000, val/total_loss: 2.3066, val/caption_cross_entropy: 2.3066, val/caption_bleu4: 0.2291, validation time: 44m 59s 219ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T02:05:47 INFO: coco:, 27100/50000, train/total_loss: 2.0563 (2.0714), train/caption_cross_entropy: 2.0563 (2.0714), train/caption_bleu4: 0.2635 (0.2651), val/total_loss: 2.2320, val/caption_cross_entropy: 2.2320, val/caption_bleu4: 0.2341, max mem: 12591.0, lr: 0.0001, time: 04m 52s 344ms, eta: 18h 56m 58s 791ms 2020-03-06T02:10:09 INFO: coco:, 27200/50000, train/total_loss: 2.0529 (2.0708), train/caption_cross_entropy: 2.0529 (2.0708), train/caption_bleu4: 0.2696 (0.2652), val/total_loss: 2.2671, val/caption_cross_entropy: 2.2671, val/caption_bleu4: 0.2235, max mem: 12591.0, lr: 0.0001, time: 04m 21s 142ms, eta: 16h 51m 11s 824ms 2020-03-06T02:14:42 INFO: coco:, 27300/50000, train/total_loss: 2.0466 (2.0703), train/caption_cross_entropy: 2.0466 (2.0703), train/caption_bleu4: 0.2743 (0.2653), val/total_loss: 2.3462, val/caption_cross_entropy: 2.3462, val/caption_bleu4: 0.2073, max mem: 12591.0, lr: 0.0001, time: 04m 30s 638ms, eta: 17h 23m 22s 258ms 2020-03-06T02:19:08 INFO: coco:, 27400/50000, train/total_loss: 2.0547 (2.0701), train/caption_cross_entropy: 2.0547 (2.0701), train/caption_bleu4: 0.2659 (0.2655), val/total_loss: 2.3202, val/caption_cross_entropy: 2.3202, val/caption_bleu4: 0.2221, max mem: 12591.0, lr: 0.0001, time: 04m 25s 245ms, eta: 16h 58m 04s 475ms 2020-03-06T02:23:34 INFO: coco:, 27500/50000, train/total_loss: 2.0800 (2.0698), train/caption_cross_entropy: 2.0800 (2.0698), train/caption_bleu4: 0.2668 (0.2655), val/total_loss: 2.3008, val/caption_cross_entropy: 2.3008, val/caption_bleu4: 0.2256, max mem: 12591.0, lr: 0.0001, time: 04m 25s 723ms, eta: 16h 55m 23s 866ms 2020-03-06T02:28:02 INFO: coco:, 27600/50000, train/total_loss: 2.0442 (2.0693), train/caption_cross_entropy: 2.0442 (2.0693), 
train/caption_bleu4: 0.2720 (0.2656), val/total_loss: 2.2564, val/caption_cross_entropy: 2.2564, val/caption_bleu4: 0.2322, max mem: 12591.0, lr: 0.0001, time: 04m 25s 508ms, eta: 16h 50m 03s 853ms 2020-03-06T02:32:32 INFO: coco:, 27700/50000, train/total_loss: 2.0381 (2.0689), train/caption_cross_entropy: 2.0381 (2.0689), train/caption_bleu4: 0.2591 (0.2656), val/total_loss: 2.2625, val/caption_cross_entropy: 2.2625, val/caption_bleu4: 0.2551, max mem: 12591.0, lr: 0.0001, time: 04m 31s 014ms, eta: 17h 06m 24s 460ms 2020-03-06T02:36:59 INFO: coco:, 27800/50000, train/total_loss: 2.0888 (2.0686), train/caption_cross_entropy: 2.0888 (2.0686), train/caption_bleu4: 0.2557 (0.2656), val/total_loss: 2.2201, val/caption_cross_entropy: 2.2201, val/caption_bleu4: 0.2650, max mem: 12591.0, lr: 0.0001, time: 04m 27s 655ms, eta: 16h 49m 08s 432ms 2020-03-06T02:41:25 INFO: coco:, 27900/50000, train/total_loss: 2.0550 (2.0683), train/caption_cross_entropy: 2.0550 (2.0683), train/caption_bleu4: 0.2643 (0.2656), val/total_loss: 2.3156, val/caption_cross_entropy: 2.3156, val/caption_bleu4: 0.2278, max mem: 12591.0, lr: 0.0001, time: 04m 24s 178ms, eta: 16h 31m 32s 823ms 2020-03-06T02:45:57 INFO: coco:, 28000/50000, train/total_loss: 2.0611 (2.0680), train/caption_cross_entropy: 2.0611 (2.0680), train/caption_bleu4: 0.2679 (0.2656), val/total_loss: 2.2580, val/caption_cross_entropy: 2.2580, val/caption_bleu4: 0.2442, max mem: 12591.0, lr: 0.0001, time: 04m 32s 234ms, eta: 16h 57m 09s 615ms 2020-03-06T02:45:57 INFO: Evaluation time. Running on full validation set... 
2020-03-06T02:46:25 INFO: coco: full val:, 28000/50000, val/total_loss: 2.3107, val/caption_cross_entropy: 2.3107, val/caption_bleu4: 0.2289, validation time: 44m 49s 389ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T02:50:49 INFO: coco:, 28100/50000, train/total_loss: 2.0576 (2.0678), train/caption_cross_entropy: 2.0576 (2.0678), train/caption_bleu4: 0.2581 (0.2656), val/total_loss: 2.2906, val/caption_cross_entropy: 2.2906, val/caption_bleu4: 0.2073, max mem: 12591.0, lr: 0.0001, time: 04m 59s 299ms, eta: 18h 33m 11s 893ms 2020-03-06T02:55:09 INFO: coco:, 28200/50000, train/total_loss: 2.0532 (2.0676), train/caption_cross_entropy: 2.0532 (2.0676), train/caption_bleu4: 0.2678 (0.2657), val/total_loss: 2.4127, val/caption_cross_entropy: 2.4127, val/caption_bleu4: 0.2401, max mem: 12591.0, lr: 0.0001, time: 04m 18s 859ms, eta: 15h 58m 23s 682ms 2020-03-06T02:59:39 INFO: coco:, 28300/50000, train/total_loss: 2.0477 (2.0673), train/caption_cross_entropy: 2.0477 (2.0673), train/caption_bleu4: 0.2714 (0.2657), val/total_loss: 2.2797, val/caption_cross_entropy: 2.2797, val/caption_bleu4: 0.2405, max mem: 12591.0, lr: 0.0001, time: 04m 23s 177ms, eta: 16h 09m 54s 505ms 2020-03-06T03:04:02 INFO: coco:, 28400/50000, train/total_loss: 2.0485 (2.0669), train/caption_cross_entropy: 2.0485 (2.0669), train/caption_bleu4: 0.2686 (0.2658), val/total_loss: 2.3379, val/caption_cross_entropy: 2.3379, val/caption_bleu4: 0.2303, max mem: 12591.0, lr: 0.0001, time: 04m 25s 998ms, eta: 16h 15m 47s 385ms 2020-03-06T03:08:33 INFO: coco:, 28500/50000, train/total_loss: 2.0497 (2.0667), train/caption_cross_entropy: 2.0497 (2.0667), train/caption_bleu4: 0.2638 (0.2658), val/total_loss: 2.3432, val/caption_cross_entropy: 2.3432, val/caption_bleu4: 0.2410, max mem: 12591.0, lr: 0.0001, time: 04m 28s 904ms, eta: 16h 21m 52s 869ms 2020-03-06T03:13:03 INFO: coco:, 28600/50000, train/total_loss: 2.0328 (2.0665), train/caption_cross_entropy: 2.0328 (2.0665), 
train/caption_bleu4: 0.2736 (0.2658), val/total_loss: 2.2799, val/caption_cross_entropy: 2.2799, val/caption_bleu4: 0.2129, max mem: 12591.0, lr: 0.0001, time: 04m 28s 131ms, eta: 16h 14m 30s 358ms 2020-03-06T03:17:30 INFO: coco:, 28700/50000, train/total_loss: 2.0613 (2.0663), train/caption_cross_entropy: 2.0613 (2.0663), train/caption_bleu4: 0.2609 (0.2658), val/total_loss: 2.2328, val/caption_cross_entropy: 2.2328, val/caption_bleu4: 0.2468, max mem: 12591.0, lr: 0.0001, time: 04m 30s 416ms, eta: 16h 18m 13s 064ms 2020-03-06T03:21:58 INFO: coco:, 28800/50000, train/total_loss: 2.0759 (2.0662), train/caption_cross_entropy: 2.0759 (2.0662), train/caption_bleu4: 0.2656 (0.2658), val/total_loss: 2.2418, val/caption_cross_entropy: 2.2418, val/caption_bleu4: 0.2467, max mem: 12591.0, lr: 0.0001, time: 04m 26s 064ms, eta: 15h 57m 57s 416ms 2020-03-06T03:26:29 INFO: coco:, 28900/50000, train/total_loss: 2.0224 (2.0659), train/caption_cross_entropy: 2.0224 (2.0659), train/caption_bleu4: 0.2708 (0.2659), val/total_loss: 2.3112, val/caption_cross_entropy: 2.3112, val/caption_bleu4: 0.2063, max mem: 12591.0, lr: 0.0001, time: 04m 34s 971ms, eta: 16h 25m 21s 251ms 2020-03-06T03:30:52 INFO: coco:, 29000/50000, train/total_loss: 2.0448 (2.0657), train/caption_cross_entropy: 2.0448 (2.0657), train/caption_bleu4: 0.2660 (0.2659), val/total_loss: 2.4859, val/caption_cross_entropy: 2.4859, val/caption_bleu4: 0.2089, max mem: 12591.0, lr: 0.0001, time: 04m 23s 546ms, eta: 15h 39m 56s 339ms 2020-03-06T03:30:52 INFO: Evaluation time. Running on full validation set... 
2020-03-06T03:31:13 INFO: coco: full val:, 29000/50000, val/total_loss: 2.3132, val/caption_cross_entropy: 2.3132, val/caption_bleu4: 0.2289, validation time: 44m 48s 790ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T03:35:39 INFO: coco:, 29100/50000, train/total_loss: 2.0604 (2.0656), train/caption_cross_entropy: 2.0604 (2.0656), train/caption_bleu4: 0.2604 (0.2659), val/total_loss: 2.3908, val/caption_cross_entropy: 2.3908, val/caption_bleu4: 0.2426, max mem: 12591.0, lr: 0.0001, time: 04m 49s 275ms, eta: 17h 06m 47s 362ms 2020-03-06T03:40:02 INFO: coco:, 29200/50000, train/total_loss: 2.0727 (2.0655), train/caption_cross_entropy: 2.0727 (2.0655), train/caption_bleu4: 0.2692 (0.2659), val/total_loss: 2.3610, val/caption_cross_entropy: 2.3610, val/caption_bleu4: 0.2254, max mem: 12591.0, lr: 0.0001, time: 04m 22s 864ms, eta: 15h 28m 34s 617ms 2020-03-06T03:44:29 INFO: coco:, 29300/50000, train/total_loss: 2.0791 (2.0654), train/caption_cross_entropy: 2.0791 (2.0654), train/caption_bleu4: 0.2640 (0.2659), val/total_loss: 2.2878, val/caption_cross_entropy: 2.2878, val/caption_bleu4: 0.2256, max mem: 12591.0, lr: 0.0001, time: 04m 21s 062ms, eta: 15h 17m 46s 670ms 2020-03-06T03:48:55 INFO: coco:, 29400/50000, train/total_loss: 2.0566 (2.0653), train/caption_cross_entropy: 2.0566 (2.0653), train/caption_bleu4: 0.2658 (0.2659), val/total_loss: 2.2468, val/caption_cross_entropy: 2.2468, val/caption_bleu4: 0.2787, max mem: 12698.0, lr: 0.0001, time: 04m 27s 565ms, eta: 15h 36m 05s 679ms 2020-03-06T03:53:23 INFO: coco:, 29500/50000, train/total_loss: 2.0627 (2.0651), train/caption_cross_entropy: 2.0627 (2.0651), train/caption_bleu4: 0.2699 (0.2660), val/total_loss: 2.2228, val/caption_cross_entropy: 2.2228, val/caption_bleu4: 0.2309, max mem: 12698.0, lr: 0.0001, time: 04m 25s 934ms, eta: 15h 25m 52s 293ms 2020-03-06T03:57:48 INFO: coco:, 29600/50000, train/total_loss: 2.0629 (2.0649), train/caption_cross_entropy: 2.0629 (2.0649), 
train/caption_bleu4: 0.2662 (0.2660), val/total_loss: 2.3599, val/caption_cross_entropy: 2.3599, val/caption_bleu4: 0.2201, max mem: 12698.0, lr: 0.0001, time: 04m 25s 581ms, eta: 15h 20m 07s 995ms 2020-03-06T04:02:17 INFO: coco:, 29700/50000, train/total_loss: 2.0629 (2.0647), train/caption_cross_entropy: 2.0629 (2.0647), train/caption_bleu4: 0.2625 (0.2660), val/total_loss: 2.2739, val/caption_cross_entropy: 2.2739, val/caption_bleu4: 0.2278, max mem: 12698.0, lr: 0.0001, time: 04m 29s 345ms, eta: 15h 28m 35s 998ms 2020-03-06T04:06:45 INFO: coco:, 29800/50000, train/total_loss: 2.0500 (2.0645), train/caption_cross_entropy: 2.0500 (2.0645), train/caption_bleu4: 0.2649 (0.2660), val/total_loss: 2.3912, val/caption_cross_entropy: 2.3912, val/caption_bleu4: 0.2090, max mem: 12698.0, lr: 0.0001, time: 04m 26s 883ms, eta: 15h 15m 34s 699ms 2020-03-06T04:11:12 INFO: coco:, 29900/50000, train/total_loss: 2.0528 (2.0643), train/caption_cross_entropy: 2.0528 (2.0643), train/caption_bleu4: 0.2721 (0.2661), val/total_loss: 2.3727, val/caption_cross_entropy: 2.3727, val/caption_bleu4: 0.2120, max mem: 12698.0, lr: 0.0001, time: 04m 27s 658ms, eta: 15h 13m 41s 456ms 2020-03-06T04:15:40 INFO: coco:, 30000/50000, train/total_loss: 2.0253 (2.0641), train/caption_cross_entropy: 2.0253 (2.0641), train/caption_bleu4: 0.2751 (0.2661), val/total_loss: 2.3611, val/caption_cross_entropy: 2.3611, val/caption_bleu4: 0.2208, max mem: 12698.0, lr: 0.0001, time: 04m 27s 498ms, eta: 15h 08m 36s 121ms 2020-03-06T04:15:40 INFO: Evaluation time. Running on full validation set... 
2020-03-06T04:16:02 INFO: coco: full val:, 30000/50000, val/total_loss: 2.3090, val/caption_cross_entropy: 2.3090, val/caption_bleu4: 0.2286, validation time: 44m 48s 449ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T04:20:25 INFO: coco:, 30100/50000, train/total_loss: 2.0540 (2.0637), train/caption_cross_entropy: 2.0540 (2.0637), train/caption_bleu4: 0.2687 (0.2661), val/total_loss: 2.3554, val/caption_cross_entropy: 2.3554, val/caption_bleu4: 0.2061, max mem: 12698.0, lr: 0.0001, time: 04m 51s 560ms, eta: 16h 25m 23s 003ms 2020-03-06T04:24:49 INFO: coco:, 30200/50000, train/total_loss: 2.0637 (2.0636), train/caption_cross_entropy: 2.0637 (2.0636), train/caption_bleu4: 0.2627 (0.2662), val/total_loss: 2.5368, val/caption_cross_entropy: 2.5368, val/caption_bleu4: 0.1955, max mem: 12698.0, lr: 0.0001, time: 04m 24s 376ms, eta: 14h 49m 01s 106ms 2020-03-06T04:29:20 INFO: coco:, 30300/50000, train/total_loss: 2.0493 (2.0634), train/caption_cross_entropy: 2.0493 (2.0634), train/caption_bleu4: 0.2636 (0.2662), val/total_loss: 2.3715, val/caption_cross_entropy: 2.3715, val/caption_bleu4: 0.2392, max mem: 12698.0, lr: 0.0001, time: 04m 22s 672ms, eta: 14h 38m 49s 607ms 2020-03-06T04:33:49 INFO: coco:, 30400/50000, train/total_loss: 2.0393 (2.0633), train/caption_cross_entropy: 2.0393 (2.0633), train/caption_bleu4: 0.2645 (0.2663), val/total_loss: 2.3135, val/caption_cross_entropy: 2.3135, val/caption_bleu4: 0.2309, max mem: 12698.0, lr: 0.0001, time: 04m 32s 293ms, eta: 15h 06m 23s 618ms 2020-03-06T04:38:24 INFO: coco:, 30500/50000, train/total_loss: 2.0142 (2.0630), train/caption_cross_entropy: 2.0142 (2.0630), train/caption_bleu4: 0.2721 (0.2663), val/total_loss: 2.3055, val/caption_cross_entropy: 2.3055, val/caption_bleu4: 0.2543, max mem: 12698.0, lr: 0.0001, time: 04m 37s 740ms, eta: 15h 19m 48s 335ms 2020-03-06T04:42:55 INFO: coco:, 30600/50000, train/total_loss: 2.0503 (2.0629), train/caption_cross_entropy: 2.0503 (2.0629), 
train/caption_bleu4: 0.2681 (0.2663), val/total_loss: 2.3155, val/caption_cross_entropy: 2.3155, val/caption_bleu4: 0.2302, max mem: 12698.0, lr: 0.0001, time: 04m 29s 831ms, eta: 14h 49m 01s 853ms 2020-03-06T04:47:24 INFO: coco:, 30700/50000, train/total_loss: 2.0432 (2.0627), train/caption_cross_entropy: 2.0432 (2.0627), train/caption_bleu4: 0.2629 (0.2663), val/total_loss: 2.2670, val/caption_cross_entropy: 2.2670, val/caption_bleu4: 0.2553, max mem: 12698.0, lr: 0.0001, time: 04m 27s 251ms, eta: 14h 35m 59s 604ms 2020-03-06T04:51:50 INFO: coco:, 30800/50000, train/total_loss: 2.0625 (2.0625), train/caption_cross_entropy: 2.0625 (2.0625), train/caption_bleu4: 0.2683 (0.2664), val/total_loss: 2.3622, val/caption_cross_entropy: 2.3622, val/caption_bleu4: 0.2323, max mem: 12698.0, lr: 0.0001, time: 04m 25s 453ms, eta: 14h 25m 35s 438ms 2020-03-06T04:56:21 INFO: coco:, 30900/50000, train/total_loss: 2.0367 (2.0624), train/caption_cross_entropy: 2.0367 (2.0624), train/caption_bleu4: 0.2663 (0.2664), val/total_loss: 2.2573, val/caption_cross_entropy: 2.2573, val/caption_bleu4: 0.2383, max mem: 12698.0, lr: 0.0001, time: 04m 30s 378ms, eta: 14h 37m 03s 573ms 2020-03-06T05:00:50 INFO: coco:, 31000/50000, train/total_loss: 2.0652 (2.0622), train/caption_cross_entropy: 2.0652 (2.0622), train/caption_bleu4: 0.2614 (0.2664), val/total_loss: 2.2179, val/caption_cross_entropy: 2.2179, val/caption_bleu4: 0.2344, max mem: 12698.0, lr: 0.0001, time: 04m 29s 274ms, eta: 14h 28m 54s 174ms 2020-03-06T05:00:50 INFO: Evaluation time. Running on full validation set... 
2020-03-06T05:01:11 INFO: coco: full val:, 31000/50000, val/total_loss: 2.3110, val/caption_cross_entropy: 2.3110, val/caption_bleu4: 0.2293, validation time: 45m 09s 147ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T05:05:38 INFO: coco:, 31100/50000, train/total_loss: 2.0619 (2.0620), train/caption_cross_entropy: 2.0619 (2.0620), train/caption_bleu4: 0.2724 (0.2665), val/total_loss: 2.2657, val/caption_cross_entropy: 2.2657, val/caption_bleu4: 0.2336, max mem: 12698.0, lr: 0.0001, time: 04m 54s 307ms, eta: 15h 44m 40s 890ms 2020-03-06T05:10:02 INFO: coco:, 31200/50000, train/total_loss: 2.0636 (2.0618), train/caption_cross_entropy: 2.0636 (2.0618), train/caption_bleu4: 0.2619 (0.2664), val/total_loss: 2.4006, val/caption_cross_entropy: 2.4006, val/caption_bleu4: 0.2152, max mem: 12698.0, lr: 0.0001, time: 04m 22s 967ms, eta: 13h 59m 37s 139ms 2020-03-06T05:14:31 INFO: coco:, 31300/50000, train/total_loss: 2.0374 (2.0616), train/caption_cross_entropy: 2.0374 (2.0616), train/caption_bleu4: 0.2686 (0.2665), val/total_loss: 2.3432, val/caption_cross_entropy: 2.3432, val/caption_bleu4: 0.2169, max mem: 12698.0, lr: 0.0001, time: 04m 22s 374ms, eta: 13h 53m 16s 199ms 2020-03-06T05:18:59 INFO: coco:, 31400/50000, train/total_loss: 2.0740 (2.0615), train/caption_cross_entropy: 2.0740 (2.0615), train/caption_bleu4: 0.2656 (0.2665), val/total_loss: 2.3268, val/caption_cross_entropy: 2.3268, val/caption_bleu4: 0.2300, max mem: 12698.0, lr: 0.0001, time: 04m 30s 174ms, eta: 14h 13m 27s 246ms 2020-03-06T05:23:30 INFO: coco:, 31500/50000, train/total_loss: 2.0532 (2.0614), train/caption_cross_entropy: 2.0532 (2.0614), train/caption_bleu4: 0.2670 (0.2665), val/total_loss: 2.3835, val/caption_cross_entropy: 2.3835, val/caption_bleu4: 0.2362, max mem: 12698.0, lr: 0.0001, time: 04m 30s 991ms, eta: 14h 11m 25s 880ms 2020-03-06T05:28:02 INFO: coco:, 31600/50000, train/total_loss: 2.0529 (2.0612), train/caption_cross_entropy: 2.0529 (2.0612), 
train/caption_bleu4: 0.2652 (0.2666), val/total_loss: 2.3463, val/caption_cross_entropy: 2.3463, val/caption_bleu4: 0.2428, max mem: 12698.0, lr: 0.0001, time: 04m 29s 523ms, eta: 14h 02m 14s 538ms 2020-03-06T05:32:35 INFO: coco:, 31700/50000, train/total_loss: 2.0414 (2.0611), train/caption_cross_entropy: 2.0414 (2.0611), train/caption_bleu4: 0.2627 (0.2666), val/total_loss: 2.4478, val/caption_cross_entropy: 2.4478, val/caption_bleu4: 0.2109, max mem: 12698.0, lr: 0.0001, time: 04m 34s 384ms, eta: 14h 12m 46s 325ms 2020-03-06T05:37:08 INFO: coco:, 31800/50000, train/total_loss: 2.0749 (2.0612), train/caption_cross_entropy: 2.0749 (2.0612), train/caption_bleu4: 0.2560 (0.2666), val/total_loss: 2.3400, val/caption_cross_entropy: 2.3400, val/caption_bleu4: 0.2461, max mem: 12698.0, lr: 0.0001, time: 04m 32s 570ms, eta: 14h 02m 30s 366ms 2020-03-06T05:41:42 INFO: coco:, 31900/50000, train/total_loss: 2.0371 (2.0610), train/caption_cross_entropy: 2.0371 (2.0610), train/caption_bleu4: 0.2534 (0.2666), val/total_loss: 2.2167, val/caption_cross_entropy: 2.2167, val/caption_bleu4: 0.2407, max mem: 12698.0, lr: 0.0001, time: 04m 33s 309ms, eta: 14h 08s 862ms 2020-03-06T05:46:08 INFO: coco:, 32000/50000, train/total_loss: 2.0333 (2.0609), train/caption_cross_entropy: 2.0333 (2.0609), train/caption_bleu4: 0.2684 (0.2666), val/total_loss: 2.2568, val/caption_cross_entropy: 2.2568, val/caption_bleu4: 0.2370, max mem: 12698.0, lr: 0.0001, time: 04m 26s 676ms, eta: 13h 35m 13s 784ms 2020-03-06T05:46:08 INFO: Evaluation time. Running on full validation set... 
2020-03-06T05:46:29 INFO: coco: full val:, 32000/50000, val/total_loss: 2.3110, val/caption_cross_entropy: 2.3110, val/caption_bleu4: 0.2288, validation time: 45m 18s 242ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T05:50:59 INFO: coco:, 32100/50000, train/total_loss: 2.0291 (2.0608), train/caption_cross_entropy: 2.0291 (2.0608), train/caption_bleu4: 0.2659 (0.2666), val/total_loss: 2.4073, val/caption_cross_entropy: 2.4073, val/caption_bleu4: 0.2407, max mem: 12698.0, lr: 0.0001, time: 04m 57s 567ms, eta: 15h 04m 36s 574ms 2020-03-06T05:55:20 INFO: coco:, 32200/50000, train/total_loss: 2.0226 (2.0606), train/caption_cross_entropy: 2.0226 (2.0606), train/caption_bleu4: 0.2647 (0.2667), val/total_loss: 2.4364, val/caption_cross_entropy: 2.4364, val/caption_bleu4: 0.2300, max mem: 12698.0, lr: 0.0001, time: 04m 19s 601ms, eta: 13h 04m 47s 078ms 2020-03-06T05:59:46 INFO: coco:, 32300/50000, train/total_loss: 2.0185 (2.0605), train/caption_cross_entropy: 2.0185 (2.0605), train/caption_bleu4: 0.2714 (0.2667), val/total_loss: 2.3420, val/caption_cross_entropy: 2.3420, val/caption_bleu4: 0.2332, max mem: 12698.0, lr: 0.0001, time: 04m 22s 097ms, eta: 13h 07m 52s 647ms 2020-03-06T06:04:12 INFO: coco:, 32400/50000, train/total_loss: 2.0551 (2.0604), train/caption_cross_entropy: 2.0551 (2.0604), train/caption_bleu4: 0.2643 (0.2667), val/total_loss: 2.4056, val/caption_cross_entropy: 2.4056, val/caption_bleu4: 0.2016, max mem: 12698.0, lr: 0.0001, time: 04m 27s 024ms, eta: 13h 18m 09s 163ms 2020-03-06T06:08:43 INFO: coco:, 32500/50000, train/total_loss: 2.0328 (2.0602), train/caption_cross_entropy: 2.0328 (2.0602), train/caption_bleu4: 0.2668 (0.2667), val/total_loss: 2.2797, val/caption_cross_entropy: 2.2797, val/caption_bleu4: 0.2449, max mem: 12698.0, lr: 0.0001, time: 04m 30s 121ms, eta: 13h 22m 49s 460ms 2020-03-06T06:13:11 INFO: coco:, 32600/50000, train/total_loss: 2.0331 (2.0602), train/caption_cross_entropy: 2.0331 (2.0602), 
train/caption_bleu4: 0.2650 (0.2667), val/total_loss: 2.3075, val/caption_cross_entropy: 2.3075, val/caption_bleu4: 0.2286, max mem: 12698.0, lr: 0.0001, time: 04m 25s 760ms, eta: 13h 05m 20s 883ms 2020-03-06T06:17:40 INFO: coco:, 32700/50000, train/total_loss: 2.0297 (2.0600), train/caption_cross_entropy: 2.0297 (2.0600), train/caption_bleu4: 0.2736 (0.2667), val/total_loss: 2.2617, val/caption_cross_entropy: 2.2617, val/caption_bleu4: 0.2597, max mem: 12698.0, lr: 0.0001, time: 04m 30s 556ms, eta: 13h 14m 55s 677ms 2020-03-06T06:22:09 INFO: coco:, 32800/50000, train/total_loss: 2.0288 (2.0599), train/caption_cross_entropy: 2.0288 (2.0599), train/caption_bleu4: 0.2658 (0.2668), val/total_loss: 2.2108, val/caption_cross_entropy: 2.2108, val/caption_bleu4: 0.2426, max mem: 12698.0, lr: 0.0001, time: 04m 28s 741ms, eta: 13h 05m 01s 749ms 2020-03-06T06:26:34 INFO: coco:, 32900/50000, train/total_loss: 2.0289 (2.0597), train/caption_cross_entropy: 2.0289 (2.0597), train/caption_bleu4: 0.2724 (0.2668), val/total_loss: 2.3051, val/caption_cross_entropy: 2.3051, val/caption_bleu4: 0.2337, max mem: 12698.0, lr: 0.0001, time: 04m 24s 448ms, eta: 12h 47m 59s 909ms 2020-03-06T06:31:02 INFO: coco:, 33000/50000, train/total_loss: 2.0456 (2.0596), train/caption_cross_entropy: 2.0456 (2.0596), train/caption_bleu4: 0.2643 (0.2668), val/total_loss: 2.3070, val/caption_cross_entropy: 2.3070, val/caption_bleu4: 0.2386, max mem: 12698.0, lr: 0.0001, time: 04m 28s 529ms, eta: 12h 55m 17s 428ms 2020-03-06T06:31:02 INFO: Evaluation time. Running on full validation set... 
2020-03-06T06:31:23 INFO: coco: full val:, 33000/50000, val/total_loss: 2.3064, val/caption_cross_entropy: 2.3064, val/caption_bleu4: 0.2302, validation time: 44m 54s 137ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T06:35:44 INFO: coco:, 33100/50000, train/total_loss: 2.0635 (2.0594), train/caption_cross_entropy: 2.0635 (2.0594), train/caption_bleu4: 0.2660 (0.2668), val/total_loss: 2.2360, val/caption_cross_entropy: 2.2360, val/caption_bleu4: 0.2288, max mem: 12698.0, lr: 0.0001, time: 04m 48s 491ms, eta: 13h 48m 01s 344ms 2020-03-06T06:40:09 INFO: coco:, 33200/50000, train/total_loss: 2.0511 (2.0592), train/caption_cross_entropy: 2.0511 (2.0592), train/caption_bleu4: 0.2658 (0.2668), val/total_loss: 2.3099, val/caption_cross_entropy: 2.3099, val/caption_bleu4: 0.2279, max mem: 12698.0, lr: 0.0001, time: 04m 24s 442ms, eta: 12h 34m 30s 456ms 2020-03-06T06:44:43 INFO: coco:, 33300/50000, train/total_loss: 2.0553 (2.0592), train/caption_cross_entropy: 2.0553 (2.0592), train/caption_bleu4: 0.2683 (0.2668), val/total_loss: 2.2793, val/caption_cross_entropy: 2.2793, val/caption_bleu4: 0.2327, max mem: 12698.0, lr: 0.0001, time: 04m 26s 444ms, eta: 12h 35m 41s 663ms 2020-03-06T06:49:17 INFO: coco:, 33400/50000, train/total_loss: 2.0521 (2.0591), train/caption_cross_entropy: 2.0521 (2.0591), train/caption_bleu4: 0.2616 (0.2669), val/total_loss: 2.3072, val/caption_cross_entropy: 2.3072, val/caption_bleu4: 0.2264, max mem: 12698.0, lr: 0.0001, time: 04m 36s 619ms, eta: 12h 59m 51s 240ms 2020-03-06T06:53:44 INFO: coco:, 33500/50000, train/total_loss: 2.0293 (2.0591), train/caption_cross_entropy: 2.0293 (2.0591), train/caption_bleu4: 0.2590 (0.2669), val/total_loss: 2.3339, val/caption_cross_entropy: 2.3339, val/caption_bleu4: 0.2349, max mem: 12698.0, lr: 0.0001, time: 04m 26s 082ms, eta: 12h 25m 37s 766ms 2020-03-06T06:58:15 INFO: coco:, 33600/50000, train/total_loss: 2.0201 (2.0590), train/caption_cross_entropy: 2.0201 (2.0590), 
train/caption_bleu4: 0.2664 (0.2669), val/total_loss: 2.2768, val/caption_cross_entropy: 2.2768, val/caption_bleu4: 0.2057, max mem: 12698.0, lr: 0.0001, time: 04m 30s 137ms, eta: 12h 32m 24s 266ms 2020-03-06T07:02:46 INFO: coco:, 33700/50000, train/total_loss: 2.0429 (2.0590), train/caption_cross_entropy: 2.0429 (2.0590), train/caption_bleu4: 0.2705 (0.2669), val/total_loss: 2.4480, val/caption_cross_entropy: 2.4480, val/caption_bleu4: 0.2283, max mem: 12698.0, lr: 0.0001, time: 04m 34s 260ms, eta: 12h 39m 13s 871ms 2020-03-06T07:07:18 INFO: coco:, 33800/50000, train/total_loss: 2.0521 (2.0588), train/caption_cross_entropy: 2.0521 (2.0588), train/caption_bleu4: 0.2681 (0.2669), val/total_loss: 2.4247, val/caption_cross_entropy: 2.4247, val/caption_bleu4: 0.2267, max mem: 12698.0, lr: 0.0001, time: 04m 34s 232ms, eta: 12h 34m 29s 835ms 2020-03-06T07:11:45 INFO: coco:, 33900/50000, train/total_loss: 2.0283 (2.0587), train/caption_cross_entropy: 2.0283 (2.0587), train/caption_bleu4: 0.2664 (0.2670), val/total_loss: 2.3275, val/caption_cross_entropy: 2.3275, val/caption_bleu4: 0.2347, max mem: 12698.0, lr: 0.0001, time: 04m 23s 871ms, eta: 12h 01m 30s 427ms 2020-03-06T07:16:11 INFO: coco:, 34000/50000, train/total_loss: 2.0605 (2.0586), train/caption_cross_entropy: 2.0605 (2.0586), train/caption_bleu4: 0.2639 (0.2670), val/total_loss: 2.4378, val/caption_cross_entropy: 2.4378, val/caption_bleu4: 0.2258, max mem: 12698.0, lr: 0.0001, time: 04m 25s 869ms, eta: 12h 02m 27s 295ms 2020-03-06T07:16:11 INFO: Evaluation time. Running on full validation set... 
2020-03-06T07:16:33 INFO: coco: full val:, 34000/50000, val/total_loss: 2.3090, val/caption_cross_entropy: 2.3090, val/caption_bleu4: 0.2287, validation time: 45m 09s 211ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T07:21:03 INFO: coco:, 34100/50000, train/total_loss: 2.0502 (2.0585), train/caption_cross_entropy: 2.0502 (2.0585), train/caption_bleu4: 0.2617 (0.2670), val/total_loss: 2.0911, val/caption_cross_entropy: 2.0911, val/caption_bleu4: 0.2521, max mem: 12698.0, lr: 0.0001, time: 04m 56s 830ms, eta: 13h 21m 32s 771ms 2020-03-06T07:25:29 INFO: coco:, 34200/50000, train/total_loss: 2.0268 (2.0585), train/caption_cross_entropy: 2.0268 (2.0585), train/caption_bleu4: 0.2697 (0.2670), val/total_loss: 2.3132, val/caption_cross_entropy: 2.3132, val/caption_bleu4: 0.2288, max mem: 12698.0, lr: 0.0001, time: 04m 24s 655ms, eta: 11h 50m 10s 094ms 2020-03-06T07:29:58 INFO: coco:, 34300/50000, train/total_loss: 2.0115 (2.0584), train/caption_cross_entropy: 2.0115 (2.0584), train/caption_bleu4: 0.2705 (0.2670), val/total_loss: 2.2071, val/caption_cross_entropy: 2.2071, val/caption_bleu4: 0.2502, max mem: 12698.0, lr: 0.0001, time: 04m 24s 033ms, eta: 11h 44m 915ms 2020-03-06T07:34:23 INFO: coco:, 34400/50000, train/total_loss: 2.0328 (2.0583), train/caption_cross_entropy: 2.0328 (2.0583), train/caption_bleu4: 0.2811 (0.2670), val/total_loss: 2.2062, val/caption_cross_entropy: 2.2062, val/caption_bleu4: 0.2418, max mem: 12698.0, lr: 0.0001, time: 04m 30s 551ms, eta: 11h 56m 47s 898ms 2020-03-06T07:38:49 INFO: coco:, 34500/50000, train/total_loss: 2.0538 (2.0582), train/caption_cross_entropy: 2.0538 (2.0582), train/caption_bleu4: 0.2606 (0.2670), val/total_loss: 2.2561, val/caption_cross_entropy: 2.2561, val/caption_bleu4: 0.2192, max mem: 12698.0, lr: 0.0001, time: 04m 22s 591ms, eta: 11h 31m 14s 984ms 2020-03-06T07:43:22 INFO: coco:, 34600/50000, train/total_loss: 1.9960 (2.0580), train/caption_cross_entropy: 1.9960 (2.0580), train/caption_bleu4: 
0.2712 (0.2671), val/total_loss: 2.3194, val/caption_cross_entropy: 2.3194, val/caption_bleu4: 0.2235, max mem: 12698.0, lr: 0.0001, time: 04m 31s 641ms, eta: 11h 50m 27s 590ms 2020-03-06T07:47:48 INFO: coco:, 34700/50000, train/total_loss: 2.0663 (2.0579), train/caption_cross_entropy: 2.0663 (2.0579), train/caption_bleu4: 0.2720 (0.2671), val/total_loss: 2.3229, val/caption_cross_entropy: 2.3229, val/caption_bleu4: 0.2033, max mem: 12698.0, lr: 0.0001, time: 04m 26s 350ms, eta: 11h 32m 05s 921ms 2020-03-06T07:52:15 INFO: coco:, 34800/50000, train/total_loss: 2.0808 (2.0578), train/caption_cross_entropy: 2.0808 (2.0578), train/caption_bleu4: 0.2634 (0.2671), val/total_loss: 2.2588, val/caption_cross_entropy: 2.2588, val/caption_bleu4: 0.2206, max mem: 12698.0, lr: 0.0001, time: 04m 28s 166ms, eta: 11h 32m 15s 753ms 2020-03-06T07:56:46 INFO: coco:, 34900/50000, train/total_loss: 2.0555 (2.0577), train/caption_cross_entropy: 2.0555 (2.0577), train/caption_bleu4: 0.2733 (0.2671), val/total_loss: 2.2337, val/caption_cross_entropy: 2.2337, val/caption_bleu4: 0.2389, max mem: 12698.0, lr: 0.0001, time: 04m 28s 278ms, eta: 11h 27m 59s 772ms 2020-03-06T08:01:16 INFO: coco:, 35000/50000, train/total_loss: 2.0430 (2.0576), train/caption_cross_entropy: 2.0430 (2.0576), train/caption_bleu4: 0.2734 (0.2671), val/total_loss: 2.2981, val/caption_cross_entropy: 2.2981, val/caption_bleu4: 0.2343, max mem: 12698.0, lr: 0.00001, time: 04m 31s 583ms, eta: 11h 31m 51s 572ms 2020-03-06T08:01:16 INFO: Evaluation time. Running on full validation set... 
2020-03-06T08:01:38 INFO: coco: full val:, 35000/50000, val/total_loss: 2.3103, val/caption_cross_entropy: 2.3103, val/caption_bleu4: 0.2281, validation time: 45m 05s 271ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T08:06:11 INFO: coco:, 35100/50000, train/total_loss: 2.0601 (2.0576), train/caption_cross_entropy: 2.0601 (2.0576), train/caption_bleu4: 0.2625 (0.2671), val/total_loss: 2.3899, val/caption_cross_entropy: 2.3899, val/caption_bleu4: 0.2350, max mem: 12698.0, lr: 0.00001, time: 04m 59s 644ms, eta: 12h 38m 15s 369ms 2020-03-06T08:10:32 INFO: coco:, 35200/50000, train/total_loss: 2.0149 (2.0574), train/caption_cross_entropy: 2.0149 (2.0574), train/caption_bleu4: 0.2712 (0.2672), val/total_loss: 2.2896, val/caption_cross_entropy: 2.2896, val/caption_bleu4: 0.2239, max mem: 12698.0, lr: 0.00001, time: 04m 20s 212ms, eta: 10h 54m 03s 201ms 2020-03-06T08:15:00 INFO: coco:, 35300/50000, train/total_loss: 2.0462 (2.0573), train/caption_cross_entropy: 2.0462 (2.0573), train/caption_bleu4: 0.2724 (0.2672), val/total_loss: 2.2071, val/caption_cross_entropy: 2.2071, val/caption_bleu4: 0.2334, max mem: 12698.0, lr: 0.00001, time: 04m 22s 875ms, eta: 10h 56m 16s 973ms 2020-03-06T08:19:28 INFO: coco:, 35400/50000, train/total_loss: 2.0579 (2.0573), train/caption_cross_entropy: 2.0579 (2.0573), train/caption_bleu4: 0.2592 (0.2672), val/total_loss: 2.2985, val/caption_cross_entropy: 2.2985, val/caption_bleu4: 0.2293, max mem: 12698.0, lr: 0.00001, time: 04m 31s 216ms, eta: 11h 12m 29s 963ms 2020-03-06T08:23:55 INFO: coco:, 35500/50000, train/total_loss: 2.0142 (2.0573), train/caption_cross_entropy: 2.0142 (2.0573), train/caption_bleu4: 0.2730 (0.2672), val/total_loss: 2.2559, val/caption_cross_entropy: 2.2559, val/caption_bleu4: 0.2194, max mem: 12698.0, lr: 0.00001, time: 04m 26s 716ms, eta: 10h 56m 48s 738ms 2020-03-06T08:28:25 INFO: coco:, 35600/50000, train/total_loss: 2.0536 (2.0571), train/caption_cross_entropy: 2.0536 (2.0571), 
train/caption_bleu4: 0.2637 (0.2672), val/total_loss: 2.2544, val/caption_cross_entropy: 2.2544, val/caption_bleu4: 0.2546, max mem: 12698.0, lr: 0.00001, time: 04m 28s 194ms, eta: 10h 55m 53s 753ms 2020-03-06T08:32:57 INFO: coco:, 35700/50000, train/total_loss: 2.0360 (2.0571), train/caption_cross_entropy: 2.0360 (2.0571), train/caption_bleu4: 0.2617 (0.2672), val/total_loss: 2.2128, val/caption_cross_entropy: 2.2128, val/caption_bleu4: 0.2279, max mem: 12698.0, lr: 0.00001, time: 04m 32s 278ms, eta: 11h 01m 15s 608ms 2020-03-06T08:37:23 INFO: coco:, 35800/50000, train/total_loss: 2.0200 (2.0569), train/caption_cross_entropy: 2.0200 (2.0569), train/caption_bleu4: 0.2738 (0.2672), val/total_loss: 2.3775, val/caption_cross_entropy: 2.3775, val/caption_bleu4: 0.2271, max mem: 12698.0, lr: 0.00001, time: 04m 25s 536ms, eta: 10h 40m 22s 579ms 2020-03-06T08:41:50 INFO: coco:, 35900/50000, train/total_loss: 2.0379 (2.0568), train/caption_cross_entropy: 2.0379 (2.0568), train/caption_bleu4: 0.2783 (0.2673), val/total_loss: 2.3802, val/caption_cross_entropy: 2.3802, val/caption_bleu4: 0.2252, max mem: 12698.0, lr: 0.00001, time: 04m 26s 559ms, eta: 10h 38m 19s 057ms 2020-03-06T08:46:23 INFO: coco:, 36000/50000, train/total_loss: 2.0458 (2.0567), train/caption_cross_entropy: 2.0458 (2.0567), train/caption_bleu4: 0.2674 (0.2673), val/total_loss: 2.1481, val/caption_cross_entropy: 2.1481, val/caption_bleu4: 0.2447, max mem: 12698.0, lr: 0.00001, time: 04m 34s 067ms, eta: 10h 51m 38s 506ms 2020-03-06T08:46:23 INFO: Evaluation time. Running on full validation set... 
2020-03-06T08:46:45 INFO: coco: full val:, 36000/50000, val/total_loss: 2.3077, val/caption_cross_entropy: 2.3077, val/caption_bleu4: 0.2286, validation time: 45m 07s 287ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T08:51:09 INFO: coco:, 36100/50000, train/total_loss: 2.0608 (2.0566), train/caption_cross_entropy: 2.0608 (2.0566), train/caption_bleu4: 0.2638 (0.2673), val/total_loss: 2.3247, val/caption_cross_entropy: 2.3247, val/caption_bleu4: 0.2352, max mem: 12698.0, lr: 0.00001, time: 04m 50s 068ms, eta: 11h 24m 45s 538ms 2020-03-06T08:55:33 INFO: coco:, 36200/50000, train/total_loss: 2.0654 (2.0565), train/caption_cross_entropy: 2.0654 (2.0565), train/caption_bleu4: 0.2696 (0.2673), val/total_loss: 2.3483, val/caption_cross_entropy: 2.3483, val/caption_bleu4: 0.2327, max mem: 12698.0, lr: 0.00001, time: 04m 24s 454ms, eta: 10h 19m 48s 126ms 2020-03-06T09:00:04 INFO: coco:, 36300/50000, train/total_loss: 2.0372 (2.0564), train/caption_cross_entropy: 2.0372 (2.0564), train/caption_bleu4: 0.2629 (0.2673), val/total_loss: 2.2913, val/caption_cross_entropy: 2.2913, val/caption_bleu4: 0.2363, max mem: 12698.0, lr: 0.00001, time: 04m 24s 009ms, eta: 10h 14m 16s 548ms 2020-03-06T09:04:33 INFO: coco:, 36400/50000, train/total_loss: 2.0318 (2.0563), train/caption_cross_entropy: 2.0318 (2.0563), train/caption_bleu4: 0.2592 (0.2673), val/total_loss: 2.1076, val/caption_cross_entropy: 2.1076, val/caption_bleu4: 0.2324, max mem: 12698.0, lr: 0.00001, time: 04m 30s 688ms, eta: 10h 25m 13s 096ms 2020-03-06T09:08:57 INFO: coco:, 36500/50000, train/total_loss: 2.0170 (2.0562), train/caption_cross_entropy: 2.0170 (2.0562), train/caption_bleu4: 0.2719 (0.2673), val/total_loss: 2.2360, val/caption_cross_entropy: 2.2360, val/caption_bleu4: 0.2371, max mem: 12698.0, lr: 0.00001, time: 04m 23s 938ms, eta: 10h 05m 08s 690ms 2020-03-06T09:13:33 INFO: coco:, 36600/50000, train/total_loss: 2.0321 (2.0561), train/caption_cross_entropy: 2.0321 (2.0561), 
train/caption_bleu4: 0.2676 (0.2674), val/total_loss: 2.2622, val/caption_cross_entropy: 2.2622, val/caption_bleu4: 0.2384, max mem: 12698.0, lr: 0.00001, time: 04m 33s 219ms, eta: 10h 21m 47s 030ms 2020-03-06T09:18:06 INFO: coco:, 36700/50000, train/total_loss: 2.0400 (2.0560), train/caption_cross_entropy: 2.0400 (2.0560), train/caption_bleu4: 0.2589 (0.2674), val/total_loss: 2.3090, val/caption_cross_entropy: 2.3090, val/caption_bleu4: 0.2496, max mem: 12698.0, lr: 0.00001, time: 04m 34s 753ms, eta: 10h 20m 36s 459ms 2020-03-06T09:22:41 INFO: coco:, 36800/50000, train/total_loss: 2.0635 (2.0560), train/caption_cross_entropy: 2.0635 (2.0560), train/caption_bleu4: 0.2671 (0.2674), val/total_loss: 2.4154, val/caption_cross_entropy: 2.4154, val/caption_bleu4: 0.1994, max mem: 12698.0, lr: 0.00001, time: 04m 35s 304ms, eta: 10h 17m 10s 640ms 2020-03-06T09:27:13 INFO: coco:, 36900/50000, train/total_loss: 2.0526 (2.0559), train/caption_cross_entropy: 2.0526 (2.0559), train/caption_bleu4: 0.2684 (0.2674), val/total_loss: 2.3086, val/caption_cross_entropy: 2.3086, val/caption_bleu4: 0.2281, max mem: 12698.0, lr: 0.00001, time: 04m 31s 496ms, eta: 10h 04m 01s 804ms 2020-03-06T09:31:47 INFO: coco:, 37000/50000, train/total_loss: 2.0499 (2.0558), train/caption_cross_entropy: 2.0499 (2.0558), train/caption_bleu4: 0.2680 (0.2674), val/total_loss: 2.2460, val/caption_cross_entropy: 2.2460, val/caption_bleu4: 0.2117, max mem: 12698.0, lr: 0.00001, time: 04m 39s 222ms, eta: 10h 16m 28s 620ms 2020-03-06T09:31:47 INFO: Evaluation time. Running on full validation set... 
2020-03-06T09:32:09 INFO: coco: full val:, 37000/50000, val/total_loss: 2.3071, val/caption_cross_entropy: 2.3071, val/caption_bleu4: 0.2294, validation time: 45m 23s 710ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T09:36:28 INFO: coco:, 37100/50000, train/total_loss: 2.0078 (2.0556), train/caption_cross_entropy: 2.0078 (2.0556), train/caption_bleu4: 0.2690 (0.2674), val/total_loss: 2.5025, val/caption_cross_entropy: 2.5025, val/caption_bleu4: 0.2053, max mem: 12698.0, lr: 0.00001, time: 04m 42s 475ms, eta: 10h 18m 51s 709ms 2020-03-06T09:40:58 INFO: coco:, 37200/50000, train/total_loss: 2.0671 (2.0556), train/caption_cross_entropy: 2.0671 (2.0556), train/caption_bleu4: 0.2727 (0.2674), val/total_loss: 2.3667, val/caption_cross_entropy: 2.3667, val/caption_bleu4: 0.2160, max mem: 12698.0, lr: 0.00001, time: 04m 28s 630ms, eta: 09h 43m 57s 974ms 2020-03-06T09:45:27 INFO: coco:, 37300/50000, train/total_loss: 2.0320 (2.0555), train/caption_cross_entropy: 2.0320 (2.0555), train/caption_bleu4: 0.2690 (0.2675), val/total_loss: 2.2737, val/caption_cross_entropy: 2.2737, val/caption_bleu4: 0.2493, max mem: 12698.0, lr: 0.00001, time: 04m 25s 248ms, eta: 09h 32m 06s 587ms 2020-03-06T09:49:59 INFO: coco:, 37400/50000, train/total_loss: 2.0310 (2.0553), train/caption_cross_entropy: 2.0310 (2.0553), train/caption_bleu4: 0.2641 (0.2675), val/total_loss: 2.2204, val/caption_cross_entropy: 2.2204, val/caption_bleu4: 0.2139, max mem: 12698.0, lr: 0.00001, time: 04m 33s 063ms, eta: 09h 44m 19s 783ms 2020-03-06T09:54:28 INFO: coco:, 37500/50000, train/total_loss: 2.0401 (2.0552), train/caption_cross_entropy: 2.0401 (2.0552), train/caption_bleu4: 0.2671 (0.2675), val/total_loss: 2.3651, val/caption_cross_entropy: 2.3651, val/caption_bleu4: 0.2382, max mem: 12698.0, lr: 0.00001, time: 04m 27s 636ms, eta: 09h 28m 10s 163ms 2020-03-06T09:59:00 INFO: coco:, 37600/50000, train/total_loss: 2.0370 (2.0552), train/caption_cross_entropy: 2.0370 (2.0552), 
train/caption_bleu4: 0.2696 (0.2675), val/total_loss: 2.3608, val/caption_cross_entropy: 2.3608, val/caption_bleu4: 0.2303, max mem: 12698.0, lr: 0.00001, time: 04m 31s 301ms, eta: 09h 31m 20s 602ms 2020-03-06T10:03:32 INFO: coco:, 37700/50000, train/total_loss: 2.0534 (2.0552), train/caption_cross_entropy: 2.0534 (2.0552), train/caption_bleu4: 0.2691 (0.2675), val/total_loss: 2.2888, val/caption_cross_entropy: 2.2888, val/caption_bleu4: 0.2328, max mem: 12698.0, lr: 0.00001, time: 04m 33s 084ms, eta: 09h 30m 27s 609ms 2020-03-06T10:08:05 INFO: coco:, 37800/50000, train/total_loss: 2.0512 (2.0551), train/caption_cross_entropy: 2.0512 (2.0551), train/caption_bleu4: 0.2685 (0.2675), val/total_loss: 2.2475, val/caption_cross_entropy: 2.2475, val/caption_bleu4: 0.2352, max mem: 12698.0, lr: 0.00001, time: 04m 32s 504ms, eta: 09h 24m 37s 250ms 2020-03-06T10:12:36 INFO: coco:, 37900/50000, train/total_loss: 2.0311 (2.0551), train/caption_cross_entropy: 2.0311 (2.0551), train/caption_bleu4: 0.2685 (0.2675), val/total_loss: 2.3697, val/caption_cross_entropy: 2.3697, val/caption_bleu4: 0.2285, max mem: 12698.0, lr: 0.00001, time: 04m 30s 378ms, eta: 09h 15m 37s 375ms 2020-03-06T10:17:03 INFO: coco:, 38000/50000, train/total_loss: 2.0306 (2.0550), train/caption_cross_entropy: 2.0306 (2.0550), train/caption_bleu4: 0.2706 (0.2676), val/total_loss: 2.3092, val/caption_cross_entropy: 2.3092, val/caption_bleu4: 0.2157, max mem: 12698.0, lr: 0.00001, time: 04m 27s 461ms, eta: 09h 05m 05s 215ms 2020-03-06T10:17:03 INFO: Evaluation time. Running on full validation set... 
2020-03-06T10:17:25 INFO: coco: full val:, 38000/50000, val/total_loss: 2.3061, val/caption_cross_entropy: 2.3061, val/caption_bleu4: 0.2297, validation time: 45m 16s 552ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T10:21:47 INFO: coco:, 38100/50000, train/total_loss: 2.0152 (2.0549), train/caption_cross_entropy: 2.0152 (2.0549), train/caption_bleu4: 0.2688 (0.2676), val/total_loss: 2.3908, val/caption_cross_entropy: 2.3908, val/caption_bleu4: 0.2161, max mem: 12698.0, lr: 0.00001, time: 04m 49s 555ms, eta: 09h 45m 11s 823ms 2020-03-06T10:26:11 INFO: coco:, 38200/50000, train/total_loss: 2.0224 (2.0548), train/caption_cross_entropy: 2.0224 (2.0548), train/caption_bleu4: 0.2683 (0.2676), val/total_loss: 2.2973, val/caption_cross_entropy: 2.2973, val/caption_bleu4: 0.2212, max mem: 12698.0, lr: 0.00001, time: 04m 23s 300ms, eta: 08h 47m 39s 774ms 2020-03-06T10:30:38 INFO: coco:, 38300/50000, train/total_loss: 2.0499 (2.0546), train/caption_cross_entropy: 2.0499 (2.0546), train/caption_bleu4: 0.2637 (0.2676), val/total_loss: 2.2131, val/caption_cross_entropy: 2.2131, val/caption_bleu4: 0.2480, max mem: 12698.0, lr: 0.00001, time: 04m 20s 835ms, eta: 08h 38m 17s 578ms 2020-03-06T10:35:12 INFO: coco:, 38400/50000, train/total_loss: 2.0370 (2.0546), train/caption_cross_entropy: 2.0370 (2.0546), train/caption_bleu4: 0.2624 (0.2676), val/total_loss: 2.3376, val/caption_cross_entropy: 2.3376, val/caption_bleu4: 0.2273, max mem: 12698.0, lr: 0.00001, time: 04m 36s 295ms, eta: 09h 04m 19s 209ms 2020-03-06T10:39:42 INFO: coco:, 38500/50000, train/total_loss: 2.0469 (2.0546), train/caption_cross_entropy: 2.0469 (2.0546), train/caption_bleu4: 0.2656 (0.2676), val/total_loss: 2.3655, val/caption_cross_entropy: 2.3655, val/caption_bleu4: 0.2193, max mem: 12698.0, lr: 0.00001, time: 04m 28s 668ms, eta: 08h 44m 43s 915ms 2020-03-06T10:44:16 INFO: coco:, 38600/50000, train/total_loss: 2.0270 (2.0545), train/caption_cross_entropy: 2.0270 (2.0545), 
train/caption_bleu4: 0.2658 (0.2676), val/total_loss: 2.3588, val/caption_cross_entropy: 2.3588, val/caption_bleu4: 0.2248, max mem: 12698.0, lr: 0.00001, time: 04m 39s 021ms, eta: 09h 12s 812ms 2020-03-06T10:48:45 INFO: coco:, 38700/50000, train/total_loss: 2.0469 (2.0545), train/caption_cross_entropy: 2.0469 (2.0545), train/caption_bleu4: 0.2695 (0.2676), val/total_loss: 2.2172, val/caption_cross_entropy: 2.2172, val/caption_bleu4: 0.2429, max mem: 12698.0, lr: 0.00001, time: 04m 27s 370ms, eta: 08h 33m 06s 881ms 2020-03-06T10:53:11 INFO: coco:, 38800/50000, train/total_loss: 2.0433 (2.0545), train/caption_cross_entropy: 2.0433 (2.0545), train/caption_bleu4: 0.2634 (0.2676), val/total_loss: 2.3337, val/caption_cross_entropy: 2.3337, val/caption_bleu4: 0.2320, max mem: 12698.0, lr: 0.00001, time: 04m 23s 821ms, eta: 08h 21m 49s 442ms 2020-03-06T10:57:37 INFO: coco:, 38900/50000, train/total_loss: 2.0374 (2.0543), train/caption_cross_entropy: 2.0374 (2.0543), train/caption_bleu4: 0.2704 (0.2676), val/total_loss: 2.3765, val/caption_cross_entropy: 2.3765, val/caption_bleu4: 0.2342, max mem: 12698.0, lr: 0.00001, time: 04m 25s 833ms, eta: 08h 21m 08s 146ms 2020-03-06T11:02:05 INFO: coco:, 39000/50000, train/total_loss: 2.0381 (2.0542), train/caption_cross_entropy: 2.0381 (2.0542), train/caption_bleu4: 0.2724 (0.2676), val/total_loss: 2.2944, val/caption_cross_entropy: 2.2944, val/caption_bleu4: 0.2098, max mem: 12698.0, lr: 0.00001, time: 04m 28s 136ms, eta: 08h 20m 55s 461ms 2020-03-06T11:02:05 INFO: Evaluation time. Running on full validation set... 
2020-03-06T11:02:27 INFO: coco: full val:, 39000/50000, val/total_loss: 2.3108, val/caption_cross_entropy: 2.3108, val/caption_bleu4: 0.2288, validation time: 45m 01s 122ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T11:06:49 INFO: coco:, 39100/50000, train/total_loss: 2.0385 (2.0541), train/caption_cross_entropy: 2.0385 (2.0541), train/caption_bleu4: 0.2658 (0.2676), val/total_loss: 2.3211, val/caption_cross_entropy: 2.3211, val/caption_bleu4: 0.2303, max mem: 12698.0, lr: 0.00001, time: 04m 49s 265ms, eta: 08h 55m 29s 053ms 2020-03-06T11:11:10 INFO: coco:, 39200/50000, train/total_loss: 2.0308 (2.0541), train/caption_cross_entropy: 2.0308 (2.0541), train/caption_bleu4: 0.2691 (0.2676), val/total_loss: 2.3574, val/caption_cross_entropy: 2.3574, val/caption_bleu4: 0.2482, max mem: 12698.0, lr: 0.00001, time: 04m 19s 695ms, eta: 07h 56m 20s 067ms 2020-03-06T11:15:37 INFO: coco:, 39300/50000, train/total_loss: 2.0544 (2.0540), train/caption_cross_entropy: 2.0544 (2.0540), train/caption_bleu4: 0.2769 (0.2676), val/total_loss: 2.3251, val/caption_cross_entropy: 2.3251, val/caption_bleu4: 0.2109, max mem: 12698.0, lr: 0.00001, time: 04m 22s 708ms, eta: 07h 57m 23s 880ms 2020-03-06T11:20:06 INFO: coco:, 39400/50000, train/total_loss: 2.0401 (2.0539), train/caption_cross_entropy: 2.0401 (2.0539), train/caption_bleu4: 0.2690 (0.2676), val/total_loss: 2.4073, val/caption_cross_entropy: 2.4073, val/caption_bleu4: 0.2194, max mem: 12698.0, lr: 0.00001, time: 04m 28s 239ms, eta: 08h 02m 53s 655ms 2020-03-06T11:24:32 INFO: coco:, 39500/50000, train/total_loss: 2.0548 (2.0539), train/caption_cross_entropy: 2.0548 (2.0539), train/caption_bleu4: 0.2616 (0.2676), val/total_loss: 2.3229, val/caption_cross_entropy: 2.3229, val/caption_bleu4: 0.2323, max mem: 12698.0, lr: 0.00001, time: 04m 27s 071ms, eta: 07h 56m 15s 341ms 2020-03-06T11:29:07 INFO: coco:, 39600/50000, train/total_loss: 2.0329 (2.0538), train/caption_cross_entropy: 2.0329 (2.0538), 
train/caption_bleu4: 0.2633 (0.2676), val/total_loss: 2.2256, val/caption_cross_entropy: 2.2256, val/caption_bleu4: 0.2566, max mem: 12698.0, lr: 0.00001, time: 04m 32s 488ms, eta: 08h 01m 17s 198ms 2020-03-06T11:33:37 INFO: coco:, 39700/50000, train/total_loss: 2.0432 (2.0538), train/caption_cross_entropy: 2.0432 (2.0538), train/caption_bleu4: 0.2687 (0.2677), val/total_loss: 2.2997, val/caption_cross_entropy: 2.2997, val/caption_bleu4: 0.2223, max mem: 12698.0, lr: 0.00001, time: 04m 31s 198ms, eta: 07h 54m 24s 161ms 2020-03-06T11:38:02 INFO: coco:, 39800/50000, train/total_loss: 2.0408 (2.0537), train/caption_cross_entropy: 2.0408 (2.0537), train/caption_bleu4: 0.2664 (0.2677), val/total_loss: 2.3734, val/caption_cross_entropy: 2.3734, val/caption_bleu4: 0.2213, max mem: 12698.0, lr: 0.00001, time: 04m 25s 850ms, eta: 07h 40m 31s 939ms 2020-03-06T11:42:28 INFO: coco:, 39900/50000, train/total_loss: 2.0502 (2.0536), train/caption_cross_entropy: 2.0502 (2.0536), train/caption_bleu4: 0.2685 (0.2677), val/total_loss: 2.3205, val/caption_cross_entropy: 2.3205, val/caption_bleu4: 0.2351, max mem: 12698.0, lr: 0.00001, time: 04m 24s 730ms, eta: 07h 34m 05s 847ms 2020-03-06T11:46:52 INFO: coco:, 40000/50000, train/total_loss: 2.0435 (2.0535), train/caption_cross_entropy: 2.0435 (2.0535), train/caption_bleu4: 0.2700 (0.2677), val/total_loss: 2.2568, val/caption_cross_entropy: 2.2568, val/caption_bleu4: 0.2369, max mem: 12698.0, lr: 0.00001, time: 04m 25s 115ms, eta: 07h 30m 15s 253ms 2020-03-06T11:46:52 INFO: Evaluation time. Running on full validation set... 
2020-03-06T11:47:14 INFO: coco: full val:, 40000/50000, val/total_loss: 2.3083, val/caption_cross_entropy: 2.3083, val/caption_bleu4: 0.2292, validation time: 44m 47s 716ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T11:51:40 INFO: coco:, 40100/50000, train/total_loss: 2.0655 (2.0535), train/caption_cross_entropy: 2.0655 (2.0535), train/caption_bleu4: 0.2601 (0.2677), val/total_loss: 2.2566, val/caption_cross_entropy: 2.2566, val/caption_bleu4: 0.2260, max mem: 12698.0, lr: 0.00001, time: 04m 53s 401ms, eta: 08h 13m 18s 601ms 2020-03-06T11:56:10 INFO: coco:, 40200/50000, train/total_loss: 2.0610 (2.0535), train/caption_cross_entropy: 2.0610 (2.0535), train/caption_bleu4: 0.2682 (0.2677), val/total_loss: 2.3702, val/caption_cross_entropy: 2.3702, val/caption_bleu4: 0.2379, max mem: 12698.0, lr: 0.00001, time: 04m 30s 026ms, eta: 07h 29m 25s 378ms 2020-03-06T12:00:35 INFO: coco:, 40300/50000, train/total_loss: 2.0356 (2.0534), train/caption_cross_entropy: 2.0356 (2.0534), train/caption_bleu4: 0.2692 (0.2678), val/total_loss: 2.3965, val/caption_cross_entropy: 2.3965, val/caption_bleu4: 0.2211, max mem: 12698.0, lr: 0.00001, time: 04m 22s 163ms, eta: 07h 11m 53s 074ms 2020-03-06T12:05:07 INFO: coco:, 40400/50000, train/total_loss: 2.0473 (2.0533), train/caption_cross_entropy: 2.0473 (2.0533), train/caption_bleu4: 0.2695 (0.2678), val/total_loss: 2.3565, val/caption_cross_entropy: 2.3565, val/caption_bleu4: 0.2236, max mem: 12698.0, lr: 0.00001, time: 04m 30s 233ms, eta: 07h 20m 35s 300ms 2020-03-06T12:09:31 INFO: coco:, 40500/50000, train/total_loss: 2.0342 (2.0533), train/caption_cross_entropy: 2.0342 (2.0533), train/caption_bleu4: 0.2720 (0.2678), val/total_loss: 2.3486, val/caption_cross_entropy: 2.3486, val/caption_bleu4: 0.2313, max mem: 12698.0, lr: 0.00001, time: 04m 24s 549ms, eta: 07h 06m 49s 688ms 2020-03-06T12:13:57 INFO: coco:, 40600/50000, train/total_loss: 2.0312 (2.0532), train/caption_cross_entropy: 2.0312 (2.0532), 
train/caption_bleu4: 0.2652 (0.2678), val/total_loss: 2.3595, val/caption_cross_entropy: 2.3595, val/caption_bleu4: 0.2079, max mem: 12698.0, lr: 0.00001, time: 04m 23s 501ms, eta: 07h 39s 792ms 2020-03-06T12:18:24 INFO: coco:, 40700/50000, train/total_loss: 2.0400 (2.0532), train/caption_cross_entropy: 2.0400 (2.0532), train/caption_bleu4: 0.2753 (0.2678), val/total_loss: 2.3075, val/caption_cross_entropy: 2.3075, val/caption_bleu4: 0.2326, max mem: 12698.0, lr: 0.00001, time: 04m 28s 505ms, eta: 07h 04m 05s 423ms 2020-03-06T12:22:49 INFO: coco:, 40800/50000, train/total_loss: 2.0508 (2.0531), train/caption_cross_entropy: 2.0508 (2.0531), train/caption_bleu4: 0.2671 (0.2678), val/total_loss: 2.1436, val/caption_cross_entropy: 2.1436, val/caption_bleu4: 0.2703, max mem: 12698.0, lr: 0.00001, time: 04m 25s 629ms, eta: 06h 55m 02s 247ms 2020-03-06T12:27:20 INFO: coco:, 40900/50000, train/total_loss: 2.0339 (2.0530), train/caption_cross_entropy: 2.0339 (2.0530), train/caption_bleu4: 0.2678 (0.2678), val/total_loss: 2.2036, val/caption_cross_entropy: 2.2036, val/caption_bleu4: 0.2423, max mem: 12698.0, lr: 0.00001, time: 04m 29s 296ms, eta: 06h 56m 11s 580ms 2020-03-06T12:31:45 INFO: coco:, 41000/50000, train/total_loss: 2.0595 (2.0530), train/caption_cross_entropy: 2.0595 (2.0530), train/caption_bleu4: 0.2663 (0.2678), val/total_loss: 2.3232, val/caption_cross_entropy: 2.3232, val/caption_bleu4: 0.2333, max mem: 12698.0, lr: 0.00001, time: 04m 25s 704ms, eta: 06h 46m 07s 746ms 2020-03-06T12:31:45 INFO: Evaluation time. Running on full validation set... 
2020-03-06T12:32:06 INFO: coco: full val:, 41000/50000, val/total_loss: 2.3112, val/caption_cross_entropy: 2.3112, val/caption_bleu4: 0.2293, validation time: 44m 51s 976ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T12:36:27 INFO: coco:, 41100/50000, train/total_loss: 2.0428 (2.0529), train/caption_cross_entropy: 2.0428 (2.0529), train/caption_bleu4: 0.2622 (0.2678), val/total_loss: 2.2367, val/caption_cross_entropy: 2.2367, val/caption_bleu4: 0.2481, max mem: 12698.0, lr: 0.00001, time: 04m 48s 492ms, eta: 07h 16m 03s 698ms 2020-03-06T12:40:52 INFO: coco:, 41200/50000, train/total_loss: 2.0360 (2.0529), train/caption_cross_entropy: 2.0360 (2.0529), train/caption_bleu4: 0.2706 (0.2678), val/total_loss: 2.2283, val/caption_cross_entropy: 2.2283, val/caption_bleu4: 0.2384, max mem: 12698.0, lr: 0.00001, time: 04m 22s 461ms, eta: 06h 32m 15s 413ms 2020-03-06T12:45:21 INFO: coco:, 41300/50000, train/total_loss: 2.0387 (2.0528), train/caption_cross_entropy: 2.0387 (2.0528), train/caption_bleu4: 0.2669 (0.2678), val/total_loss: 2.3303, val/caption_cross_entropy: 2.3303, val/caption_bleu4: 0.2143, max mem: 12698.0, lr: 0.00001, time: 04m 24s 035ms, eta: 06h 30m 07s 582ms 2020-03-06T12:49:44 INFO: coco:, 41400/50000, train/total_loss: 2.0476 (2.0527), train/caption_cross_entropy: 2.0476 (2.0527), train/caption_bleu4: 0.2605 (0.2679), val/total_loss: 2.2677, val/caption_cross_entropy: 2.2677, val/caption_bleu4: 0.2230, max mem: 12698.0, lr: 0.00001, time: 04m 25s 417ms, eta: 06h 27m 39s 628ms 2020-03-06T12:54:16 INFO: coco:, 41500/50000, train/total_loss: 2.0458 (2.0527), train/caption_cross_entropy: 2.0458 (2.0527), train/caption_bleu4: 0.2685 (0.2679), val/total_loss: 2.2706, val/caption_cross_entropy: 2.2706, val/caption_bleu4: 0.2436, max mem: 12698.0, lr: 0.00001, time: 04m 29s 799ms, eta: 06h 29m 28s 706ms 2020-03-06T12:58:45 INFO: coco:, 41600/50000, train/total_loss: 2.0147 (2.0526), train/caption_cross_entropy: 2.0147 (2.0526), 
train/caption_bleu4: 0.2711 (0.2679), val/total_loss: 2.3194, val/caption_cross_entropy: 2.3194, val/caption_bleu4: 0.2300, max mem: 12698.0, lr: 0.00001, time: 04m 28s 101ms, eta: 06h 22m 28s 433ms 2020-03-06T13:03:15 INFO: coco:, 41700/50000, train/total_loss: 2.0495 (2.0526), train/caption_cross_entropy: 2.0495 (2.0526), train/caption_bleu4: 0.2729 (0.2679), val/total_loss: 2.2454, val/caption_cross_entropy: 2.2454, val/caption_bleu4: 0.2434, max mem: 12698.0, lr: 0.00001, time: 04m 31s 652ms, eta: 06h 22m 55s 528ms 2020-03-06T13:07:46 INFO: coco:, 41800/50000, train/total_loss: 2.0397 (2.0526), train/caption_cross_entropy: 2.0397 (2.0526), train/caption_bleu4: 0.2624 (0.2679), val/total_loss: 2.2937, val/caption_cross_entropy: 2.2937, val/caption_bleu4: 0.2062, max mem: 12698.0, lr: 0.00001, time: 04m 33s 404ms, eta: 06h 20m 45s 170ms 2020-03-06T13:12:08 INFO: coco:, 41900/50000, train/total_loss: 2.0590 (2.0526), train/caption_cross_entropy: 2.0590 (2.0526), train/caption_bleu4: 0.2681 (0.2679), val/total_loss: 2.2527, val/caption_cross_entropy: 2.2527, val/caption_bleu4: 0.2319, max mem: 12698.0, lr: 0.00001, time: 04m 23s 671ms, eta: 06h 02m 43s 190ms 2020-03-06T13:16:36 INFO: coco:, 42000/50000, train/total_loss: 2.0601 (2.0525), train/caption_cross_entropy: 2.0601 (2.0525), train/caption_bleu4: 0.2631 (0.2679), val/total_loss: 2.3101, val/caption_cross_entropy: 2.3101, val/caption_bleu4: 0.2315, max mem: 12698.0, lr: 0.00001, time: 04m 24s 479ms, eta: 05h 59m 20s 404ms 2020-03-06T13:16:36 INFO: Evaluation time. Running on full validation set... 
2020-03-06T13:16:57 INFO: coco: full val:, 42000/50000, val/total_loss: 2.3096, val/caption_cross_entropy: 2.3096, val/caption_bleu4: 0.2288, validation time: 44m 50s 732ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T13:21:17 INFO: coco:, 42100/50000, train/total_loss: 2.0469 (2.0524), train/caption_cross_entropy: 2.0469 (2.0524), train/caption_bleu4: 0.2656 (0.2679), val/total_loss: 2.4250, val/caption_cross_entropy: 2.4250, val/caption_bleu4: 0.2064, max mem: 12698.0, lr: 0.00001, time: 04m 46s 578ms, eta: 06h 24m 29s 886ms 2020-03-06T13:25:41 INFO: coco:, 42200/50000, train/total_loss: 2.0517 (2.0524), train/caption_cross_entropy: 2.0517 (2.0524), train/caption_bleu4: 0.2732 (0.2679), val/total_loss: 2.2505, val/caption_cross_entropy: 2.2505, val/caption_bleu4: 0.2436, max mem: 12698.0, lr: 0.00001, time: 04m 22s 831ms, eta: 05h 48m 10s 388ms 2020-03-06T13:30:08 INFO: coco:, 42300/50000, train/total_loss: 2.0540 (2.0524), train/caption_cross_entropy: 2.0540 (2.0524), train/caption_bleu4: 0.2647 (0.2679), val/total_loss: 2.2794, val/caption_cross_entropy: 2.2794, val/caption_bleu4: 0.2304, max mem: 12698.0, lr: 0.00001, time: 04m 21s 239ms, eta: 05h 41m 37s 599ms 2020-03-06T13:34:33 INFO: coco:, 42400/50000, train/total_loss: 2.0556 (2.0523), train/caption_cross_entropy: 2.0556 (2.0523), train/caption_bleu4: 0.2669 (0.2679), val/total_loss: 2.3616, val/caption_cross_entropy: 2.3616, val/caption_bleu4: 0.2435, max mem: 12698.0, lr: 0.00001, time: 04m 27s 040ms, eta: 05h 44m 40s 674ms 2020-03-06T13:39:00 INFO: coco:, 42500/50000, train/total_loss: 2.0413 (2.0523), train/caption_cross_entropy: 2.0413 (2.0523), train/caption_bleu4: 0.2627 (0.2680), val/total_loss: 2.2770, val/caption_cross_entropy: 2.2770, val/caption_bleu4: 0.2376, max mem: 12698.0, lr: 0.00001, time: 04m 26s 195ms, eta: 05h 39m 03s 990ms 2020-03-06T13:43:27 INFO: coco:, 42600/50000, train/total_loss: 2.0166 (2.0522), train/caption_cross_entropy: 2.0166 (2.0522), 
train/caption_bleu4: 0.2721 (0.2679), val/total_loss: 2.3102, val/caption_cross_entropy: 2.3102, val/caption_bleu4: 0.2306, max mem: 12698.0, lr: 0.00001, time: 04m 25s 911ms, eta: 05h 34m 11s 290ms 2020-03-06T13:47:57 INFO: coco:, 42700/50000, train/total_loss: 2.0324 (2.0521), train/caption_cross_entropy: 2.0324 (2.0521), train/caption_bleu4: 0.2742 (0.2680), val/total_loss: 2.2583, val/caption_cross_entropy: 2.2583, val/caption_bleu4: 0.2129, max mem: 12698.0, lr: 0.00001, time: 04m 28s 836ms, eta: 05h 33m 17s 947ms 2020-03-06T13:52:28 INFO: coco:, 42800/50000, train/total_loss: 2.0386 (2.0521), train/caption_cross_entropy: 2.0386 (2.0521), train/caption_bleu4: 0.2651 (0.2680), val/total_loss: 2.2868, val/caption_cross_entropy: 2.2868, val/caption_bleu4: 0.2464, max mem: 12698.0, lr: 0.00001, time: 04m 32s 654ms, eta: 05h 33m 24s 105ms 2020-03-06T13:56:56 INFO: coco:, 42900/50000, train/total_loss: 2.0367 (2.0520), train/caption_cross_entropy: 2.0367 (2.0520), train/caption_bleu4: 0.2610 (0.2680), val/total_loss: 2.3147, val/caption_cross_entropy: 2.3147, val/caption_bleu4: 0.2107, max mem: 12698.0, lr: 0.00001, time: 04m 28s 004ms, eta: 05h 23m 09s 838ms 2020-03-06T14:01:26 INFO: coco:, 43000/50000, train/total_loss: 2.0404 (2.0520), train/caption_cross_entropy: 2.0404 (2.0520), train/caption_bleu4: 0.2686 (0.2680), val/total_loss: 2.2407, val/caption_cross_entropy: 2.2407, val/caption_bleu4: 0.2344, max mem: 12698.0, lr: 0.00001, time: 04m 29s 133ms, eta: 05h 19m 57s 262ms 2020-03-06T14:01:26 INFO: Evaluation time. Running on full validation set... 
2020-03-06T14:01:47 INFO: coco: full val:, 43000/50000, val/total_loss: 2.3066, val/caption_cross_entropy: 2.3066, val/caption_bleu4: 0.2290, validation time: 44m 49s 592ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T14:06:10 INFO: coco:, 43100/50000, train/total_loss: 2.0555 (2.0520), train/caption_cross_entropy: 2.0555 (2.0520), train/caption_bleu4: 0.2746 (0.2680), val/total_loss: 2.2030, val/caption_cross_entropy: 2.2030, val/caption_bleu4: 0.2595, max mem: 12698.0, lr: 0.00001, time: 04m 50s 485ms, eta: 05h 40m 24s 353ms 2020-03-06T14:10:31 INFO: coco:, 43200/50000, train/total_loss: 2.0339 (2.0520), train/caption_cross_entropy: 2.0339 (2.0520), train/caption_bleu4: 0.2657 (0.2680), val/total_loss: 2.2182, val/caption_cross_entropy: 2.2182, val/caption_bleu4: 0.2322, max mem: 12698.0, lr: 0.00001, time: 04m 19s 857ms, eta: 05h 06s 047ms 2020-03-06T14:14:53 INFO: coco:, 43300/50000, train/total_loss: 2.0606 (2.0520), train/caption_cross_entropy: 2.0606 (2.0520), train/caption_bleu4: 0.2687 (0.2680), val/total_loss: 2.4066, val/caption_cross_entropy: 2.4066, val/caption_bleu4: 0.2119, max mem: 12698.0, lr: 0.00001, time: 04m 17s 779ms, eta: 04h 53m 19s 367ms 2020-03-06T14:19:20 INFO: coco:, 43400/50000, train/total_loss: 2.0479 (2.0519), train/caption_cross_entropy: 2.0479 (2.0519), train/caption_bleu4: 0.2736 (0.2680), val/total_loss: 2.3452, val/caption_cross_entropy: 2.3452, val/caption_bleu4: 0.2418, max mem: 12698.0, lr: 0.00001, time: 04m 26s 377ms, eta: 04h 58m 34s 983ms 2020-03-06T14:23:52 INFO: coco:, 43500/50000, train/total_loss: 2.0499 (2.0518), train/caption_cross_entropy: 2.0499 (2.0518), train/caption_bleu4: 0.2609 (0.2680), val/total_loss: 2.2312, val/caption_cross_entropy: 2.2312, val/caption_bleu4: 0.2392, max mem: 12698.0, lr: 0.00001, time: 04m 35s 068ms, eta: 05h 03m 39s 141ms 2020-03-06T14:28:20 INFO: coco:, 43600/50000, train/total_loss: 2.0488 (2.0518), train/caption_cross_entropy: 2.0488 (2.0518), 
train/caption_bleu4: 0.2651 (0.2680), val/total_loss: 2.3185, val/caption_cross_entropy: 2.3185, val/caption_bleu4: 0.2066, max mem: 12698.0, lr: 0.00001, time: 04m 28s 316ms, eta: 04h 51m 38s 509ms 2020-03-06T14:32:47 INFO: coco:, 43700/50000, train/total_loss: 2.0333 (2.0517), train/caption_cross_entropy: 2.0333 (2.0517), train/caption_bleu4: 0.2660 (0.2680), val/total_loss: 2.2715, val/caption_cross_entropy: 2.2715, val/caption_bleu4: 0.2378, max mem: 12698.0, lr: 0.00001, time: 04m 24s 464ms, eta: 04h 42m 57s 861ms 2020-03-06T14:37:15 INFO: coco:, 43800/50000, train/total_loss: 2.0294 (2.0517), train/caption_cross_entropy: 2.0294 (2.0517), train/caption_bleu4: 0.2706 (0.2680), val/total_loss: 2.2381, val/caption_cross_entropy: 2.2381, val/caption_bleu4: 0.2507, max mem: 12698.0, lr: 0.00001, time: 04m 27s 760ms, eta: 04h 41m 56s 565ms 2020-03-06T14:41:46 INFO: coco:, 43900/50000, train/total_loss: 2.0321 (2.0517), train/caption_cross_entropy: 2.0321 (2.0517), train/caption_bleu4: 0.2739 (0.2680), val/total_loss: 2.3914, val/caption_cross_entropy: 2.3914, val/caption_bleu4: 0.2115, max mem: 12698.0, lr: 0.00001, time: 04m 31s 063ms, eta: 04h 40m 49s 011ms 2020-03-06T14:46:13 INFO: coco:, 44000/50000, train/total_loss: 2.0467 (2.0516), train/caption_cross_entropy: 2.0467 (2.0516), train/caption_bleu4: 0.2646 (0.2680), val/total_loss: 2.3464, val/caption_cross_entropy: 2.3464, val/caption_bleu4: 0.2338, max mem: 12698.0, lr: 0.00001, time: 04m 26s 484ms, eta: 04h 31m 32s 852ms 2020-03-06T14:46:13 INFO: Evaluation time. Running on full validation set... 
2020-03-06T14:46:35 INFO: coco: full val:, 44000/50000, val/total_loss: 2.3101, val/caption_cross_entropy: 2.3101, val/caption_bleu4: 0.2280, validation time: 44m 47s 892ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T14:50:58 INFO: coco:, 44100/50000, train/total_loss: 2.0497 (2.0516), train/caption_cross_entropy: 2.0497 (2.0516), train/caption_bleu4: 0.2634 (0.2680), val/total_loss: 2.5179, val/caption_cross_entropy: 2.5179, val/caption_bleu4: 0.2121, max mem: 12698.0, lr: 0.00001, time: 04m 51s 217ms, eta: 04h 51m 48s 312ms 2020-03-06T14:55:23 INFO: coco:, 44200/50000, train/total_loss: 2.0497 (2.0515), train/caption_cross_entropy: 2.0497 (2.0515), train/caption_bleu4: 0.2647 (0.2680), val/total_loss: 2.2504, val/caption_cross_entropy: 2.2504, val/caption_bleu4: 0.2320, max mem: 12698.0, lr: 0.00001, time: 04m 24s 699ms, eta: 04h 20m 44s 267ms 2020-03-06T14:59:51 INFO: coco:, 44300/50000, train/total_loss: 2.0417 (2.0515), train/caption_cross_entropy: 2.0417 (2.0515), train/caption_bleu4: 0.2699 (0.2680), val/total_loss: 2.4500, val/caption_cross_entropy: 2.4500, val/caption_bleu4: 0.2396, max mem: 12698.0, lr: 0.00001, time: 04m 20s 275ms, eta: 04h 11m 57s 558ms 2020-03-06T15:04:19 INFO: coco:, 44400/50000, train/total_loss: 2.0356 (2.0515), train/caption_cross_entropy: 2.0356 (2.0515), train/caption_bleu4: 0.2625 (0.2680), val/total_loss: 2.2651, val/caption_cross_entropy: 2.2651, val/caption_bleu4: 0.2264, max mem: 12698.0, lr: 0.00001, time: 04m 31s 228ms, eta: 04h 17m 57s 373ms 2020-03-06T15:08:44 INFO: coco:, 44500/50000, train/total_loss: 2.0283 (2.0514), train/caption_cross_entropy: 2.0283 (2.0514), train/caption_bleu4: 0.2783 (0.2681), val/total_loss: 2.2640, val/caption_cross_entropy: 2.2640, val/caption_bleu4: 0.2341, max mem: 12698.0, lr: 0.00001, time: 04m 23s 519ms, eta: 04h 06m 08s 930ms 2020-03-06T15:13:11 INFO: coco:, 44600/50000, train/total_loss: 2.0365 (2.0514), train/caption_cross_entropy: 2.0365 (2.0514), 
train/caption_bleu4: 0.2631 (0.2681), val/total_loss: 2.2507, val/caption_cross_entropy: 2.2507, val/caption_bleu4: 0.2458, max mem: 12698.0, lr: 0.00001, time: 04m 26s 310ms, eta: 04h 04m 14s 027ms 2020-03-06T15:18:02 INFO: coco:, 44700/50000, train/total_loss: 2.0387 (2.0513), train/caption_cross_entropy: 2.0387 (2.0513), train/caption_bleu4: 0.2643 (0.2681), val/total_loss: 2.3412, val/caption_cross_entropy: 2.3412, val/caption_bleu4: 0.2344, max mem: 12698.0, lr: 0.00001, time: 04m 52s 002ms, eta: 04h 22m 50s 195ms 2020-03-06T15:22:30 INFO: coco:, 44800/50000, train/total_loss: 2.0171 (2.0513), train/caption_cross_entropy: 2.0171 (2.0513), train/caption_bleu4: 0.2767 (0.2681), val/total_loss: 2.3574, val/caption_cross_entropy: 2.3574, val/caption_bleu4: 0.2269, max mem: 12698.0, lr: 0.00001, time: 04m 27s 183ms, eta: 03h 55m 57s 529ms 2020-03-06T15:26:58 INFO: coco:, 44900/50000, train/total_loss: 2.0357 (2.0513), train/caption_cross_entropy: 2.0357 (2.0513), train/caption_bleu4: 0.2710 (0.2681), val/total_loss: 2.2820, val/caption_cross_entropy: 2.2820, val/caption_bleu4: 0.2215, max mem: 12698.0, lr: 0.00001, time: 04m 27s 834ms, eta: 03h 51m 59s 096ms 2020-03-06T15:31:39 INFO: coco:, 45000/50000, train/total_loss: 2.0371 (2.0512), train/caption_cross_entropy: 2.0371 (2.0512), train/caption_bleu4: 0.2669 (0.2681), val/total_loss: 2.3690, val/caption_cross_entropy: 2.3690, val/caption_bleu4: 0.2225, max mem: 12698.0, lr: 0., time: 04m 41s 226ms, eta: 03h 58m 48s 516ms 2020-03-06T15:31:39 INFO: Evaluation time. Running on full validation set... 
2020-03-06T15:32:01 INFO: coco: full val:, 45000/50000, val/total_loss: 2.3134, val/caption_cross_entropy: 2.3134, val/caption_bleu4: 0.2285, validation time: 45m 26s 288ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T15:36:31 INFO: coco:, 45100/50000, train/total_loss: 2.0449 (2.0512), train/caption_cross_entropy: 2.0449 (2.0512), train/caption_bleu4: 0.2663 (0.2681), val/total_loss: 2.2567, val/caption_cross_entropy: 2.2567, val/caption_bleu4: 0.2324, max mem: 12698.0, lr: 0., time: 04m 59s 208ms, eta: 04h 08m 59s 791ms 2020-03-06T15:40:52 INFO: coco:, 45200/50000, train/total_loss: 2.0443 (2.0511), train/caption_cross_entropy: 2.0443 (2.0511), train/caption_bleu4: 0.2749 (0.2681), val/total_loss: 2.2587, val/caption_cross_entropy: 2.2587, val/caption_bleu4: 0.2333, max mem: 12698.0, lr: 0., time: 04m 18s 646ms, eta: 03h 30m 50s 903ms 2020-03-06T15:45:16 INFO: coco:, 45300/50000, train/total_loss: 2.0409 (2.0511), train/caption_cross_entropy: 2.0409 (2.0511), train/caption_bleu4: 0.2717 (0.2681), val/total_loss: 2.3457, val/caption_cross_entropy: 2.3457, val/caption_bleu4: 0.2100, max mem: 12698.0, lr: 0., time: 04m 20s 971ms, eta: 03h 28m 18s 729ms 2020-03-06T15:49:43 INFO: coco:, 45400/50000, train/total_loss: 2.0466 (2.0511), train/caption_cross_entropy: 2.0466 (2.0511), train/caption_bleu4: 0.2731 (0.2681), val/total_loss: 2.2901, val/caption_cross_entropy: 2.2901, val/caption_bleu4: 0.2239, max mem: 12698.0, lr: 0., time: 04m 27s 654ms, eta: 03h 29m 06s 038ms 2020-03-06T15:54:16 INFO: coco:, 45500/50000, train/total_loss: 2.0501 (2.0510), train/caption_cross_entropy: 2.0501 (2.0510), train/caption_bleu4: 0.2581 (0.2681), val/total_loss: 2.4376, val/caption_cross_entropy: 2.4376, val/caption_bleu4: 0.2309, max mem: 12698.0, lr: 0., time: 04m 31s 886ms, eta: 03h 27m 47s 348ms 2020-03-06T15:58:47 INFO: coco:, 45600/50000, train/total_loss: 2.0391 (2.0510), train/caption_cross_entropy: 2.0391 (2.0510), train/caption_bleu4: 0.2676 (0.2681), 
val/total_loss: 2.3275, val/caption_cross_entropy: 2.3275, val/caption_bleu4: 0.2248, max mem: 12698.0, lr: 0., time: 04m 29s 824ms, eta: 03h 21m 37s 846ms 2020-03-06T16:03:23 INFO: coco:, 45700/50000, train/total_loss: 2.0298 (2.0509), train/caption_cross_entropy: 2.0298 (2.0509), train/caption_bleu4: 0.2728 (0.2681), val/total_loss: 2.3654, val/caption_cross_entropy: 2.3654, val/caption_bleu4: 0.2109, max mem: 12698.0, lr: 0., time: 04m 36s 812ms, eta: 03h 22m 09s 089ms 2020-03-06T16:07:59 INFO: coco:, 45800/50000, train/total_loss: 2.0648 (2.0509), train/caption_cross_entropy: 2.0648 (2.0509), train/caption_bleu4: 0.2697 (0.2682), val/total_loss: 2.2955, val/caption_cross_entropy: 2.2955, val/caption_bleu4: 0.2255, max mem: 12698.0, lr: 0., time: 04m 35s 319ms, eta: 03h 16m 23s 110ms 2020-03-06T16:12:31 INFO: coco:, 45900/50000, train/total_loss: 2.0128 (2.0509), train/caption_cross_entropy: 2.0128 (2.0509), train/caption_bleu4: 0.2632 (0.2682), val/total_loss: 2.3751, val/caption_cross_entropy: 2.3751, val/caption_bleu4: 0.2054, max mem: 12698.0, lr: 0., time: 04m 32s 440ms, eta: 03h 09m 42s 302ms 2020-03-06T16:17:01 INFO: coco:, 46000/50000, train/total_loss: 2.0724 (2.0509), train/caption_cross_entropy: 2.0724 (2.0509), train/caption_bleu4: 0.2605 (0.2682), val/total_loss: 2.2203, val/caption_cross_entropy: 2.2203, val/caption_bleu4: 0.2436, max mem: 12698.0, lr: 0., time: 04m 30s 060ms, eta: 03h 03m 27s 666ms 2020-03-06T16:17:01 INFO: Evaluation time. Running on full validation set... 
2020-03-06T16:17:23 INFO: coco: full val:, 46000/50000, val/total_loss: 2.3091, val/caption_cross_entropy: 2.3091, val/caption_bleu4: 0.2290, validation time: 45m 21s 672ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T16:21:50 INFO: coco:, 46100/50000, train/total_loss: 2.0499 (2.0509), train/caption_cross_entropy: 2.0499 (2.0509), train/caption_bleu4: 0.2689 (0.2682), val/total_loss: 2.3275, val/caption_cross_entropy: 2.3275, val/caption_bleu4: 0.2428, max mem: 12698.0, lr: 0., time: 04m 54s 486ms, eta: 03h 15m 03s 185ms 2020-03-06T16:26:13 INFO: coco:, 46200/50000, train/total_loss: 2.0591 (2.0508), train/caption_cross_entropy: 2.0591 (2.0508), train/caption_bleu4: 0.2730 (0.2682), val/total_loss: 2.3122, val/caption_cross_entropy: 2.3122, val/caption_bleu4: 0.2253, max mem: 12698.0, lr: 0., time: 04m 22s 641ms, eta: 02h 49m 30s 006ms 2020-03-06T16:30:46 INFO: coco:, 46300/50000, train/total_loss: 2.0660 (2.0508), train/caption_cross_entropy: 2.0660 (2.0508), train/caption_bleu4: 0.2568 (0.2682), val/total_loss: 2.3371, val/caption_cross_entropy: 2.3371, val/caption_bleu4: 0.2180, max mem: 12698.0, lr: 0., time: 04m 25s 560ms, eta: 02h 46m 52s 429ms 2020-03-06T16:35:15 INFO: coco:, 46400/50000, train/total_loss: 2.0112 (2.0507), train/caption_cross_entropy: 2.0112 (2.0507), train/caption_bleu4: 0.2687 (0.2682), val/total_loss: 2.4485, val/caption_cross_entropy: 2.4485, val/caption_bleu4: 0.2053, max mem: 12698.0, lr: 0., time: 04m 31s 664ms, eta: 02h 46m 05s 722ms 2020-03-06T16:39:51 INFO: coco:, 46500/50000, train/total_loss: 2.0565 (2.0507), train/caption_cross_entropy: 2.0565 (2.0507), train/caption_bleu4: 0.2692 (0.2682), val/total_loss: 2.2220, val/caption_cross_entropy: 2.2220, val/caption_bleu4: 0.2419, max mem: 12698.0, lr: 0., time: 04m 35s 637ms, eta: 02h 43m 50s 618ms 2020-03-06T16:44:25 INFO: coco:, 46600/50000, train/total_loss: 2.0404 (2.0507), train/caption_cross_entropy: 2.0404 (2.0507), train/caption_bleu4: 0.2582 (0.2682), 
val/total_loss: 2.4295, val/caption_cross_entropy: 2.4295, val/caption_bleu4: 0.2209, max mem: 12698.0, lr: 0., time: 04m 32s 477ms, eta: 02h 37m 20s 261ms 2020-03-06T16:49:00 INFO: coco:, 46700/50000, train/total_loss: 2.0263 (2.0506), train/caption_cross_entropy: 2.0263 (2.0506), train/caption_bleu4: 0.2804 (0.2682), val/total_loss: 2.2952, val/caption_cross_entropy: 2.2952, val/caption_bleu4: 0.2325, max mem: 12698.0, lr: 0., time: 04m 39s 110ms, eta: 02h 36m 25s 647ms 2020-03-06T16:53:24 INFO: coco:, 46800/50000, train/total_loss: 2.0172 (2.0506), train/caption_cross_entropy: 2.0172 (2.0506), train/caption_bleu4: 0.2728 (0.2682), val/total_loss: 2.4757, val/caption_cross_entropy: 2.4757, val/caption_bleu4: 0.2006, max mem: 12698.0, lr: 0., time: 04m 24s 827ms, eta: 02h 23m 55s 511ms 2020-03-06T16:57:49 INFO: coco:, 46900/50000, train/total_loss: 2.0201 (2.0506), train/caption_cross_entropy: 2.0201 (2.0506), train/caption_bleu4: 0.2703 (0.2682), val/total_loss: 2.2560, val/caption_cross_entropy: 2.2560, val/caption_bleu4: 0.2348, max mem: 12698.0, lr: 0., time: 04m 21s 442ms, eta: 02h 17m 38s 700ms 2020-03-06T17:02:21 INFO: coco:, 47000/50000, train/total_loss: 2.0405 (2.0505), train/caption_cross_entropy: 2.0405 (2.0505), train/caption_bleu4: 0.2713 (0.2682), val/total_loss: 2.3542, val/caption_cross_entropy: 2.3542, val/caption_bleu4: 0.2327, max mem: 12698.0, lr: 0., time: 04m 33s 303ms, eta: 02h 19m 14s 875ms 2020-03-06T17:02:21 INFO: Evaluation time. Running on full validation set... 
2020-03-06T17:02:43 INFO: coco: full val:, 47000/50000, val/total_loss: 2.3120, val/caption_cross_entropy: 2.3120, val/caption_bleu4: 0.2294, validation time: 45m 20s 019ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T17:07:05 INFO: coco:, 47100/50000, train/total_loss: 2.0616 (2.0505), train/caption_cross_entropy: 2.0616 (2.0505), train/caption_bleu4: 0.2643 (0.2682), val/total_loss: 2.4287, val/caption_cross_entropy: 2.4287, val/caption_bleu4: 0.2054, max mem: 12698.0, lr: 0., time: 04m 48s 848ms, eta: 02h 22m 15s 757ms 2020-03-06T17:11:30 INFO: coco:, 47200/50000, train/total_loss: 2.0434 (2.0505), train/caption_cross_entropy: 2.0434 (2.0505), train/caption_bleu4: 0.2704 (0.2682), val/total_loss: 2.3484, val/caption_cross_entropy: 2.3484, val/caption_bleu4: 0.2376, max mem: 12698.0, lr: 0., time: 04m 23s 986ms, eta: 02h 05m 32s 070ms 2020-03-06T17:15:59 INFO: coco:, 47300/50000, train/total_loss: 2.0612 (2.0505), train/caption_cross_entropy: 2.0612 (2.0505), train/caption_bleu4: 0.2634 (0.2682), val/total_loss: 2.2604, val/caption_cross_entropy: 2.2604, val/caption_bleu4: 0.2412, max mem: 12698.0, lr: 0., time: 04m 23s 663ms, eta: 02h 54s 163ms 2020-03-06T17:20:30 INFO: coco:, 47400/50000, train/total_loss: 2.0528 (2.0504), train/caption_cross_entropy: 2.0528 (2.0504), train/caption_bleu4: 0.2738 (0.2682), val/total_loss: 2.2911, val/caption_cross_entropy: 2.2911, val/caption_bleu4: 0.2492, max mem: 12698.0, lr: 0., time: 04m 32s 982ms, eta: 02h 32s 396ms 2020-03-06T17:24:59 INFO: coco:, 47500/50000, train/total_loss: 2.0360 (2.0504), train/caption_cross_entropy: 2.0360 (2.0504), train/caption_bleu4: 0.2662 (0.2682), val/total_loss: 2.3620, val/caption_cross_entropy: 2.3620, val/caption_bleu4: 0.2132, max mem: 12698.0, lr: 0., time: 04m 27s 534ms, eta: 01h 53m 35s 430ms 2020-03-06T17:29:28 INFO: coco:, 47600/50000, train/total_loss: 1.9957 (2.0504), train/caption_cross_entropy: 1.9957 (2.0504), train/caption_bleu4: 0.2736 (0.2683), 
val/total_loss: 2.3222, val/caption_cross_entropy: 2.3222, val/caption_bleu4: 0.1972, max mem: 12698.0, lr: 0., time: 04m 28s 658ms, eta: 01h 49m 30s 315ms 2020-03-06T17:34:07 INFO: coco:, 47700/50000, train/total_loss: 2.0582 (2.0503), train/caption_cross_entropy: 2.0582 (2.0503), train/caption_bleu4: 0.2675 (0.2683), val/total_loss: 2.3721, val/caption_cross_entropy: 2.3721, val/caption_bleu4: 0.2364, max mem: 12698.0, lr: 0., time: 04m 38s 786ms, eta: 01h 48m 53s 927ms 2020-03-06T17:38:34 INFO: coco:, 47800/50000, train/total_loss: 2.0256 (2.0503), train/caption_cross_entropy: 2.0256 (2.0503), train/caption_bleu4: 0.2718 (0.2683), val/total_loss: 2.4281, val/caption_cross_entropy: 2.4281, val/caption_bleu4: 0.2081, max mem: 12698.0, lr: 0., time: 04m 27s 188ms, eta: 01h 39m 49s 825ms 2020-03-06T17:43:03 INFO: coco:, 47900/50000, train/total_loss: 2.0557 (2.0503), train/caption_cross_entropy: 2.0557 (2.0503), train/caption_bleu4: 0.2647 (0.2683), val/total_loss: 2.5051, val/caption_cross_entropy: 2.5051, val/caption_bleu4: 0.2317, max mem: 12698.0, lr: 0., time: 04m 29s 115ms, eta: 01h 35m 58s 805ms 2020-03-06T17:47:35 INFO: coco:, 48000/50000, train/total_loss: 2.0273 (2.0502), train/caption_cross_entropy: 2.0273 (2.0502), train/caption_bleu4: 0.2621 (0.2683), val/total_loss: 2.2927, val/caption_cross_entropy: 2.2927, val/caption_bleu4: 0.2244, max mem: 12698.0, lr: 0., time: 04m 31s 481ms, eta: 01h 32m 12s 785ms 2020-03-06T17:47:35 INFO: Evaluation time. Running on full validation set... 
2020-03-06T17:47:57 INFO: coco: full val:, 48000/50000, val/total_loss: 2.3108, val/caption_cross_entropy: 2.3108, val/caption_bleu4: 0.2292, validation time: 45m 14s 649ms, best iteration: 24000, best val/caption_bleu4: 0.231642 2020-03-06T17:52:22 INFO: coco:, 48100/50000, train/total_loss: 2.0423 (2.0502), train/caption_cross_entropy: 2.0423 (2.0502), train/caption_bleu4: 0.2724 (0.2683), val/total_loss: 2.3355, val/caption_cross_entropy: 2.3355, val/caption_bleu4: 0.2370, max mem: 12698.0, lr: 0., time: 04m 54s 115ms, eta: 01h 34m 54s 366ms 2020-03-06T17:56:45 INFO: coco:, 48200/50000, train/total_loss: 2.0406 (2.0502), train/caption_cross_entropy: 2.0406 (2.0502), train/caption_bleu4: 0.2753 (0.2683), val/total_loss: 2.2388, val/caption_cross_entropy: 2.2388, val/caption_bleu4: 0.2344, max mem: 12698.0, lr: 0., time: 04m 21s 374ms, eta: 01h 19m 54s 138ms 2020-03-06T18:01:18 INFO: coco:, 48300/50000, train/total_loss: 2.0357 (2.0501), train/caption_cross_entropy: 2.0357 (2.0501), train/caption_bleu4: 0.2759 (0.2683), val/total_loss: 2.3267, val/caption_cross_entropy: 2.3267, val/caption_bleu4: 0.2272, max mem: 12698.0, lr: 0., time: 04m 30s 687ms, eta: 01h 18m 09s 119ms 2020-03-06T18:06:15 INFO: coco:, 48400/50000, train/total_loss: 2.0398 (2.0501), train/caption_cross_entropy: 2.0398 (2.0501), train/caption_bleu4: 0.2715 (0.2683), val/total_loss: 2.4509, val/caption_cross_entropy: 2.4509, val/caption_bleu4: 0.2546, max mem: 12698.0, lr: 0., time: 04m 56s 803ms, eta: 01h 20m 39s 079ms 2020-03-06T18:10:48 INFO: coco:, 48500/50000, train/total_loss: 2.0370 (2.0500), train/caption_cross_entropy: 2.0370 (2.0500), train/caption_bleu4: 0.2679 (0.2683), val/total_loss: 2.2444, val/caption_cross_entropy: 2.2444, val/caption_bleu4: 0.2326, max mem: 12698.0, lr: 0., time: 04m 31s 769ms, eta: 01h 09m 14s 003ms 2020-03-06T18:15:21 INFO: coco:, 48600/50000, train/total_loss: 2.0478 (2.0500), train/caption_cross_entropy: 2.0478 (2.0500), train/caption_bleu4: 0.2714 (0.2684), 
val/total_loss: 2.1358, val/caption_cross_entropy: 2.1358, val/caption_bleu4: 0.2300, max mem: 12698.0, lr: 0., time: 04m 31s 856ms, eta: 01h 04m 38s 305ms 2020-03-06T18:19:50 INFO: coco:, 48700/50000, train/total_loss: 2.0275 (2.0499), train/caption_cross_entropy: 2.0275 (2.0499), train/caption_bleu4: 0.2691 (0.2684), val/total_loss: 2.3374, val/caption_cross_entropy: 2.3374, val/caption_bleu4: 0.2398, max mem: 12698.0, lr: 0., time: 04m 29s 832ms, eta: 59m 34s 467ms 2020-03-06T18:24:19 INFO: coco:, 48800/50000, train/total_loss: 2.0441 (2.0499), train/caption_cross_entropy: 2.0441 (2.0499), train/caption_bleu4: 0.2719 (0.2684), val/total_loss: 2.2969, val/caption_cross_entropy: 2.2969, val/caption_bleu4: 0.2315, max mem: 12698.0, lr: 0., time: 04m 27s 993ms, eta: 54m 37s 026ms 2020-03-06T18:28:49 INFO: coco:, 48900/50000, train/total_loss: 2.0151 (2.0499), train/caption_cross_entropy: 2.0151 (2.0499), train/caption_bleu4: 0.2730 (0.2684), val/total_loss: 2.3260, val/caption_cross_entropy: 2.3260, val/caption_bleu4: 0.2509, max mem: 12698.0, lr: 0., time: 04m 29s 529ms, eta: 50m 21s 157ms 2020-03-06T18:33:17 INFO: coco:, 49000/50000, train/total_loss: 2.0073 (2.0498), train/caption_cross_entropy: 2.0073 (2.0498), train/caption_bleu4: 0.2703 (0.2684), val/total_loss: 2.3228, val/caption_cross_entropy: 2.3228, val/caption_bleu4: 0.2200, max mem: 12698.0, lr: 0., time: 04m 28s 955ms, eta: 45m 40s 656ms 2020-03-06T18:33:17 INFO: Evaluation time. Running on full validation set... 
2020-03-06T18:33:39 INFO: coco: full val:, 49000/50000, val/total_loss: 2.3076, val/caption_cross_entropy: 2.3076, val/caption_bleu4: 0.2287, validation time: 45m 41s 783ms, best iteration: 24000, best val/caption_bleu4: 0.231642
2020-03-06T18:38:05 INFO: coco:, 49100/50000, train/total_loss: 2.0091 (2.0498), train/caption_cross_entropy: 2.0091 (2.0498), train/caption_bleu4: 0.2745 (0.2684), val/total_loss: 2.3026, val/caption_cross_entropy: 2.3026, val/caption_bleu4: 0.2265, max mem: 12698.0, lr: 0., time: 04m 53s 762ms, eta: 44m 54s 097ms
2020-03-06T18:42:25 INFO: coco:, 49200/50000, train/total_loss: 2.0304 (2.0497), train/caption_cross_entropy: 2.0304 (2.0497), train/caption_bleu4: 0.2694 (0.2684), val/total_loss: 2.3108, val/caption_cross_entropy: 2.3108, val/caption_bleu4: 0.2054, max mem: 12698.0, lr: 0., time: 04m 18s 473ms, eta: 35m 07s 076ms
2020-03-06T18:46:55 INFO: coco:, 49300/50000, train/total_loss: 2.0408 (2.0498), train/caption_cross_entropy: 2.0408 (2.0498), train/caption_bleu4: 0.2671 (0.2684), val/total_loss: 2.2952, val/caption_cross_entropy: 2.2952, val/caption_bleu4: 0.2397, max mem: 12698.0, lr: 0., time: 04m 23s 326ms, eta: 31m 18s 308ms
2020-03-06T18:51:23 INFO: coco:, 49400/50000, train/total_loss: 2.0319 (2.0497), train/caption_cross_entropy: 2.0319 (2.0497), train/caption_bleu4: 0.2729 (0.2684), val/total_loss: 2.2938, val/caption_cross_entropy: 2.2938, val/caption_bleu4: 0.2209, max mem: 12698.0, lr: 0., time: 04m 30s 428ms, eta: 27m 33s 402ms
2020-03-06T18:55:58 INFO: coco:, 49500/50000, train/total_loss: 2.0255 (2.0497), train/caption_cross_entropy: 2.0255 (2.0497), train/caption_bleu4: 0.2661 (0.2684), val/total_loss: 2.3069, val/caption_cross_entropy: 2.3069, val/caption_bleu4: 0.2487, max mem: 12698.0, lr: 0., time: 04m 34s 185ms, eta: 23m 16s 976ms
2020-03-06T19:00:26 INFO: coco:, 49600/50000, train/total_loss: 2.0469 (2.0496), train/caption_cross_entropy: 2.0469 (2.0496), train/caption_bleu4: 0.2691 (0.2684), val/total_loss: 2.3594, val/caption_cross_entropy: 2.3594, val/caption_bleu4: 0.2241, max mem: 12698.0, lr: 0., time: 04m 27s 680ms, eta: 18m 11s 064ms
2020-03-06T19:04:50 INFO: coco:, 49700/50000, train/total_loss: 2.0394 (2.0496), train/caption_cross_entropy: 2.0394 (2.0496), train/caption_bleu4: 0.2654 (0.2684), val/total_loss: 2.4012, val/caption_cross_entropy: 2.4012, val/caption_bleu4: 0.2178, max mem: 12698.0, lr: 0., time: 04m 25s 145ms, eta: 13m 30s 548ms
2020-03-06T19:09:18 INFO: coco:, 49800/50000, train/total_loss: 2.0424 (2.0496), train/caption_cross_entropy: 2.0424 (2.0496), train/caption_bleu4: 0.2665 (0.2684), val/total_loss: 2.2181, val/caption_cross_entropy: 2.2181, val/caption_bleu4: 0.2415, max mem: 12698.0, lr: 0., time: 04m 27s 457ms, eta: 09m 05s 077ms
2020-03-06T19:13:45 INFO: coco:, 49900/50000, train/total_loss: 2.0238 (2.0497), train/caption_cross_entropy: 2.0238 (2.0497), train/caption_bleu4: 0.2745 (0.2684), val/total_loss: 2.3800, val/caption_cross_entropy: 2.3800, val/caption_bleu4: 0.2127, max mem: 12698.0, lr: 0., time: 04m 30s 161ms, eta: 04m 35s 294ms
2020-03-06T19:18:06 INFO: coco:, 50000/50000, train/total_loss: 2.0534 (2.0496), train/caption_cross_entropy: 2.0534 (2.0496), train/caption_bleu4: 0.2659 (0.2684), val/total_loss: 2.3334, val/caption_cross_entropy: 2.3334, val/caption_bleu4: 0.2189, max mem: 12698.0, lr: 0., time: 04m 21s 867ms, eta:
2020-03-06T19:18:06 INFO: Evaluation time. Running on full validation set...
2020-03-06T19:18:28 INFO: coco: full val:, 50000/50000, val/total_loss: 2.3059, val/caption_cross_entropy: 2.3059, val/caption_bleu4: 0.2296, validation time: 44m 48s 967ms, best iteration: 24000, best val/caption_bleu4: 0.231642
2020-03-06T19:18:29 INFO: Stepping into final validation check
2020-03-06T19:18:29 INFO: Evaluation time. Running on full validation set...
2020-03-06T19:18:51 INFO: coco: full val:, 50001/50000, val/total_loss: 2.3111, val/caption_cross_entropy: 2.3111, val/caption_bleu4: 0.2288, validation time: 22s 814ms, best iteration: 24000, best val/caption_bleu4: 0.231642
2020-03-06T19:18:51 INFO: Restoring checkpoint
2020-03-06T19:18:58 INFO: Starting inference on test set
0%| | 0/20 [00:00<?, ?it/s]
2020-03-06T19:19:15 WARNING: /usr/local/python3/lib/python3.7/site-packages/nltk/translate/bleu_score.py:523: UserWarning: The hypothesis contains 0 counts of 3-gram overlaps. Therefore the BLEU score evaluates to 0, independently of how many N-gram overlaps of lower order it contains. Consider using lower n-gram order or use SmoothingFunction()
  warnings.warn(_msg)

2020-03-06T19:19:15 WARNING: /usr/local/python3/lib/python3.7/site-packages/nltk/translate/bleu_score.py:523: UserWarning: The hypothesis contains 0 counts of 4-gram overlaps. Therefore the BLEU score evaluates to 0, independently of how many N-gram overlaps of lower order it contains. Consider using lower n-gram order or use SmoothingFunction()
  warnings.warn(_msg)

5%| | 2/20 [00:18<02:43, 9.10s/it] 15%| | 3/20 [00:21<01:59, 7.04s/it] 20%| | 4/20 [00:24<01:36, 6.01s/it] 25%| | 6/20 [00:29<01:09, 4.97s/it] 35%| | 7/20 [00:32<01:00, 4.68s/it] 40%| | 8/20 [00:35<00:53, 4.48s/it] 45%| | 10/20 [00:41<00:41, 4.15s/it] 55%| | 11/20 [00:44<00:36, 4.04s/it] 60%| | 12/20 [00:47<00:31, 3.94s/it] 65%| | 14/20 [00:52<00:22, 3.78s/it] 75%| | 15/20 [00:55<00:18, 3.72s/it] 80%| | 16/20 [00:58<00:14, 3.66s/it] 85%| | 18/20 [01:04<00:07, 3.57s/it] 95%|| 19/20 [01:07<00:03, 3.54s/it] 100%|| 20/20 [01:08<00:00, 3.44s/it]
2020-03-06T19:20:09 INFO: coco: full test:, 50001/50000, test/total_loss: 23.1343, test/caption_cross_entropy: 23.1343, test/caption_bleu4: 0.0012
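
The two NLTK warnings in this log say that a hypothesis with no 3-gram or 4-gram overlap scores 0 regardless of its lower-order matches. The sketch below illustrates why, using a simplified single-reference, unsmoothed BLEU-4. It is not NLTK's implementation; the function names and the example sentences are made up for illustration.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(reference, hypothesis, n):
    """Clipped n-gram precision of the hypothesis against one reference."""
    hyp = ngram_counts(hypothesis, n)
    ref = ngram_counts(reference, n)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return clipped / total


def bleu4_unsmoothed(reference, hypothesis):
    """Geometric mean of 1- to 4-gram precisions times a brevity penalty."""
    precisions = [modified_precision(reference, hypothesis, n) for n in range(1, 5)]
    # A single zero precision makes the geometric mean zero, which is the
    # situation the warnings describe.
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / 4
    if len(hypothesis) > len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(hypothesis))
    return brevity * math.exp(log_mean)


# Hypothetical tokenized captions (in this issue they would be jieba tokens).
reference = "a little girl standing on the bed".split()
hypothesis = "a little girl girl standing together".split()

per_order = [modified_precision(reference, hypothesis, n) for n in range(1, 5)]
score = bleu4_unsmoothed(reference, hypothesis)
```

Here the hypothesis matches several unigrams and bigrams, but because no 4-gram matches, the overall score is still 0. Many such captions in a test set would pull the average `caption_bleu4` toward zero, although this sketch alone cannot say whether that is what happened in the run above.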

cylvzj commented 4 years ago

@apsdehal Sorry, where do I set the softmax weights of the already predicted words to -inf?

apsdehal commented 4 years ago

In https://github.com/facebookresearch/pythia/blob/master/pythia/utils/text_utils.py#L288..L295, make sure the top k doesn't contain the same word twice by setting the scores of already predicted words to -inf.
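
A minimal sketch of that idea, written in plain Python rather than against the actual code at the link above (which is not reproduced here). The helper names `mask_predicted` and `top_k`, the toy vocabulary, and the scores are all hypothetical; in the real decoder the same masking would be applied to the score tensor before the top-k step.

```python
import math


def mask_predicted(scores, predicted_ids):
    """Return a copy of scores with already predicted token ids set to -inf."""
    masked = list(scores)
    for token_id in set(predicted_ids):
        masked[token_id] = -math.inf
    return masked


def top_k(scores, k):
    """Return (score, token_id) pairs for the k highest scores."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return [(score, token_id) for token_id, score in ranked[:k]]


# Hypothetical vocabulary and log-scores for the next decoding step.
vocab = ["<unk>", "a", "little", "girl", "standing"]
scores = [-5.0, -1.2, -0.3, -0.1, -2.0]

# Suppose "little" and "girl" were already emitted for this caption.
predicted = [2, 3]

masked = mask_predicted(scores, predicted)
candidates = top_k(masked, k=2)
next_words = [vocab[token_id] for _, token_id in candidates]
```

Without the mask the two best candidates would be "girl" and "little" again; with it they are "a" and "standing". In a beam search the set of already predicted ids has to be tracked per beam, and special tokens such as the end-of-sentence token presumably should not be masked.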

cylvzj commented 4 years ago

Suppose the prediction is "one person plays basketball, the other person plays football". If the scores of already predicted words are set to -inf, does "person" appear only once?

apsdehal commented 4 years ago

Yes, that's the case. Btw, which commit are you on? Something seems fishy in your log.
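
The trade-off confirmed here can be checked on a toy prefix. The sketch below is an illustration only: `banned_next_tokens` is a hypothetical helper, and the n-gram variant (`n=2`) is a common alternative that was not proposed in this thread. With `n=1` it reproduces blanket masking of every word already emitted; with `n=2` it only bans a word that would repeat a bigram already in the caption.

```python
def banned_next_tokens(prefix, n):
    """Tokens that would complete an n-gram already present in prefix.

    n=1 bans every token seen so far (blanket masking);
    n=2 bans only tokens that would repeat an existing bigram.
    """
    if len(prefix) < n - 1:
        return set()
    # The last n-1 tokens form the context the next token would extend.
    context = tuple(prefix[len(prefix) - (n - 1):])
    banned = set()
    for i in range(len(prefix) - n + 1):
        if tuple(prefix[i:i + n - 1]) == context:
            banned.add(prefix[i + n - 1])
    return banned


# English stand-ins for the segmented Chinese tokens.
prefix = "one person plays basketball the other".split()

blanket = banned_next_tokens(prefix, 1)  # every word used so far
bigram = banned_next_tokens(prefix, 2)   # nothing follows "other" yet

repeated = banned_next_tokens("a little girl little".split(), 2)
```

With blanket masking "person" is banned after "the other", so the second "person" can never be generated. The bigram rule allows it, while still blocking "girl" after "a little girl little". Neither rule addresses near-synonyms such as "notebook" and "laptop", since those are different tokens.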

cylvzj commented 4 years ago

I set the scores of already predicted words to -inf, but the output is still problematic. I am using caption training now.

apsdehal commented 4 years ago

For further comments, I would need to know what Git commit id you are on.

cylvzj commented 4 years ago

I just downloaded the ZIP and used the captioning function, with a Chinese word-segmented dataset.

apsdehal commented 4 years ago

Closing as stale. If the issue persists, please open up a new issue.