ruotianluo / self-critical.pytorch

Unofficial pytorch implementation for Self-critical Sequence Training for Image Captioning. and others.
MIT License

Chinese image captioning: multiple words of the same type appear in the results #169

Open cylvzj opened 4 years ago

cylvzj commented 4 years ago

Hello, I am using the COCO dataset with a two-layer LSTM model: one layer for top-down attention and one layer for the language model.

I segmented the captions with jieba and used every word that occurs more than 3 times in the picture descriptions as the dictionary file, 14,226 words in total:

words = [w for w in word_freq.keys() if word_freq[w] > 3]
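The thresholding step described above can be sketched as follows. This is a minimal illustration, assuming the captions have already been segmented into word lists; `build_vocab`, `encode`, `segmented_captions`, and the `<unk>` token are hypothetical names, not taken from the poster's code.

```python
from collections import Counter

def build_vocab(segmented_captions, min_count=3):
    """Keep words that occur more than `min_count` times (hypothetical helper)."""
    word_freq = Counter(w for caption in segmented_captions for w in caption)
    # Same filter as in the post: strictly more than 3 occurrences.
    words = [w for w in word_freq if word_freq[w] > min_count]
    return sorted(words), word_freq

def encode(caption, vocab, unk="<unk>"):
    """Replace out-of-vocabulary words with an unknown token (assumed convention)."""
    vocab_set = set(vocab)
    return [w if w in vocab_set else unk for w in caption]

# Toy data standing in for jieba output; not real COCO captions.
segmented_captions = [["一个", "女孩", "站", "着"]] * 4 + [["一只", "猫"]]
vocab, freq = build_vocab(segmented_captions)
print(vocab)
print(encode(["一只", "女孩"], vocab))
```

Inspecting `freq` for near-duplicate entries (for example, a word and a longer word that contains it) is one quick way to spot segmentation problems before training.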

After training, the captions the model generates contain multiple words of the same type, such as:

Note notebook laptop computer on bed
A little girl little girl girl standing together

How can I solve this problem?

ruotianluo commented 4 years ago

I have never run into this problem before.

cylvzj commented 4 years ago

@ruotianluo Thanks for the reply. How many words are in your dictionary?

ruotianluo commented 4 years ago

~10000.

cylvzj commented 4 years ago

Any suggestions on this issue?

ruotianluo commented 4 years ago

By the way, why is the Chinese caption in English?

cylvzj commented 4 years ago

I translated the picture description into Chinese.

ruotianluo commented 4 years ago

How long have you trained?

cylvzj commented 4 years ago

More than 37 hours. I trained 50,000 batches on a machine with a 24 GB GPU.

ruotianluo commented 4 years ago

That's weird. Are you sure the training data is correct?

cylvzj commented 4 years ago

What do you think is the problem with the data set? From what I have checked, most of the picture descriptions are translated into Chinese correctly.

It is an open-source model. The data set is COCO2014 (the pictures are described in English). I just translated the descriptions into Chinese and then used jieba for word segmentation. Nothing else has changed.

ruotianluo commented 4 years ago

Maybe you did something wrong when preprocessing the dataset using jieba. That's what I meant.

cylvzj commented 4 years ago

I translated all the picture descriptions into Chinese, and then used jieba to extract the word segmentation based on all the Chinese descriptions. Nothing else changed

ruotianluo commented 4 years ago

Is it possible that you did something wrong when preprocessing? I really have no clue and am just guessing. I have never seen this before, so it's hard for me to diagnose.

cylvzj commented 4 years ago

OK, thank you for your reply.

cylvzj commented 4 years ago

May I ask, how can I reduce the reward weight (reduce the weight of the concept reward)?

ruotianluo commented 4 years ago

Reduce the learning rate?

cylvzj commented 4 years ago

No, I mean the weight that is applied when the predicted word probability is determined.

ruotianluo commented 4 years ago

no idea.

cylvzj commented 4 years ago

thanks

cylvzj commented 4 years ago

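桌子 上 有 一个 杯子 和 一个 杯子 蓝色 和 蓝色 的 被子 和 蓝色 的 被子 一个 橙色 的 盘子 , 上面 有 一个 橙色 的 飞盘

(Roughly: "On the table there is a cup and a cup, blue and blue quilt and blue quilt, an orange plate, with an orange frisbee on it.")

May I ask, what kind of problem usually causes output like this?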
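One way to quantify this kind of repetition is to count how often the same word pair or triple recurs within a caption. The sketch below is only a diagnostic, assuming the output is whitespace-separated as in the caption above; `repeated_ngrams` is a hypothetical helper name.

```python
from collections import Counter

def repeated_ngrams(tokens, n):
    """Return the n-grams that occur more than once, with their counts."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return {g: c for g, c in Counter(grams).items() if c > 1}

# The generated caption quoted in this comment.
caption = ("桌子 上 有 一个 杯子 和 一个 杯子 蓝色 和 蓝色 的 被子 和 蓝色 的 被子 "
           "一个 橙色 的 盘子 , 上面 有 一个 橙色 的 飞盘")
tokens = caption.split()

for n in (2, 3):
    print(n, repeated_ngrams(tokens, n))
```

Running the same count over the segmented training captions would show whether the repetition is already present in the data or only in the model output.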

ruotianluo commented 4 years ago

Possibly it has not been trained enough. Output like this is fairly normal.

cylvzj commented 4 years ago

After training for 40,000 and 50,000 batches, the results feel about the same.

2020-02-24T01:17:16 INFO: coco:, 39800/50000, train/total_loss: 2.0268 (2.0423), train/caption_cross_entropy: 2.0268 (2.0423), train/caption_bleu4: 0.2641 (0.2690), val/total_loss: 2.2847, val/caption_cross_entropy: 2.2847, val/caption_bleu4: 0.1683, max mem: 12510.0, lr: 0.00001, time: 04m 35s 790ms, eta: 07h 57m 45s 118ms

2020-02-24T01:22:00 INFO: coco:, 39900/50000, train/total_loss: 2.0368 (2.0426), train/caption_cross_entropy: 2.0368 (2.0426), train/caption_bleu4: 0.2672 (0.2690), val/total_loss: 2.3601, val/caption_cross_entropy: 2.3601, val/caption_bleu4: 0.1625, max mem: 12510.0, lr: 0.00001, time: 04m 43s 245ms, eta: 08h 05m 51s 306ms

2020-02-24T01:26:44 INFO: coco:, 40000/50000, train/total_loss: 2.0474 (2.0425), train/caption_cross_entropy: 2.0474 (2.0425), train/caption_bleu4: 0.2655 (0.2689), val/total_loss: 2.3042, val/caption_cross_entropy: 2.3042, val/caption_bleu4: 0.1571, max mem: 12510.0, lr: 0.00001, time: 04m 44s 208ms, eta: 08h 02m 40s 854ms

2020-02-24T08:34:58 INFO: coco:, 49100/50000, train/total_loss: 2.0535 (2.0419), train/caption_cross_entropy: 2.0535 (2.0419), train/caption_bleu4: 0.2639 (0.2691), val/total_loss: 2.3923, val/caption_cross_entropy: 2.3923, val/caption_bleu4: 0.1630, max mem: 12568.0, lr: 0., time: 05m 05s 916ms, eta: 46m 45s 561ms

2020-02-24T08:39:37 INFO: coco:, 49200/50000, train/total_loss: 2.0376 (2.0419), train/caption_cross_entropy: 2.0376 (2.0419), train/caption_bleu4: 0.2690 (0.2691), val/total_loss: 2.3311, val/caption_cross_entropy: 2.3311, val/caption_bleu4: 0.1546, max mem: 12568.0, lr: 0., time: 04m 32s 160ms, eta: 36m 58s 655ms

2020-02-24T08:44:15 INFO: coco:, 49300/50000, train/total_loss: 2.0380 (2.0419), train/caption_cross_entropy: 2.0380 (2.0419), train/caption_bleu4: 0.2708 (0.2691), val/total_loss: 2.3666, val/caption_cross_entropy: 2.3666, val/caption_bleu4: 0.1588, max mem: 12568.0, lr: 0., time: 04m 39s 535ms, eta: 33m 13s 929ms

ruotianluo commented 4 years ago

How many epochs is that?

cylvzj commented 4 years ago

The first few lines are from around 40,000 epochs and the last few from around 49,000 epochs. The log output is above.

ruotianluo commented 4 years ago

That is 49,000 iterations.

cylvzj commented 4 years ago

2020-02-24T01:26:44 INFO: coco:, 40000/50000, train/total_loss: 2.0474 (2.0425), train/caption_cross_entropy: 2.0474 (2.0425), train/caption_bleu4: 0.2655 (0.2689), val/total_loss: 2.3042, val/caption_cross_entropy: 2.3042, val/caption_bleu4: 0.1571, max mem: 12510.0, lr: 0.00001, time: 04m 44s 208ms, eta: 08h 02m 40s 854ms

I only copied part of the log: some lines from the earlier section and some from the later one.

2020-02-24T08:34:58 INFO: coco:, 49100/50000, train/total_loss: 2.0535 (2.0419), train/caption_cross_entropy: 2.0535 (2.0419), train/caption_bleu4: 0.2639 (0.2691), val/total_loss: 2.3923, val/caption_cross_entropy: 2.3923, val/caption_bleu4: 0.1630, max mem: 12568.0, lr: 0., time: 05m 05s 916ms, eta: 46m 45s 561ms

ruotianluo commented 4 years ago
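1. Your learning rate has already decayed to nothing. You may need to change your learning rate schedule so that the model learns better.
2. I don't know your batch size, so I can't tell how many epochs this is. I usually train for around 30 epochs.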
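The two points above can be made concrete with a little arithmetic. This is a sketch under stated assumptions: the batch size of 128 is purely hypothetical (it is never given in the thread), one training sample per COCO 2014 train image (82,783 images) is assumed, and the step-decay schedule with a floor is an illustrative choice rather than the schedule of the framework that produced the logs.

```python
def iterations_to_epochs(iterations, batch_size, num_samples):
    """Number of passes over the data after `iterations` optimizer steps."""
    return iterations * batch_size / num_samples

def step_decay_lr(iteration, base_lr, decay_steps, gamma=0.1, min_lr=1e-6):
    """Step decay with a floor so the rate never reaches zero (illustrative)."""
    drops = sum(1 for step in decay_steps if iteration >= step)
    return max(base_lr * gamma ** drops, min_lr)

# Assumed values: the batch size is NOT given in the thread.
NUM_TRAIN_IMAGES = 82_783   # COCO 2014 train images; one sample per image assumed
BATCH_SIZE = 128            # hypothetical

epochs = iterations_to_epochs(50_000, BATCH_SIZE, NUM_TRAIN_IMAGES)
print(f"{epochs:.1f} epochs under these assumptions")
print(step_decay_lr(49_000, base_lr=1e-3, decay_steps=[15_000, 35_000]))
```

With a different batch size or a different definition of a sample (for example, one per caption rather than one per image), the epoch count changes proportionally.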
cylvzj commented 4 years ago

Could you tell me what the following values mean, and when training can be stopped? This model comes from another company.

train/total_loss: 2.0535 (2.0419), train/caption_cross_entropy: 2.0535 (2.0419), train/caption_bleu4: 0.2639 (0.2691), val/total_loss: 2.3923, val/caption_cross_entropy: 2.3923, val/caption_bleu4: 0.1630,

ruotianluo commented 4 years ago

cross_entropy: the lower the better. bleu4: the higher the better.

ruotianluo commented 4 years ago

Stop training once bleu4 stops increasing.
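This stopping rule is commonly implemented as patience-based early stopping. A minimal sketch, assuming validation BLEU-4 is measured at regular intervals; `should_stop`, `patience`, and `min_delta` are illustrative names, not part of the training framework used in this thread, and the scores are made up.

```python
def should_stop(val_scores, patience=3, min_delta=0.0):
    """True if the best score has not improved in the last `patience` evaluations."""
    if len(val_scores) <= patience:
        return False
    best_before = max(val_scores[:-patience])
    recent_best = max(val_scores[-patience:])
    return recent_best <= best_before + min_delta

# Made-up BLEU-4 values for illustration only.
history = [0.18, 0.21, 0.225, 0.229, 0.228, 0.2285, 0.2279]
print(should_stop(history))       # last three evaluations never beat 0.229
print(should_stop(history[:4]))   # still improving at this point
```

Using the full-validation score rather than a single-batch score makes the check less sensitive to noise.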

cylvzj commented 4 years ago

Oh, thanks. Could I add you on WeChat or QQ?

ruotianluo commented 4 years ago

You can email me at rluo@ttic.edu

cylvzj commented 4 years ago

OK, my email is 1356887876@qq.com

cylvzj commented 4 years ago

With training log output like this, is there no point in continuing?

2020-03-03T10:39:14 INFO: coco:, 33400/40000, train/total_loss: 2.0911 (2.0788), train/caption_cross_entropy: 2.0911 (2.0788), train/caption_bleu4: 0.2665 (0.2643), val/total_loss: 2.2728, val/caption_cross_entropy: 2.2728, val/caption_bleu4: 0.2453, max mem: 12574.0, lr: 0.0001, time: 04m 21s 524ms, eta: 04h 53m 08s 540ms
2020-03-03T10:43:39 INFO: coco:, 33500/40000, train/total_loss: 2.0284 (2.0786), train/caption_cross_entropy: 2.0284 (2.0786), train/caption_bleu4: 0.2674 (0.2643), val/total_loss: 2.2827, val/caption_cross_entropy: 2.2827, val/caption_bleu4: 0.2282, max mem: 12574.0, lr: 0.0001, time: 04m 30s 066ms, eta: 04h 58m 07s 870ms
2020-03-03T10:48:15 INFO: coco:, 33600/40000, train/total_loss: 2.0178 (2.0784), train/caption_cross_entropy: 2.0178 (2.0784), train/caption_bleu4: 0.2693 (0.2644), val/total_loss: 2.3300, val/caption_cross_entropy: 2.3300, val/caption_bleu4: 0.2324, max mem: 12574.0, lr: 0.0001, time: 04m 32s 646ms, eta: 04h 56m 20s 925ms
2020-03-03T10:52:33 INFO: coco:, 33700/40000, train/total_loss: 2.0367 (2.0782), train/caption_cross_entropy: 2.0367 (2.0782), train/caption_bleu4: 0.2658 (0.2644), val/total_loss: 2.3623, val/caption_cross_entropy: 2.3623, val/caption_bleu4: 0.2030, max mem: 12574.0, lr: 0.0001, time: 04m 22s 382ms, eta: 04h 40m 44s 154ms
2020-03-03T10:57:04 INFO: coco:, 33800/40000, train/total_loss: 2.0039 (2.0780), train/caption_cross_entropy: 2.0039 (2.0780), train/caption_bleu4: 0.2758 (0.2645), val/total_loss: 2.3419, val/caption_cross_entropy: 2.3419, val/caption_bleu4: 0.2217, max mem: 12574.0, lr: 0.0001, time: 04m 21s 578ms, eta: 04h 35m 26s 035ms
2020-03-03T11:01:40 INFO: coco:, 33900/40000, train/total_loss: 2.0435 (2.0777), train/caption_cross_entropy: 2.0435 (2.0777), train/caption_bleu4: 0.2682 (0.2645), val/total_loss: 2.3244, val/caption_cross_entropy: 2.3244, val/caption_bleu4: 0.2292, max mem: 12574.0, lr: 0.0001, time: 04m 33s 681ms, eta: 04h 43m 31s 792ms
2020-03-03T11:06:12 INFO: coco:, 34000/40000, train/total_loss: 2.0485 (2.0775), train/caption_cross_entropy: 2.0485 (2.0775), train/caption_bleu4: 0.2573 (0.2645), val/total_loss: 2.3093, val/caption_cross_entropy: 2.3093, val/caption_bleu4: 0.2301, max mem: 12574.0, lr: 0.0001, time: 04m 35s 382ms, eta: 04h 40m 36s 888ms
2020-03-03T11:06:12 INFO: Evaluation time. Running on full validation set...
2020-03-03T11:06:38 INFO: coco: full val:, 34000/40000, val/total_loss: 2.3042, val/caption_cross_entropy: 2.3042, val/caption_bleu4: 0.2284, validation time: 44m 52s 106ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T11:11:05 INFO: coco:, 34100/40000, train/total_loss: 2.0442 (2.0773), train/caption_cross_entropy: 2.0442 (2.0773), train/caption_bleu4: 0.2743 (0.2646), val/total_loss: 2.2786, val/caption_cross_entropy: 2.2786, val/caption_bleu4: 0.2358, max mem: 12574.0, lr: 0.0001, time: 04m 57s 185ms, eta: 04h 57m 47s 080ms
2020-03-03T11:15:23 INFO: coco:, 34200/40000, train/total_loss: 2.0499 (2.0770), train/caption_cross_entropy: 2.0499 (2.0770), train/caption_bleu4: 0.2713 (0.2646), val/total_loss: 2.2495, val/caption_cross_entropy: 2.2495, val/caption_bleu4: 0.2453, max mem: 12574.0, lr: 0.0001, time: 04m 24s 577ms, eta: 04h 20m 37s 043ms
2020-03-03T11:19:45 INFO: coco:, 34300/40000, train/total_loss: 2.0407 (2.0769), train/caption_cross_entropy: 2.0407 (2.0769), train/caption_bleu4: 0.2710 (0.2646), val/total_loss: 2.3185, val/caption_cross_entropy: 2.3185, val/caption_bleu4: 0.2151, max mem: 12574.0, lr: 0.0001, time: 04m 17s 674ms, eta: 04h 09m 26s 502ms
2020-03-03T11:24:25 INFO: coco:, 34400/40000, train/total_loss: 2.0426 (2.0766), train/caption_cross_entropy: 2.0426 (2.0766), train/caption_bleu4: 0.2756 (0.2647), val/total_loss: 2.4307, val/caption_cross_entropy: 2.4307, val/caption_bleu4: 0.2344, max mem: 12574.0, lr: 0.0001, time: 04m 23s 375ms, eta: 04h 10m 29s 243ms
2020-03-03T11:28:48 INFO: coco:, 34500/40000, train/total_loss: 2.0268 (2.0765), train/caption_cross_entropy: 2.0268 (2.0765), train/caption_bleu4: 0.2673 (0.2647), val/total_loss: 2.2621, val/caption_cross_entropy: 2.2621, val/caption_bleu4: 0.2593, max mem: 12574.0, lr: 0.0001, time: 04m 38s 250ms, eta: 04h 19m 54s 530ms
2020-03-03T11:33:07 INFO: coco:, 34600/40000, train/total_loss: 2.0285 (2.0763), train/caption_cross_entropy: 2.0285 (2.0763), train/caption_bleu4: 0.2726 (0.2647), val/total_loss: 2.3598, val/caption_cross_entropy: 2.3598, val/caption_bleu4: 0.2249, max mem: 12574.0, lr: 0.0001, time: 04m 21s 055ms, eta: 03h 59m 24s 867ms
2020-03-03T11:37:37 INFO: coco:, 34700/40000, train/total_loss: 2.0751 (2.0761), train/caption_cross_entropy: 2.0751 (2.0761), train/caption_bleu4: 0.2577 (0.2647), val/total_loss: 2.2264, val/caption_cross_entropy: 2.2264, val/caption_bleu4: 0.2418, max mem: 12574.0, lr: 0.0001, time: 04m 25s 829ms, eta: 03h 59m 16s 656ms
2020-03-03T11:42:02 INFO: coco:, 34800/40000, train/total_loss: 2.0269 (2.0760), train/caption_cross_entropy: 2.0269 (2.0760), train/caption_bleu4: 0.2728 (0.2648), val/total_loss: 2.3377, val/caption_cross_entropy: 2.3377, val/caption_bleu4: 0.2223, max mem: 12574.0, lr: 0.0001, time: 04m 28s 889ms, eta: 03h 57m 27s 913ms
2020-03-03T11:46:41 INFO: coco:, 34900/40000, train/total_loss: 2.0461 (2.0758), train/caption_cross_entropy: 2.0461 (2.0758), train/caption_bleu4: 0.2593 (0.2648), val/total_loss: 2.2708, val/caption_cross_entropy: 2.2708, val/caption_bleu4: 0.2128, max mem: 12574.0, lr: 0.0001, time: 04m 27s 325ms, eta: 03h 51m 32s 642ms
2020-03-03T11:51:03 INFO: coco:, 35000/40000, train/total_loss: 2.0476 (2.0757), train/caption_cross_entropy: 2.0476 (2.0757), train/caption_bleu4: 0.2727 (0.2648), val/total_loss: 2.3046, val/caption_cross_entropy: 2.3046, val/caption_bleu4: 0.2329, max mem: 12574.0, lr: 0.00001, time: 04m 33s 795ms, eta: 03h 52m 29s 880ms
2020-03-03T11:51:03 INFO: Evaluation time. Running on full validation set...
2020-03-03T11:51:24 INFO: coco: full val:, 35000/40000, val/total_loss: 2.3067, val/caption_cross_entropy: 2.3067, val/caption_bleu4: 0.2275, validation time: 44m 46s 487ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T11:55:47 INFO: coco:, 35100/40000, train/total_loss: 2.0400 (2.0754), train/caption_cross_entropy: 2.0400 (2.0754), train/caption_bleu4: 0.2681 (0.2648), val/total_loss: 2.1647, val/caption_cross_entropy: 2.1647, val/caption_bleu4: 0.2541, max mem: 12574.0, lr: 0.00001, time: 04m 46s 228ms, eta: 03h 58m 11s 685ms
2020-03-03T12:00:13 INFO: coco:, 35200/40000, train/total_loss: 2.0492 (2.0753), train/caption_cross_entropy: 2.0492 (2.0753), train/caption_bleu4: 0.2624 (0.2649), val/total_loss: 2.2340, val/caption_cross_entropy: 2.2340, val/caption_bleu4: 0.2263, max mem: 12574.0, lr: 0.00001, time: 04m 26s 059ms, eta: 03h 36m 53s 480ms
2020-03-03T12:04:35 INFO: coco:, 35300/40000, train/total_loss: 2.0491 (2.0751), train/caption_cross_entropy: 2.0491 (2.0751), train/caption_bleu4: 0.2742 (0.2649), val/total_loss: 2.3324, val/caption_cross_entropy: 2.3324, val/caption_bleu4: 0.2307, max mem: 12574.0, lr: 0.00001, time: 04m 23s 462ms, eta: 03h 30m 18s 018ms
2020-03-03T12:09:12 INFO: coco:, 35400/40000, train/total_loss: 2.0542 (2.0749), train/caption_cross_entropy: 2.0542 (2.0749), train/caption_bleu4: 0.2678 (0.2649), val/total_loss: 2.2169, val/caption_cross_entropy: 2.2169, val/caption_bleu4: 0.2360, max mem: 12574.0, lr: 0.00001, time: 04m 17s 700ms, eta: 03h 21m 19s 473ms
2020-03-03T12:13:34 INFO: coco:, 35500/40000, train/total_loss: 2.0490 (2.0747), train/caption_cross_entropy: 2.0490 (2.0747), train/caption_bleu4: 0.2650 (0.2649), val/total_loss: 2.3178, val/caption_cross_entropy: 2.3178, val/caption_bleu4: 0.2417, max mem: 12574.0, lr: 0.00001, time: 04m 37s 020ms, eta: 03h 31m 42s 777ms
2020-03-03T12:17:59 INFO: coco:, 35600/40000, train/total_loss: 2.0391 (2.0745), train/caption_cross_entropy: 2.0391 (2.0745), train/caption_bleu4: 0.2667 (0.2650), val/total_loss: 2.3760, val/caption_cross_entropy: 2.3760, val/caption_bleu4: 0.2324, max mem: 12574.0, lr: 0.00001, time: 04m 24s 780ms, eta: 03h 17m 51s 708ms
2020-03-03T12:22:35 INFO: coco:, 35700/40000, train/total_loss: 2.0455 (2.0743), train/caption_cross_entropy: 2.0455 (2.0743), train/caption_bleu4: 0.2679 (0.2650), val/total_loss: 2.2623, val/caption_cross_entropy: 2.2623, val/caption_bleu4: 0.2139, max mem: 12574.0, lr: 0.00001, time: 04m 25s 966ms, eta: 03h 14m 13s 876ms
2020-03-03T12:26:53 INFO: coco:, 35800/40000, train/total_loss: 2.0425 (2.0741), train/caption_cross_entropy: 2.0425 (2.0741), train/caption_bleu4: 0.2760 (0.2650), val/total_loss: 2.2622, val/caption_cross_entropy: 2.2622, val/caption_bleu4: 0.2193, max mem: 12574.0, lr: 0.00001, time: 04m 30s 894ms, eta: 03h 13m 13s 738ms
2020-03-03T12:31:18 INFO: coco:, 35900/40000, train/total_loss: 2.0472 (2.0739), train/caption_cross_entropy: 2.0472 (2.0739), train/caption_bleu4: 0.2672 (0.2651), val/total_loss: 2.2753, val/caption_cross_entropy: 2.2753, val/caption_bleu4: 0.2281, max mem: 12574.0, lr: 0.00001, time: 04m 20s 356ms, eta: 03h 01m 17s 451ms
2020-03-03T12:35:44 INFO: coco:, 36000/40000, train/total_loss: 2.0267 (2.0737), train/caption_cross_entropy: 2.0267 (2.0737), train/caption_bleu4: 0.2670 (0.2651), val/total_loss: 2.3721, val/caption_cross_entropy: 2.3721, val/caption_bleu4: 0.2236, max mem: 12574.0, lr: 0.00001, time: 04m 22s 392ms, eta: 02h 58m 15s 109ms
2020-03-03T12:35:44 INFO: Evaluation time. Running on full validation set...
2020-03-03T12:36:06 INFO: coco: full val:, 36000/40000, val/total_loss: 2.2997, val/caption_cross_entropy: 2.2997, val/caption_bleu4: 0.2290, validation time: 44m 42s 015ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T12:40:26 INFO: coco:, 36100/40000, train/total_loss: 2.0610 (2.0736), train/caption_cross_entropy: 2.0610 (2.0736), train/caption_bleu4: 0.2683 (0.2651), val/total_loss: 2.3138, val/caption_cross_entropy: 2.3138, val/caption_bleu4: 0.2238, max mem: 12574.0, lr: 0.00001, time: 04m 54s 618ms, eta: 03h 15m 08s 416ms
2020-03-03T12:44:42 INFO: coco:, 36200/40000, train/total_loss: 2.0337 (2.0734), train/caption_cross_entropy: 2.0337 (2.0734), train/caption_bleu4: 0.2717 (0.2651), val/total_loss: 2.3323, val/caption_cross_entropy: 2.3323, val/caption_bleu4: 0.2438, max mem: 12574.0, lr: 0.00001, time: 04m 14s 479ms, eta: 02h 44m 13s 952ms
2020-03-03T12:49:14 INFO: coco:, 36300/40000, train/total_loss: 2.0499 (2.0732), train/caption_cross_entropy: 2.0499 (2.0732), train/caption_bleu4: 0.2724 (0.2652), val/total_loss: 2.2358, val/caption_cross_entropy: 2.2358, val/caption_bleu4: 0.2274, max mem: 12574.0, lr: 0.00001, time: 04m 18s 571ms, eta: 02h 42m 28s 927ms
2020-03-03T12:53:34 INFO: coco:, 36400/40000, train/total_loss: 2.0128 (2.0730), train/caption_cross_entropy: 2.0128 (2.0730), train/caption_bleu4: 0.2708 (0.2652), val/total_loss: 2.3435, val/caption_cross_entropy: 2.3435, val/caption_bleu4: 0.2160, max mem: 12574.0, lr: 0.00001, time: 04m 28s 643ms, eta: 02h 44m 14s 921ms
2020-03-03T12:58:00 INFO: coco:, 36500/40000, train/total_loss: 2.0673 (2.0729), train/caption_cross_entropy: 2.0673 (2.0729), train/caption_bleu4: 0.2750 (0.2652), val/total_loss: 2.3058, val/caption_cross_entropy: 2.3058, val/caption_bleu4: 0.2087, max mem: 12574.0, lr: 0.00001, time: 04m 24s 597ms, eta: 02h 37m 16s 881ms
2020-03-03T13:02:29 INFO: coco:, 36600/40000, train/total_loss: 2.0579 (2.0728), train/caption_cross_entropy: 2.0579 (2.0728), train/caption_bleu4: 0.2676 (0.2652), val/total_loss: 2.3993, val/caption_cross_entropy: 2.3993, val/caption_bleu4: 0.2311, max mem: 12574.0, lr: 0.00001, time: 04m 19s 270ms, eta: 02h 29m 42s 686ms
2020-03-03T13:06:56 INFO: coco:, 36700/40000, train/total_loss: 2.0329 (2.0726), train/caption_cross_entropy: 2.0329 (2.0726), train/caption_bleu4: 0.2654 (0.2653), val/total_loss: 2.2875, val/caption_cross_entropy: 2.2875, val/caption_bleu4: 0.2283, max mem: 12574.0, lr: 0.00001, time: 04m 37s 601ms, eta: 02h 35m 34s 921ms
2020-03-03T13:11:24 INFO: coco:, 36800/40000, train/total_loss: 2.0261 (2.0724), train/caption_cross_entropy: 2.0261 (2.0724), train/caption_bleu4: 0.2672 (0.2653), val/total_loss: 2.2811, val/caption_cross_entropy: 2.2811, val/caption_bleu4: 0.2513, max mem: 12574.0, lr: 0.00001, time: 04m 29s 517ms, eta: 02h 26m 28s 423ms
2020-03-03T13:15:46 INFO: coco:, 36900/40000, train/total_loss: 2.0637 (2.0723), train/caption_cross_entropy: 2.0637 (2.0723), train/caption_bleu4: 0.2695 (0.2653), val/total_loss: 2.3830, val/caption_cross_entropy: 2.3830, val/caption_bleu4: 0.2066, max mem: 12574.0, lr: 0.00001, time: 04m 19s 921ms, eta: 02h 16m 50s 670ms
2020-03-03T13:20:08 INFO: coco:, 37000/40000, train/total_loss: 2.0243 (2.0721), train/caption_cross_entropy: 2.0243 (2.0721), train/caption_bleu4: 0.2691 (0.2653), val/total_loss: 2.3164, val/caption_cross_entropy: 2.3164, val/caption_bleu4: 0.2324, max mem: 12574.0, lr: 0.00001, time: 04m 24s 697ms, eta: 02h 14m 51s 811ms
2020-03-03T13:20:08 INFO: Evaluation time. Running on full validation set...
2020-03-03T13:20:29 INFO: coco: full val:, 37000/40000, val/total_loss: 2.3038, val/caption_cross_entropy: 2.3038, val/caption_bleu4: 0.2279, validation time: 44m 22s 874ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T13:24:47 INFO: coco:, 37100/40000, train/total_loss: 2.0488 (2.0719), train/caption_cross_entropy: 2.0488 (2.0719), train/caption_bleu4: 0.2740 (0.2654), val/total_loss: 2.3804, val/caption_cross_entropy: 2.3804, val/caption_bleu4: 0.2183, max mem: 12574.0, lr: 0.00001, time: 04m 43s 780ms, eta: 02h 19m 46s 006ms
2020-03-03T13:29:09 INFO: coco:, 37200/40000, train/total_loss: 2.0525 (2.0718), train/caption_cross_entropy: 2.0525 (2.0718), train/caption_bleu4: 0.2758 (0.2654), val/total_loss: 2.2791, val/caption_cross_entropy: 2.2791, val/caption_bleu4: 0.2286, max mem: 12574.0, lr: 0.00001, time: 04m 16s 513ms, eta: 02h 01m 58s 833ms
2020-03-03T13:33:44 INFO: coco:, 37300/40000, train/total_loss: 2.0436 (2.0717), train/caption_cross_entropy: 2.0436 (2.0717), train/caption_bleu4: 0.2657 (0.2654), val/total_loss: 2.3565, val/caption_cross_entropy: 2.3565, val/caption_bleu4: 0.2050, max mem: 12574.0, lr: 0.00001, time: 04m 27s 130ms, eta: 02h 02m 29s 574ms
2020-03-03T13:38:09 INFO: coco:, 37400/40000, train/total_loss: 2.0450 (2.0715), train/caption_cross_entropy: 2.0450 (2.0715), train/caption_bleu4: 0.2686 (0.2655), val/total_loss: 2.3162, val/caption_cross_entropy: 2.3162, val/caption_bleu4: 0.2322, max mem: 12574.0, lr: 0.00001, time: 04m 33s 867ms, eta: 02h 55s 843ms
2020-03-03T13:42:36 INFO: coco:, 37500/40000, train/total_loss: 2.0314 (2.0714), train/caption_cross_entropy: 2.0314 (2.0714), train/caption_bleu4: 0.2680 (0.2655), val/total_loss: 2.3106, val/caption_cross_entropy: 2.3106, val/caption_bleu4: 0.2149, max mem: 12574.0, lr: 0.00001, time: 04m 20s 198ms, eta: 01h 50m 28s 568ms
2020-03-03T13:47:03 INFO: coco:, 37600/40000, train/total_loss: 2.0702 (2.0712), train/caption_cross_entropy: 2.0702 (2.0712), train/caption_bleu4: 0.2624 (0.2655), val/total_loss: 2.2327, val/caption_cross_entropy: 2.2327, val/caption_bleu4: 0.2431, max mem: 12574.0, lr: 0.00001, time: 04m 29s 837ms, eta: 01h 49m 59s 153ms
2020-03-03T13:51:35 INFO: coco:, 37700/40000, train/total_loss: 2.0149 (2.0711), train/caption_cross_entropy: 2.0149 (2.0711), train/caption_bleu4: 0.2700 (0.2655), val/total_loss: 2.4083, val/caption_cross_entropy: 2.4083, val/caption_bleu4: 0.2072, max mem: 12574.0, lr: 0.00001, time: 04m 23s 778ms, eta: 01h 43m 02s 167ms
2020-03-03T13:55:59 INFO: coco:, 37800/40000, train/total_loss: 2.0319 (2.0709), train/caption_cross_entropy: 2.0319 (2.0709), train/caption_bleu4: 0.2749 (0.2655), val/total_loss: 2.3759, val/caption_cross_entropy: 2.3759, val/caption_bleu4: 0.2340, max mem: 12574.0, lr: 0.00001, time: 04m 32s 761ms, eta: 01h 41m 54s 764ms
2020-03-03T14:00:35 INFO: coco:, 37900/40000, train/total_loss: 2.0277 (2.0708), train/caption_cross_entropy: 2.0277 (2.0708), train/caption_bleu4: 0.2718 (0.2656), val/total_loss: 2.3239, val/caption_cross_entropy: 2.3239, val/caption_bleu4: 0.2240, max mem: 12574.0, lr: 0.00001, time: 04m 26s 154ms, eta: 01h 34m 55s 447ms
2020-03-03T14:04:58 INFO: coco:, 38000/40000, train/total_loss: 2.0384 (2.0706), train/caption_cross_entropy: 2.0384 (2.0706), train/caption_bleu4: 0.2636 (0.2656), val/total_loss: 2.4051, val/caption_cross_entropy: 2.4051, val/caption_bleu4: 0.2273, max mem: 12574.0, lr: 0.00001, time: 04m 33s 169ms, eta: 01h 32m 47s 192ms
2020-03-03T14:04:58 INFO: Evaluation time. Running on full validation set...
2020-03-03T14:05:21 INFO: coco: full val:, 38000/40000, val/total_loss: 2.3054, val/caption_cross_entropy: 2.3054, val/caption_bleu4: 0.2277, validation time: 44m 51s 845ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T14:09:43 INFO: coco:, 38100/40000, train/total_loss: 2.0440 (2.0705), train/caption_cross_entropy: 2.0440 (2.0705), train/caption_bleu4: 0.2712 (0.2656), val/total_loss: 2.2486, val/caption_cross_entropy: 2.2486, val/caption_bleu4: 0.2155, max mem: 12574.0, lr: 0.00001, time: 04m 48s 182ms, eta: 01h 32m 59s 497ms
2020-03-03T14:13:59 INFO: coco:, 38200/40000, train/total_loss: 2.0326 (2.0703), train/caption_cross_entropy: 2.0326 (2.0703), train/caption_bleu4: 0.2676 (0.2656), val/total_loss: 2.3013, val/caption_cross_entropy: 2.3013, val/caption_bleu4: 0.2301, max mem: 12574.0, lr: 0.00001, time: 04m 19s 289ms, eta: 01h 19m 15s 891ms
2020-03-03T14:18:28 INFO: coco:, 38300/40000, train/total_loss: 2.0218 (2.0702), train/caption_cross_entropy: 2.0218 (2.0702), train/caption_bleu4: 0.2731 (0.2657), val/total_loss: 2.3932, val/caption_cross_entropy: 2.3932, val/caption_bleu4: 0.1910, max mem: 12574.0, lr: 0.00001, time: 04m 21s 628ms, eta: 01h 15m 32s 194ms
2020-03-03T14:22:57 INFO: coco:, 38400/40000, train/total_loss: 2.0484 (2.0700), train/caption_cross_entropy: 2.0484 (2.0700), train/caption_bleu4: 0.2586 (0.2657), val/total_loss: 2.2354, val/caption_cross_entropy: 2.2354, val/caption_bleu4: 0.2427, max mem: 12574.0, lr: 0.00001, time: 04m 26s 787ms, eta: 01h 12m 29s 703ms
2020-03-03T14:27:25 INFO: coco:, 38500/40000, train/total_loss: 2.0545 (2.0699), train/caption_cross_entropy: 2.0545 (2.0699), train/caption_bleu4: 0.2597 (0.2657), val/total_loss: 2.2127, val/caption_cross_entropy: 2.2127, val/caption_bleu4: 0.2588, max mem: 12574.0, lr: 0.00001, time: 04m 35s 552ms, eta: 01h 10m 11s 813ms
2020-03-03T14:31:53 INFO: coco:, 38600/40000, train/total_loss: 2.0449 (2.0698), train/caption_cross_entropy: 2.0449 (2.0698), train/caption_bleu4: 0.2656 (0.2657), val/total_loss: 2.2936, val/caption_cross_entropy: 2.2936, val/caption_bleu4: 0.2359, max mem: 12574.0, lr: 0.00001, time: 04m 21s 651ms, eta: 01h 02m 12s 723ms
2020-03-03T14:36:15 INFO: coco:, 38700/40000, train/total_loss: 2.0270 (2.0696), train/caption_cross_entropy: 2.0270 (2.0696), train/caption_bleu4: 0.2725 (0.2657), val/total_loss: 2.2924, val/caption_cross_entropy: 2.2924, val/caption_bleu4: 0.2195, max mem: 12574.0, lr: 0.00001, time: 04m 25s 315ms, eta: 58m 34s 639ms
2020-03-03T14:40:44 INFO: coco:, 38800/40000, train/total_loss: 2.0237 (2.0695), train/caption_cross_entropy: 2.0237 (2.0695), train/caption_bleu4: 0.2759 (0.2658), val/total_loss: 2.3763, val/caption_cross_entropy: 2.3763, val/caption_bleu4: 0.2273, max mem: 12574.0, lr: 0.00001, time: 04m 29s 456ms, eta: 54m 54s 909ms
2020-03-03T14:45:10 INFO: coco:, 38900/40000, train/total_loss: 2.0412 (2.0694), train/caption_cross_entropy: 2.0412 (2.0694), train/caption_bleu4: 0.2653 (0.2658), val/total_loss: 2.2606, val/caption_cross_entropy: 2.2606, val/caption_bleu4: 0.2291, max mem: 12574.0, lr: 0.00001, time: 04m 23s 660ms, eta: 49m 15s 374ms
2020-03-03T14:49:31 INFO: coco:, 39000/40000, train/total_loss: 2.0502 (2.0692), train/caption_cross_entropy: 2.0502 (2.0692), train/caption_bleu4: 0.2631 (0.2658), val/total_loss: 2.3424, val/caption_cross_entropy: 2.3424, val/caption_bleu4: 0.2245, max mem: 12574.0, lr: 0.00001, time: 04m 21s 895ms, eta: 44m 28s 719ms
2020-03-03T14:49:31 INFO: Evaluation time. Running on full validation set...
2020-03-03T14:49:54 INFO: coco: full val:, 39000/40000, val/total_loss: 2.3039, val/caption_cross_entropy: 2.3039, val/caption_bleu4: 0.2282, validation time: 44m 32s 500ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T14:54:12 INFO: coco:, 39100/40000, train/total_loss: 2.0214 (2.0691), train/caption_cross_entropy: 2.0214 (2.0691), train/caption_bleu4: 0.2740 (0.2658), val/total_loss: 2.2716, val/caption_cross_entropy: 2.2716, val/caption_bleu4: 0.2545, max mem: 12574.0, lr: 0.00001, time: 04m 46s 377ms, eta: 43m 46s 367ms
2020-03-03T14:58:39 INFO: coco:, 39200/40000, train/total_loss: 2.0797 (2.0690), train/caption_cross_entropy: 2.0797 (2.0690), train/caption_bleu4: 0.2589 (0.2658), val/total_loss: 2.3045, val/caption_cross_entropy: 2.3045, val/caption_bleu4: 0.2206, max mem: 12574.0, lr: 0.00001, time: 04m 24s 766ms, eta: 35m 58s 376ms
2020-03-03T15:03:09 INFO: coco:, 39300/40000, train/total_loss: 2.0367 (2.0688), train/caption_cross_entropy: 2.0367 (2.0688), train/caption_bleu4: 0.2694 (0.2659), val/total_loss: 2.5071, val/caption_cross_entropy: 2.5071, val/caption_bleu4: 0.1972, max mem: 12574.0, lr: 0.00001, time: 04m 22s 991ms, eta: 31m 15s 918ms
2020-03-03T15:07:45 INFO: coco:, 39400/40000, train/total_loss: 2.0080 (2.0687), train/caption_cross_entropy: 2.0080 (2.0687), train/caption_bleu4: 0.2720 (0.2659), val/total_loss: 2.3165, val/caption_cross_entropy: 2.3165, val/caption_bleu4: 0.1988, max mem: 12574.0, lr: 0.00001, time: 04m 28s 332ms, eta: 27m 20s 587ms
2020-03-03T15:12:16 INFO: coco:, 39500/40000, train/total_loss: 2.0122 (2.0686), train/caption_cross_entropy: 2.0122 (2.0686), train/caption_bleu4: 0.2700 (0.2659), val/total_loss: 2.2104, val/caption_cross_entropy: 2.2104, val/caption_bleu4: 0.2388, max mem: 12574.0, lr: 0.00001, time: 04m 35s 042ms, eta: 23m 21s 342ms
2020-03-03T15:16:40 INFO: coco:, 39600/40000, train/total_loss: 2.0363 (2.0685), train/caption_cross_entropy: 2.0363 (2.0685), train/caption_bleu4: 0.2760 (0.2659), val/total_loss: 2.3512, val/caption_cross_entropy: 2.3512, val/caption_bleu4: 0.2225, max mem: 12574.0, lr: 0.00001, time: 04m 28s 374ms, eta: 18m 13s 895ms
2020-03-03T15:21:02 INFO: coco:, 39700/40000, train/total_loss: 2.0412 (2.0683), train/caption_cross_entropy: 2.0412 (2.0683), train/caption_bleu4: 0.2639 (0.2660), val/total_loss: 2.2810, val/caption_cross_entropy: 2.2810, val/caption_bleu4: 0.2510, max mem: 12574.0, lr: 0.00001, time: 04m 25s 422ms, eta: 13m 31s 398ms
2020-03-03T15:25:33 INFO: coco:, 39800/40000, train/total_loss: 2.0465 (2.0682), train/caption_cross_entropy: 2.0465 (2.0682), train/caption_bleu4: 0.2758 (0.2660), val/total_loss: 2.2901, val/caption_cross_entropy: 2.2901, val/caption_bleu4: 0.2259, max mem: 12574.0, lr: 0.00001, time: 04m 28s 922ms, eta: 09m 08s 064ms
2020-03-03T15:30:09 INFO: coco:, 39900/40000, train/total_loss: 2.0055 (2.0680), train/caption_cross_entropy: 2.0055 (2.0680), train/caption_bleu4: 0.2744 (0.2660), val/total_loss: 2.2428, val/caption_cross_entropy: 2.2428, val/caption_bleu4: 0.2325, max mem: 12574.0, lr: 0.00001, time: 04m 29s 596ms, eta: 04m 34s 718ms
2020-03-03T15:34:40 INFO: coco:, 40000/40000, train/total_loss: 2.0345 (2.0680), train/caption_cross_entropy: 2.0345 (2.0680), train/caption_bleu4: 0.2688 (0.2660), val/total_loss: 2.2882, val/caption_cross_entropy: 2.2882, val/caption_bleu4: 0.2322, max mem: 12574.0, lr: 0.00001, time: 04m 34s 987ms, eta:
2020-03-03T15:34:40 INFO: Evaluation time. Running on full validation set...
2020-03-03T15:35:04 INFO: coco: full val:, 40000/40000, val/total_loss: 2.3048, val/caption_cross_entropy: 2.3048, val/caption_bleu4: 0.2278, validation time: 45m 10s 676ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T15:35:06 INFO: Stepping into final validation check
2020-03-03T15:35:06 INFO: Evaluation time. Running on full validation set...
2020-03-03T15:35:27 INFO: coco: full val:, 40001/40000, val/total_loss: 2.3042, val/caption_cross_entropy: 2.3042, val/caption_bleu4: 0.2280, validation time: 22s 753ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T15:35:28 INFO: Restoring checkpoint
2020-03-03T15:35:36 INFO: Starting inference on test set

0%| | 0/20 [00:00<?, ?it/s]
2020-03-03T15:35:56 WARNING: /usr/local/python3/lib/python3.7/site-packages/nltk/translate/bleu_score.py:523: UserWarning: The hypothesis contains 0 counts of 2-gram overlaps. Therefore the BLEU score evaluates to 0, independently of how many N-gram overlaps of lower order it contains. Consider using lower n-gram order or use SmoothingFunction()
  warnings.warn(_msg)

2020-03-03T15:35:56 WARNING: /usr/local/python3/lib/python3.7/site-packages/nltk/translate/bleu_score.py:523: UserWarning: The hypothesis contains 0 counts of 3-gram overlaps. Therefore the BLEU score evaluates to 0, independently of how many N-gram overlaps of lower order it contains. Consider using lower n-gram order or use SmoothingFunction()
  warnings.warn(_msg)

2020-03-03T15:35:56 WARNING: /usr/local/python3/lib/python3.7/site-packages/nltk/translate/bleu_score.py:523: UserWarning: The hypothesis contains 0 counts of 4-gram overlaps. Therefore the BLEU score evaluates to 0, independently of how many N-gram overlaps of lower order it contains. Consider using lower n-gram order or use SmoothingFunction()
  warnings.warn(_msg)

5%|▌ | 1/20 [00:18<05:47, 18.26s/it]
10%|█ | 2/20 [00:21<03:10, 10.60s/it]
15%|█▌ | 3/20 [00:24<02:16, 8.04s/it]
20%|██ | 4/20 [00:27<01:48, 6.76s/it]
25%|██▌ | 5/20 [00:29<01:29, 6.00s/it]
30%|███ | 6/20 [00:32<01:16, 5.48s/it]
35%|███▌ | 7/20 [00:35<01:06, 5.10s/it]
40%|████ | 8/20 [00:38<00:57, 4.83s/it]
45%|████▌ | 9/20 [00:41<00:50, 4.61s/it]
50%|█████ | 10/20 [00:44<00:44, 4.43s/it]
55%|█████▌ | 11/20 [00:47<00:38, 4.29s/it]
60%|██████ | 12/20 [00:50<00:33, 4.17s/it]
65%|██████▌ | 13/20 [00:52<00:28, 4.07s/it]
70%|███████ | 14/20 [00:55<00:23, 3.98s/it]
75%|███████▌ | 15/20 [00:58<00:19, 3.90s/it]
80%|████████ | 16/20 [01:01<00:15, 3.84s/it]
85%|████████▌ | 17/20 [01:04<00:11, 3.78s/it]
90%|█████████ | 18/20 [01:07<00:07, 3.73s/it]
95%|█████████▌| 19/20 [01:09<00:03, 3.68s/it]
100%|██████████| 20/20 [01:11<00:00, 3.58s/it]
2020-03-03T15:36:49 INFO: coco: full test:, 40001/40000, test/total_loss: 23.0447, test/caption_cross_entropy: 23.0447, test/caption_bleu4: 0.0021
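The NLTK warnings above arise because BLEU-4 combines the 1- to 4-gram precisions with a geometric mean, so a single zero precision drives the whole score to zero. The following is a simplified pure-Python illustration of that effect (single reference, no brevity penalty). It is not NLTK's implementation, the sentences are invented, and the epsilon substitution shown is just one simple form of smoothing.

```python
import math
from collections import Counter

def ngram_precision(hypothesis, reference, n):
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp = Counter(tuple(hypothesis[i:i + n]) for i in range(len(hypothesis) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return overlap / total

def simple_bleu(hypothesis, reference, max_n=4, epsilon=0.0):
    """Geometric mean of n-gram precisions; `epsilon` replaces zero precisions."""
    precisions = [ngram_precision(hypothesis, reference, n) for n in range(1, max_n + 1)]
    precisions = [p if p > 0 else epsilon for p in precisions]
    if min(precisions) == 0:
        return 0.0
    return math.exp(sum(math.log(p) for p in precisions) / max_n)

# Invented example: every word matches, but no word pair does.
reference = "a little girl standing on the bed".split()
hypothesis = "a girl little standing bed".split()

print(simple_bleu(hypothesis, reference))               # unsmoothed
print(simple_bleu(hypothesis, reference, epsilon=0.1))  # with a small floor
```

Sentence-level scores on short captions are particularly prone to this, which is why the warning suggests a lower n-gram order or a smoothing function.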

ruotianluo commented 4 years ago

It does look like it has stopped improving.
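The plateau can be checked directly from the log. The sketch below pulls `val/caption_bleu4` out of the "full val" lines with a regular expression and compares it with the reported best value; the three sample lines are copied from the log above, and the helper name and patterns are hypothetical.

```python
import re

# Iteration and BLEU-4 from full-validation lines; the non-greedy match stops
# at the first val/caption_bleu4, before the "best val/caption_bleu4" field.
FULL_VAL = re.compile(r"full val:, (\d+)/\d+,.*?val/caption_bleu4: ([\d.]+)")
BEST = re.compile(r"best val/caption_bleu4: ([\d.]+)")

def full_val_bleu4(log_text):
    """Return (iteration, BLEU-4) pairs from full-validation log lines."""
    return [(int(it), float(score)) for it, score in FULL_VAL.findall(log_text)]

log_text = """\
2020-03-03T11:06:38 INFO: coco: full val:, 34000/40000, val/total_loss: 2.3042, val/caption_cross_entropy: 2.3042, val/caption_bleu4: 0.2284, validation time: 44m 52s 106ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T13:20:29 INFO: coco: full val:, 37000/40000, val/total_loss: 2.3038, val/caption_cross_entropy: 2.3038, val/caption_bleu4: 0.2279, validation time: 44m 22s 874ms, best iteration: 26000, best val/caption_bleu4: 0.229085
2020-03-03T15:35:04 INFO: coco: full val:, 40000/40000, val/total_loss: 2.3048, val/caption_cross_entropy: 2.3048, val/caption_bleu4: 0.2278, validation time: 45m 10s 676ms, best iteration: 26000, best val/caption_bleu4: 0.229085
"""

scores = full_val_bleu4(log_text)
best = float(BEST.search(log_text).group(1))
print(scores)
print("no improvement over best:", all(score <= best for _, score in scores))
```

In these lines the best full-validation score is still the one recorded at iteration 26000, which is consistent with the conclusion that training has plateaued.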

huaifeng1993 commented 3 years ago

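Have you looked at the jieba segmentation output? jieba has two segmentation modes: one does full matching and the other produces independent words. For example, for 在足球场上踢足球 ("playing football on the football field"), full matching gives: 在 足球 足球场 上 踢 踢足球 ... so the training results will contain repetitions.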

huaifeng1993 commented 3 years ago

@cylvzj Try setting it like this: jieba.cut(inputs, cut_all=False)
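The difference between the two modes can be illustrated without jieba itself. The toy sketch below uses a six-word dictionary: `full_mode` emits every dictionary word found at every position, while `forward_max_match` is a simple greedy longest-match segmenter that stands in for non-overlapping segmentation. Both functions and the dictionary are assumptions made for illustration; jieba's own algorithm is different and its real output may differ.

```python
def full_mode(text, dictionary):
    """Emit every dictionary word found at every position (overlaps allowed)."""
    words = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            if text[start:end] in dictionary:
                words.append(text[start:end])
    return words

def forward_max_match(text, dictionary):
    """Greedy longest-match segmentation; every character is used exactly once."""
    max_len = max(len(w) for w in dictionary)
    words, pos = [], 0
    while pos < len(text):
        for size in range(min(max_len, len(text) - pos), 0, -1):
            piece = text[pos:pos + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or piece in dictionary:
                words.append(piece)
                pos += size
                break
    return words

toy_dict = {"在", "足球", "足球场", "上", "踢", "踢足球"}  # hypothetical dictionary
sentence = "在足球场上踢足球"

print(" ".join(full_mode(sentence, toy_dict)))
print(" ".join(forward_max_match(sentence, toy_dict)))
```

The overlapping output contains 足球 inside both 足球场 and 踢足球, which is exactly the kind of duplication a caption model would learn to reproduce. Since `cut_all=False` is the documented default of `jieba.cut`, overlapping tokens would normally appear only if full mode or search-engine mode was enabled explicitly.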