Open xxlxx1 opened 5 years ago
I have the same problem, hoping someone can shed light on it @huoyijie
Maybe you should refer to issue #27 first and try to follow the author's approach there. I am also facing this problem and currently working on it.
@stillarrow @huoyijie Thanks for your helpful replies. Following the author's approach in issue #27, I used the 1000 icdar2015 training samples (900 for train, 100 for test) with the default parameters, training at 256 → 384 → 512 → 640 → 736 in sequence, each stage initialized from the best model of the previous step. The loss still stays stubbornly high (around 0.7). Can anyone help — where did I go wrong?
cfg.py is as follows (underscores and `*` were eaten by markdown in the original paste; restored here):

```python
import os

train_task_id = '3T736'
initial_epoch = 20
epoch_num = 40  # 24
lr = 1e-3
decay = 5e-4
patience = 5
load_weights = True
lambda_inside_score_loss = 4.0
lambda_side_vertex_code_loss = 1.0
lambda_side_vertex_coord_loss = 1.0

total_img = 1000
validation_split_ratio = 0.1
max_train_img_size = int(train_task_id[-3:])
max_predict_img_size = int(train_task_id[-3:])  # 2400
assert max_train_img_size in [256, 384, 512, 640, 736], \
    'max_train_img_size must in [256, 384, 512, 640, 736]'
if max_train_img_size == 256:
    batch_size = 8
elif max_train_img_size == 384:
    batch_size = 4
elif max_train_img_size == 512:
    batch_size = 2
else:
    batch_size = 1
steps_per_epoch = total_img * (1 - validation_split_ratio) // batch_size
validation_steps = total_img * validation_split_ratio // batch_size

data_dir = 'icpr/'
origin_image_dir_name = 'image_10000/'
origin_txt_dir_name = 'txt_10000/'
train_image_dir_name = 'images_%s/' % train_task_id
train_label_dir_name = 'labels_%s/' % train_task_id
show_gt_image_dir_name = 'show_gt_images_%s/' % train_task_id
show_act_image_dir_name = 'show_act_images_%s/' % train_task_id
gen_origin_img = True
draw_gt_quad = True
draw_act_quad = True
val_fname = 'val_%s.txt' % train_task_id
train_fname = 'train_%s.txt' % train_task_id

shrink_ratio = 0.2
shrink_side_ratio = 0.6
epsilon = 1e-4

num_channels = 3
feature_layers_range = range(5, 1, -1)
feature_layers_num = len(feature_layers_range)
pixel_size = 2 ** feature_layers_range[-1]
locked_layers = False

if not os.path.exists('model'):
    os.mkdir('model')
if not os.path.exists('saved_model'):
    os.mkdir('saved_model')

model_weights_path = 'model/weights_%s.{epoch:03d}-{val_loss:.3f}.h5' \
    % train_task_id
saved_model_file_path = 'saved_model/east_model_%s.h5' % train_task_id
saved_model_weights_file_path = 'saved_model/weights_3T640.020-0.700.h5'

pixel_threshold = 0.9
side_vertex_pixel_threshold = 0.9
trunc_threshold = 0.1
predict_cut_text_line = False
predict_write2txt = True
```
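For reference, the sizing logic in the config above can be checked in isolation. This is a minimal standalone sketch (not the repo's code) of how a stage id like `3T256` maps to image size, batch size, and step counts, assuming the same 1000-image / 0.1-split setup:

```python
# Standalone sketch of the cfg.py sizing logic above (not the repo's code).
def stage_params(train_task_id, total_img=1000, validation_split_ratio=0.1):
    """Map a stage id like '3T256' to (img_size, batch_size, steps, val_steps)."""
    img_size = int(train_task_id[-3:])
    assert img_size in [256, 384, 512, 640, 736]
    # Larger images need smaller batches to fit in GPU memory.
    batch_size = {256: 8, 384: 4, 512: 2}.get(img_size, 1)
    steps_per_epoch = int(total_img * (1 - validation_split_ratio)) // batch_size
    validation_steps = int(total_img * validation_split_ratio) // batch_size
    return img_size, batch_size, steps_per_epoch, validation_steps

print(stage_params('3T256'))  # (256, 8, 112, 12)
print(stage_params('3T736'))  # (736, 1, 900, 100)
```

This matches the step counts visible in the logs below (112/112 at 256, 225/225 at 384, 450/450 at 512, 900/900 at 640 and 736).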
The training log is as follows:

```
Epoch 1/40
112/112 [==============================] - 18s 161ms/step - loss: 1.0258 - val_loss: 0.9977
Epoch 00001: val_loss improved from inf to 0.99771, saving model to model/weights_3T256.001-0.998.h5
Epoch 2/40
112/112 [==============================] - 12s 104ms/step - loss: 0.9163 - val_loss: 0.9435
Epoch 00002: val_loss improved from 0.99771 to 0.94348, saving model to model/weights_3T256.002-0.943.h5
Epoch 3/40
112/112 [==============================] - 12s 103ms/step - loss: 0.8685 - val_loss: 0.9589
Epoch 00003: val_loss did not improve from 0.94348
Epoch 4/40
112/112 [==============================] - 12s 103ms/step - loss: 0.8321 - val_loss: 0.9361
Epoch 00004: val_loss improved from 0.94348 to 0.93609, saving model to model/weights_3T256.004-0.936.h5
Epoch 5/40
112/112 [==============================] - 12s 103ms/step - loss: 0.7869 - val_loss: 0.8753
Epoch 00005: val_loss improved from 0.93609 to 0.87533, saving model to model/weights_3T256.005-0.875.h5
Epoch 6/40
112/112 [==============================] - 12s 103ms/step - loss: 0.7634 - val_loss: 0.9075
Epoch 00006: val_loss did not improve from 0.87533
Epoch 7/40
112/112 [==============================] - 11s 102ms/step - loss: 0.7447 - val_loss: 0.9625
Epoch 00007: val_loss did not improve from 0.87533
Epoch 8/40
112/112 [==============================] - 12s 103ms/step - loss: 0.6945 - val_loss: 0.9738
Epoch 00008: val_loss did not improve from 0.87533
Epoch 9/40
112/112 [==============================] - 12s 103ms/step - loss: 0.6495 - val_loss: 0.9505
Epoch 00009: val_loss did not improve from 0.87533
Epoch 10/40
112/112 [==============================] - 12s 103ms/step - loss: 0.6376 - val_loss: 0.9441
Epoch 00010: val_loss did not improve from 0.87533
Epoch 00010: early stopping
Epoch 6/40
225/225 [==============================] - 31s 138ms/step - loss: 0.8283 - val_loss: 0.8632
```
```
Epoch 00006: val_loss improved from inf to 0.86316, saving model to model/weights_3T384.006-0.863.h5
Epoch 7/40
225/225 [==============================] - 26s 114ms/step - loss: 0.7723 - val_loss: 0.8328
Epoch 00007: val_loss improved from 0.86316 to 0.83276, saving model to model/weights_3T384.007-0.833.h5
Epoch 8/40
225/225 [==============================] - 26s 115ms/step - loss: 0.7441 - val_loss: 0.8024
Epoch 00008: val_loss improved from 0.83276 to 0.80241, saving model to model/weights_3T384.008-0.802.h5
Epoch 9/40
225/225 [==============================] - 26s 114ms/step - loss: 0.6963 - val_loss: 0.8008
Epoch 00009: val_loss improved from 0.80241 to 0.80084, saving model to model/weights_3T384.009-0.801.h5
Epoch 10/40
225/225 [==============================] - 26s 113ms/step - loss: 0.6602 - val_loss: 0.8203
Epoch 00010: val_loss did not improve from 0.80084
Epoch 11/40
225/225 [==============================] - 26s 113ms/step - loss: 0.6240 - val_loss: 0.8538
Epoch 00011: val_loss did not improve from 0.80084
Epoch 12/40
225/225 [==============================] - 26s 114ms/step - loss: 0.5770 - val_loss: 0.8294
Epoch 00012: val_loss did not improve from 0.80084
Epoch 13/40
225/225 [==============================] - 26s 115ms/step - loss: 0.5657 - val_loss: 0.9329
Epoch 00013: val_loss did not improve from 0.80084
Epoch 14/40
225/225 [==============================] - 26s 115ms/step - loss: 0.5337 - val_loss: 0.8109
Epoch 00014: val_loss did not improve from 0.80084
Epoch 00014: early stopping
```
```
Epoch 10/40
450/450 [==============================] - 52s 116ms/step - loss: 0.7323 - val_loss: 0.8008
```
```
Epoch 00010: val_loss improved from inf to 0.80084, saving model to model/weights_3T512.010-0.801.h5
Epoch 11/40
450/450 [==============================] - 47s 105ms/step - loss: 0.6656 - val_loss: 0.7684
Epoch 00011: val_loss improved from 0.80084 to 0.76837, saving model to model/weights_3T512.011-0.768.h5
Epoch 12/40
450/450 [==============================] - 47s 105ms/step - loss: 0.6215 - val_loss: 0.7373
Epoch 00012: val_loss improved from 0.76837 to 0.73733, saving model to model/weights_3T512.012-0.737.h5
Epoch 13/40
450/450 [==============================] - 47s 105ms/step - loss: 0.6045 - val_loss: 0.7435
Epoch 00013: val_loss did not improve from 0.73733
Epoch 14/40
450/450 [==============================] - 47s 104ms/step - loss: 0.5627 - val_loss: 0.7855
Epoch 00014: val_loss did not improve from 0.73733
Epoch 15/40
450/450 [==============================] - 47s 104ms/step - loss: 0.5516 - val_loss: 0.7778
Epoch 00015: val_loss did not improve from 0.73733
Epoch 16/40
450/450 [==============================] - 47s 104ms/step - loss: 0.5046 - val_loss: 0.7344
Epoch 00016: val_loss improved from 0.73733 to 0.73437, saving model to model/weights_3T512.016-0.734.h5
Epoch 17/40
450/450 [==============================] - 47s 104ms/step - loss: 0.4785 - val_loss: 0.7575
Epoch 00017: val_loss did not improve from 0.73437
Epoch 18/40
450/450 [==============================] - 47s 104ms/step - loss: 0.4503 - val_loss: 0.8144
Epoch 00018: val_loss did not improve from 0.73437
Epoch 19/40
450/450 [==============================] - 47s 104ms/step - loss: 0.4179 - val_loss: 0.8738
Epoch 00019: val_loss did not improve from 0.73437
Epoch 20/40
450/450 [==============================] - 47s 104ms/step - loss: 0.4001 - val_loss: 0.8318
Epoch 00020: val_loss did not improve from 0.73437
Epoch 21/40
450/450 [==============================] - 47s 104ms/step - loss: 0.3776 - val_loss: 0.8361
Epoch 00021: val_loss did not improve from 0.73437
Epoch 00021: early stopping
Epoch 17/40
900/900 [==============================] - 87s 97ms/step - loss: 0.6340 - val_loss: 0.8157
```
```
Epoch 00017: val_loss improved from inf to 0.81570, saving model to model/weights_3T640.017-0.816.h5
Epoch 18/40
900/900 [==============================] - 81s 90ms/step - loss: 0.5578 - val_loss: 0.7535
Epoch 00018: val_loss improved from 0.81570 to 0.75353, saving model to model/weights_3T640.018-0.754.h5
Epoch 19/40
900/900 [==============================] - 81s 90ms/step - loss: 0.5180 - val_loss: 0.7031
Epoch 00019: val_loss improved from 0.75353 to 0.70313, saving model to model/weights_3T640.019-0.703.h5
Epoch 20/40
900/900 [==============================] - 81s 91ms/step - loss: 0.4704 - val_loss: 0.6999
Epoch 00020: val_loss improved from 0.70313 to 0.69989, saving model to model/weights_3T640.020-0.700.h5
Epoch 21/40
900/900 [==============================] - 81s 90ms/step - loss: 0.4337 - val_loss: 0.8117
Epoch 00021: val_loss did not improve from 0.69989
Epoch 22/40
900/900 [==============================] - 81s 90ms/step - loss: 0.3875 - val_loss: 0.8635
Epoch 00022: val_loss did not improve from 0.69989
Epoch 23/40
900/900 [==============================] - 81s 90ms/step - loss: 0.3774 - val_loss: 0.8113
Epoch 00023: val_loss did not improve from 0.69989
Epoch 24/40
900/900 [==============================] - 81s 90ms/step - loss: 0.3461 - val_loss: 0.7961
Epoch 00024: val_loss did not improve from 0.69989
Epoch 25/40
900/900 [==============================] - 81s 90ms/step - loss: 0.3265 - val_loss: 0.8327
Epoch 00025: val_loss did not improve from 0.69989
Epoch 00025: early stopping
Epoch 21/40
900/900 [==============================] - 107s 119ms/step - loss: 0.5786 - val_loss: 0.6918
```
```
Epoch 00021: val_loss improved from inf to 0.69179, saving model to model/weights_3T736.021-0.692.h5
Epoch 22/40
900/900 [==============================] - 100s 111ms/step - loss: 0.5387 - val_loss: 0.7834
Epoch 00022: val_loss did not improve from 0.69179
Epoch 23/40
900/900 [==============================] - 101s 112ms/step - loss: 0.4902 - val_loss: 0.7386
Epoch 00023: val_loss did not improve from 0.69179
Epoch 24/40
900/900 [==============================] - 101s 112ms/step - loss: 0.4416 - val_loss: 0.8202
Epoch 00024: val_loss did not improve from 0.69179
Epoch 25/40
900/900 [==============================] - 101s 112ms/step - loss: 0.4116 - val_loss: 0.7364
Epoch 00025: val_loss did not improve from 0.69179
Epoch 26/40
900/900 [==============================] - 101s 112ms/step - loss: 0.3863 - val_loss: 0.9104
Epoch 00026: val_loss did not improve from 0.69179
Epoch 00026: early stopping
```

The intermediate models are as follows:
@lijian10086 It seems that you did not change the parameters in the config file (except for the epoch number) but still could not get the eval loss under 0.7. I once ran the same experiment on the icdar2015 dataset (1000 images) and reached an eval loss of around 0.5, which seems a little odd. I think it might be some kind of overfitting. I'm currently working on OCR with models including this repo; maybe I can share a solution with you in the next few weeks.
@stillarrow Ha, finally got your reply. I continued training the 736 model with a lower lr, but the val_loss only got down to 0.64 at best. Even a val_loss around 0.5 is probably far from enough; to reach the ~0.8 F1 reported in the paper, the val_loss would probably need to be within 0.1, don't you think? In your earlier reply you said "I am also facing this problem and currently working on it." Does that mean training on the author's tianchi ICPR dataset of 10000 samples works fine? PS: I'm also working on OCR lately; could I add you on WeChat for easier real-time discussion (my WeChat ID: 18024583839)?
What did your val_loss finally reach? I trained 256 and am now training 384; it looks like this, not great either. I'm using the icdar2017 dataset.
@xiiiiiiii How did you train? I used the 10000-sample dataset provided by the author: I created the icpr directory, then image_10000 and txt_10000, kept the other parameters unchanged, and trained 256 first, but it hit early stopping. Did you modify the code at all before training 256?
I trained the same way. Early stopping is normal; it prevents overfitting. How are your results?
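The early stopping behaviour discussed here is just val_loss-monitored patience (patience = 5 in the cfg above): training halts once val_loss has not improved for 5 consecutive epochs. A minimal re-implementation of that rule (a sketch, not Keras's actual callback code), checked against the 3T256 log posted earlier:

```python
def early_stop_epoch(val_losses, patience=5):
    """Return the 1-based epoch at which patience-based early stopping halts,
    or None if it never triggers. Mirrors val_loss monitoring with patience."""
    best = float('inf')
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# val_loss per epoch from the 3T256 run above: best was 0.8753 at epoch 5,
# then 5 epochs without improvement -> stop at epoch 10, matching the log.
losses = [0.9977, 0.9435, 0.9589, 0.9361, 0.8753,
          0.9075, 0.9625, 0.9738, 0.9505, 0.9441]
print(early_stop_epoch(losses))  # 10
```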
I just started training and have only done 256 so far; it early-stopped at epoch 7. When you continued training, did you change the parameters in cfg.py following the author's reply in issue #27? Are your results good now? If your val_loss stops decreasing, could it be overfitting? Too few samples?
Yes. I only used 3k images and the results were decent; 512 was the best stage, and training beyond that didn't help much. With 10k images it would probably be better.
When you start the next training stage, do you set initial_epoch to the epoch where the last run stopped, or to the epoch of the last saved model? In my case, should it be 13 or 8?
Either works; it only changes the name under which the model is saved.
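To see why the choice mostly affects naming: the epoch counter flows into the checkpoint filename through the `model_weights_path` template from cfg.py, which the checkpoint callback fills in each epoch via string formatting. A quick illustration (the filling step is shown manually here, as a sketch):

```python
# Template from cfg.py, with the stage id already substituted.
model_weights_path = 'model/weights_%s.{epoch:03d}-{val_loss:.3f}.h5' % '3T384'

# Each epoch, the checkpoint callback fills in the current epoch number and
# val_loss, so a different initial_epoch simply shifts these numbers.
name = model_weights_path.format(epoch=9, val_loss=0.80084)
print(name)  # model/weights_3T384.009-0.801.h5
```

This reproduces the filenames seen in the logs above.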
Hi, what was your final val_loss? When I trained 736, it again ran only 14 epochs, and the val_loss only got down to 0.19.
@wsy915 Hi, what data did you train on, and what final loss did you reach? My loss only got down to 0.19.
I trained on my own dataset plus 2017 RCTW. The final val_loss was about 0.10.
Training on the icdar2015 data with the author's default parameters, why is the loss so high? I only trained at size 256:

```
Epoch 1/24
1125/1125 [==============================] - 131s 117ms/step - loss: 1.0463 - val_loss: 1.1207
Epoch 00001: val_loss improved from inf to 1.12069, saving model to model/weights_3T256.001-1.121.h5
Epoch 2/24
1125/1125 [==============================] - 126s 112ms/step - loss: 0.8477 - val_loss: 1.1539
Epoch 00002: val_loss did not improve
Epoch 3/24
1125/1125 [==============================] - 126s 112ms/step - loss: 0.7191 - val_loss: 1.1873
Epoch 00003: val_loss did not improve
Epoch 4/24
1125/1125 [==============================] - 126s 112ms/step - loss: 0.6319 - val_loss: 1.2241
Epoch 00004: val_loss did not improve
Epoch 5/24
1125/1125 [==============================] - 126s 112ms/step - loss: 0.5608 - val_loss: 1.2758
Epoch 00005: val_loss did not improve
Epoch 6/24
1125/1125 [==============================] - 126s 112ms/step - loss: 0.5107 - val_loss: 1.2994
Epoch 00006: val_loss did not improve
Epoch 00006: early stopping
```