whywhs / Pytorch-Handwritten-Mathematical-Expression-Recognition

This program uses Attention and Coverage to realize HMER and this program is based on Pytorch.
MIT License
214 stars 79 forks

sacc so small! #11

Open 13354236170 opened 4 years ago

13354236170 commented 4 years ago

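Hello, I followed the specified steps and finished generating the data, but the model does not converge at all during training: sacc stays in the single digits, and I have trained for 100+ epochs so far. Could you help me find out what the problem is? [image: sacc]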

whywhs commented 4 years ago

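Hello, sacc is in the single digits mainly because, as far as I can see, your loss stays very large, a bit over 15. A normal training run does not produce a loss that large; you can compare with the results of an experiment I ran earlier, test_right_SGD_bs8_te1_mask_conv_bn_b_dr05.txt. As for why your run does not converge at all, I see your batch is 20, and the batch may simply be too large. Because the images differ in size, a large batch has some effect on the results, although I have not run experiments on this. So I suggest you first use the parameter settings from my Train.py, and this problem should then go away.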
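One way to see why batch size could interact with varying image sizes is to measure how much of each batch would be padding. The sketch below is only an illustration under assumptions: it supposes every image in a batch is padded to the largest height and width in that batch, and the function name `padding_fraction` and the image sizes are made up for the example. It does not reproduce the repository's data loader, and it does not show that padding is the cause of the non-convergence.

```python
def padding_fraction(sizes, batch_size):
    """Share of pixels that are padding when every batch is padded
    to the largest height and largest width it contains.

    sizes is a list of (height, width) tuples, taken in order.
    """
    real = 0
    padded = 0
    for start in range(0, len(sizes), batch_size):
        batch = sizes[start:start + batch_size]
        max_h = max(h for h, _ in batch)
        max_w = max(w for _, w in batch)
        padded += max_h * max_w * len(batch)   # pixels after padding
        real += sum(h * w for h, w in batch)   # pixels that carry content
    return 1 - real / padded


# Illustrative image sizes only, not taken from the dataset.
sizes = [(50, 100), (50, 120), (100, 300), (120, 400)]
for bs in (1, 2, 4):
    print(bs, round(padding_fraction(sizes, bs), 3))
```

With these made-up sizes the padded share grows as the batch grows, which is one plausible reason to start from the smaller batch size used in Train.py.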

macqueen09 commented 4 years ago

@whywhs Hello, your work is great and very useful to me, thanks. When I use your models (under model/)

encoder_lr0.00001_GN_te1_d05_SGD_bs6_mask_conv_bn_b_xavier.pkl
attn_decoder_lr0.00001_GN_te1_d05_SGD_bs6_mask_conv_bn_b_xavier.pkl

(I noticed you deleted the original encoder_lr0.00001_BN_te1_d05_SGD_bs8_mask_conv_bn_b.pkl) and run Densenet_testway.py on your test dataset, the result is

wer is 0.17584
sacc is 0.36216 

That is not very high, and there are many samples like this:

prediction is 
['[', 'E', ']', ']']
the truth is
['[', '[', 'S', ']', ']']
the wer is 0.33333

Is this correct? Waiting for your reply, thanks very much.
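For readers checking the numbers in this comment, here is a minimal sketch of how such metrics are commonly computed. It assumes wer is the token-level edit distance divided by the reference length and sacc is the fraction of expressions predicted exactly; the functions are written for this example and are not taken from Densenet_testway.py, whose implementation may differ.

```python
def edit_distance(pred, truth):
    """Token-level Levenshtein distance between two sequences."""
    prev = list(range(len(truth) + 1))
    for i, p in enumerate(pred, start=1):
        curr = [i]
        for j, t in enumerate(truth, start=1):
            cost = 0 if p == t else 1
            curr.append(min(prev[j] + 1,          # delete p from the prediction
                            curr[j - 1] + 1,      # insert t into the prediction
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def wer(pred, truth):
    """Edit distance normalised by the reference length."""
    return edit_distance(pred, truth) / len(truth)


def sacc(preds, truths):
    """Fraction of samples whose prediction equals the reference exactly."""
    exact = sum(1 for p, t in zip(preds, truths) if p == t)
    return exact / len(truths)


pred = ['[', 'E', ']', ']']
truth = ['[', '[', 'S', ']', ']']
print(edit_distance(pred, truth))
print(round(wer(pred, truth), 5))
# Hypothetical: the same end marker appended to both sequences.
print(round(wer(pred + ['<eol>'], truth + ['<eol>']), 5))
```

Under these assumptions the sample above has an edit distance of 2, so the bare token lists give 0.4, while appending one shared end marker to both sequences gives 0.33333. The logged value would be consistent with the script counting such an extra token, but that is a guess, not something confirmed in this thread.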

whywhs commented 4 years ago

Hi, I'm happy to help. First, the sacc and wer of my model were measured with batch_size 6, a max len of 48 and a max Image size of 100000, so please check your parameters. Second, I suggest improving the performance of this model by adding regularization (Dropout, or L1/L2).
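The three test settings mentioned above can be pictured as a filter followed by batching. This is a rough sketch under assumptions: the constant and function names are hypothetical, "max Image size" is taken to mean height times width in pixels, and samples that exceed a limit are assumed to be skipped. The repository's own test script may handle these limits differently.

```python
# Hypothetical names; the values are the ones quoted in the reply above.
BATCH_SIZE = 6
MAX_LEN = 48
MAX_IMAGE_SIZE = 100000  # assumed to mean height * width in pixels


def filter_samples(samples, max_len=MAX_LEN, max_image_size=MAX_IMAGE_SIZE):
    """Keep samples whose label length and image area fit both limits.

    Each sample is a (height, width, label_tokens) tuple.
    """
    kept = []
    for height, width, label in samples:
        if len(label) > max_len:
            continue  # label too long
        if height * width > max_image_size:
            continue  # image too large
        kept.append((height, width, label))
    return kept


def make_batches(samples, batch_size=BATCH_SIZE):
    """Split samples into consecutive batches of at most batch_size items."""
    return [samples[i:i + batch_size]
            for i in range(0, len(samples), batch_size)]


samples = [
    (100, 300, ['x'] * 10),  # kept
    (400, 400, ['x'] * 10),  # dropped: 160000 pixels
    (100, 300, ['x'] * 60),  # dropped: 60 tokens
]
kept = filter_samples(samples)
batches = make_batches(kept)
print(len(kept), len(batches))
```

If a test run uses different limits, a different set of expressions is evaluated, so wer and sacc are not directly comparable with the figures reported for the released model.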