da03 / Attention-OCR

Visual Attention based OCR
MIT License

Question about training procedure #27

Closed · chenmulin closed this issue 7 years ago

chenmulin commented 7 years ago

I trained the model on my own dataset, which contains 10k plate images. When I set the batch_size to 256, the step perplexity reaches 1.001; however, the step perplexity of the trained model increases to 10 if the batch_size is set to 2. If I fix the batch_size at 256, does the model really converge? I worry that such a large batch_size may not be appropriate.
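
For reference, step perplexity in seq2seq training scripts like this one is conventionally exp(per-token cross-entropy loss), so 1.001 corresponds to a near-zero loss while 10 corresponds to a loss of about 2.3. A minimal sketch of the relationship (not the repo's exact code):

```python
import math

# Perplexity is the exponential of the per-token cross-entropy loss.
def perplexity(loss):
    return math.exp(loss)

print(perplexity(0.001))         # ~1.001, the value seen with batch_size=256
print(perplexity(math.log(10)))  # 10.0, i.e. a loss of ~2.30, seen with batch_size=2
```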

chenmulin commented 7 years ago

In addition, I have another question: I found that the losses on the same data differ between the training and testing stages. So confusing! Looking forward to an answer, thanks!

ddaue commented 7 years ago

I solved it with the following modification in model.py, line 352, though without any explanation of why it has to be this way... I searched a lot...

```python
# was: if not forward_only:
if True:
    input_feed[K.learning_phase()] = 1
else:
    input_feed[K.learning_phase()] = 0
```
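
For anyone wondering why this helps: `K.learning_phase()` is the Keras backend placeholder that layers such as Dropout and BatchNormalization read to switch between training and inference behavior, which is exactly what makes the same data produce different losses in the two stages. A minimal demonstration of the mechanism (my own sketch, assuming Keras on the TensorFlow 1.x backend, as this repo uses):

```python
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.layers import Dropout

# Dropout consults K.learning_phase(): active when fed 1, a no-op when fed 0.
x = tf.placeholder(tf.float32, shape=(None, 4))
y = Dropout(0.5)(x)

with tf.Session() as sess:
    data = np.ones((1, 4), dtype=np.float32)
    # Training phase (1): roughly half the units are zeroed, the rest scaled up.
    print(sess.run(y, feed_dict={x: data, K.learning_phase(): 1}))
    # Inference phase (0): the output equals the input unchanged.
    print(sess.run(y, feed_dict={x: data, K.learning_phase(): 0}))
```

Forcing the phase to 1 with `if True:` keeps those layers in training mode during evaluation as well, which would explain why the training and testing losses then agree.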

chenmulin commented 7 years ago

It works! Without your help, I would never have found this one detail! Thanks a lot! Very grateful for your help and patience!


Best Wishes, Mulin Chen
