Bartzi / stn-ocr

Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition
https://arxiv.org/abs/1707.08831
GNU General Public License v3.0

train accuracy on svhn dataset doesn't improve #18

Open zolatoi opened 6 years ago

zolatoi commented 6 years ago

Dear Bartzi,

thank you for the great job you did with STN-OCR. I implemented all the steps you described and launched training with the train_svhn script, but I observe that after 90 epochs the train accuracy does not improve (it stays around 0.26) and the train loss hovers around 2.12. I don't know what is going wrong or how to get better performance. Please find below the command line I used:

python3 train_svhn.py ../datasets/svhn/generated/centered/train.csv ../datasets/svhn/generated/centered/valid.csv --gpus 0,1 --log-dir ./logs --save-model-prefix svhn_train_model -b 100 --lr 1e-5 --zoom 0.5 -ci 500 --char-map ../datasets/svhn/svhn_char_map.json

Best Regards,

Bartzi commented 6 years ago

The command you used looks good to me!

During our experiments we found that it is only possible to obtain a good model if you train it in several stages. First, train the model until it stops improving. Then throw away the recognition part of the model and retrain: initialize the localization part with your already trained weights and the recognition part with random weights. You can do this with our code, too. Unfortunately, we did not add a command-line switch for it yet, but you can uncomment the following line and comment out the next line; the code will then only use the pre-trained weights of the localization network when you restart the training with the model you should have by now.
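
For illustration only (this is not the spn_initializer.py mechanism itself), a rough sketch of the idea in generic MXNet 1.x module code; the checkpoint prefix, the epoch number, and the 'loc_' parameter-name prefix are assumptions and need to be adapted to the actual network and to train_svhn.py:

import mxnet as mx

# load the checkpoint written by the first training stage (prefix/epoch are assumptions)
sym, arg_params, aux_params = mx.model.load_checkpoint('svhn_train_model', 90)

# keep only the localization-network parameters; 'loc_' stands in for whatever
# naming scheme the localization layers actually use
loc_args = {k: v for k, v in arg_params.items() if k.startswith('loc_')}
loc_auxs = {k: v for k, v in aux_params.items() if k.startswith('loc_')}

# train_iter and val_iter are the DataIters you already feed to the training script
mod = mx.mod.Module(symbol=sym, context=[mx.gpu(0), mx.gpu(1)])
mod.fit(
    train_iter,
    eval_data=val_iter,
    arg_params=loc_args,
    aux_params=loc_auxs,
    allow_missing=True,            # recognition weights fall back to the random initializer
    initializer=mx.init.Xavier(),
    num_epoch=100,
)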

Hope it helps ;)

zolatoi commented 6 years ago

Dear Bartzi, thank you for your answer, but it is still confusing to me.

  1. I uncommented line 30 and commented line 31 of spn_initializer.py, but when I launch the same command it creates another training directory and I get the same accuracy problem. Is there a specific README for this?

  2. Another question about the model for the text_recognition datasets: do you have a model params file trained for more than 2 epochs, for example 10 or 40 epochs? It would be helpful for running a test.

Best regards

Bartzi commented 6 years ago
  1. If you issue the command python train_svhn.py -h you can see all possible command-line arguments. One of them is --model-prefix. You can use this argument to point the script to the already trained model; if you do not, training starts from scratch (see the example command after this list). Right now this is not in the README, but I hope I can find some time to make the corresponding changes to the repository to make this clearer.
  2. We only provide a model trained for two epochs, for two reasons:
    1. Training one epoch takes around two days with the dataset we used.
    2. We could not see any notable improvement after two epochs that would justify blocking the GPU even longer. This is why we only provide a model trained for two epochs; note that two epochs already mean more than 250000 iterations.
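
For illustration, resuming the training could look roughly like the following command; the exact value expected by --model-prefix (a checkpoint prefix versus a concrete params file) and the placeholder path are assumptions here, so please check python train_svhn.py -h for the precise semantics:

python3 train_svhn.py ../datasets/svhn/generated/centered/train.csv ../datasets/svhn/generated/centered/valid.csv --gpus 0,1 --log-dir ./logs --save-model-prefix svhn_train_model -b 100 --lr 1e-5 --zoom 0.5 -ci 500 --char-map ../datasets/svhn/svhn_char_map.json --model-prefix <path-to-previous-run>/svhn_train_model
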
dmandreev commented 6 years ago

Hello! Thank you for your great work. I just started training on SVHN and the training accuracy quickly increased from 0.25 to 0.5 within 3 epochs, but the validation accuracy only improved from 0.240754 to 0.256260. Does this mean that I also need to throw away the recognition part and restart training?

2018-01-13 20:33:11,621 Node[0] Start training with [gpu(0), gpu(1), gpu(2), gpu(3)]
2018-01-13 20:37:09,942 Node[0] Epoch[0] Batch [50]     Speed: 191.05 samples/sec       Accuracy=0.252776       Loss=2.146608
2018-01-13 20:40:45,763 Node[0] Epoch[0] Batch [100]    Speed: 192.75 samples/sec       Accuracy=0.288197       Loss=2.048709
2018-01-13 20:42:15,330 Node[0] Epoch[0] Time cost=532.866
2018-01-13 20:42:36,612 Node[0] Epoch[0] Validation-Accuracy=0.240754
2018-01-13 20:42:36,612 Node[0] Epoch[0] Validation-Loss=2.323556
2018-01-13 20:42:36,628 Node[0] Epoch[1] Resetting Data Iterator
2018-01-13 20:46:07,522 Node[0] Epoch[1] Batch [50]     Speed: 201.22 samples/sec       Accuracy=0.351358       Loss=1.865221
2018-01-13 20:49:40,314 Node[0] Epoch[1] Batch [100]    Speed: 195.50 samples/sec       Accuracy=0.440517       Loss=1.624922
2018-01-13 20:51:05,312 Node[0] Epoch[1] Resetting Data Iterator
2018-01-13 20:51:09,484 Node[0] Epoch[1] Time cost=512.856
2018-01-13 20:51:28,030 Node[0] Epoch[1] Validation-Accuracy=0.241386
2018-01-13 20:51:28,030 Node[0] Epoch[1] Validation-Loss=2.376920
2018-01-13 20:55:00,555 Node[0] Epoch[2] Batch [50]     Speed: 199.63 samples/sec       Accuracy=0.514808       Loss=1.432405
2018-01-13 20:58:33,636 Node[0] Epoch[2] Batch [100]    Speed: 195.23 samples/sec       Accuracy=0.582115       Loss=1.241409
2018-01-13 20:59:54,390 Node[0] Epoch[2] Resetting Data Iterator
2018-01-13 21:00:02,718 Node[0] Epoch[2] Time cost=514.688
2018-01-13 21:00:21,812 Node[0] Epoch[2] Validation-Accuracy=0.256260
2018-01-13 21:00:21,812 Node[0] Epoch[2] Validation-Loss=2.201099
2018-01-13 21:03:54,455 Node[0] Epoch[3] Batch [50]     Speed: 199.50 samples/sec       Accuracy=0.440337       Loss=1.651764
Bartzi commented 6 years ago

I don't really think that this is the same problem... Did you stop the training after two epochs? Have you had a look at the training progress on a validation sample, using the BBoxPlotter? Do the BBoxes look reasonable?

gydlcc commented 6 years ago

Hi there, for the SVHN part, how do you adapt the paths of all images to the paths on your machine?

Bartzi commented 6 years ago

You could, for instance, do the following:

  1. Use a text editor like Sublime Text: select the part of the path you want to change in the first line, then press Alt + F3 (select all occurrences) and change it to the path you want.
  2. Write a Python script that reads the file and changes every path (see the sketch after this list).
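
A minimal sketch of such a script, assuming a tab-separated ground-truth file with the image path in the first column; the file names, delimiter, column index, and path prefixes are assumptions and have to be adapted to your dataset:

import csv

old_prefix = '/path/used/when/the/csv/was/generated'   # assumption
new_prefix = '/path/to/svhn/on/your/machine'           # assumption

with open('train.csv') as in_file, open('train_fixed.csv', 'w', newline='') as out_file:
    reader = csv.reader(in_file, delimiter='\t')
    writer = csv.writer(out_file, delimiter='\t')
    for row in reader:
        # rewrite only the image path in the first column
        row[0] = row[0].replace(old_prefix, new_prefix, 1)
        writer.writerow(row)
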
rohanbanerjee commented 5 years ago

@zolatoi @dmandreev @Bartzi Hey! I was trying out stn-ocr for a very important problem of mine. The paper and the implementation look very promising, but I am not able to get past step 4 of the README. I'm getting an error that the ctc.h file cannot be found. I tried copying ctc.h into mxnet/metrics/ctc and changing the line to #include "ctc.h", but new issues have come up and I cannot proceed from here. I'm in dire need of help; any kind of help would be much appreciated. (Screenshot attached: Screenshot 2019-06-19 at 4.24.31 PM)

Bartzi commented 5 years ago

Did you check my answers on issue #28? I think we should continue this discussion there.