bgshih / crnn

Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.
MIT License

Add STN to crnn #38

Open hn18001 opened 7 years ago

hn18001 commented 7 years ago

@bgshih CRNN is a great project, thanks for open-sourcing it. Have you ever added an STN to CRNN? I tried adding the STN layer (https://github.com/qassemoquab/stnbhwd) to CRNN, but the training loss stays very large. I have already initialized the transform matrix to the identity matrix, but the STN layer appears to learn nothing during training. Should I try SGD or another optimization method instead of Adadelta?

rremani commented 7 years ago

That's a nice idea, @hn18001. I'll try adding it too. Have you gone through http://torch.ch/blog/2015/09/07/spatial_transformers.html? They added an STN layer for recognizing traffic signs.

bgshih commented 7 years ago

@hn18001 @rremani We have another paper that does this, but on an attention-based generator rather than CRNN. I believe the same idea would work on CRNN.

In our experience, adding an STN makes the network much harder to train. Also, identity initialization doesn't work for us. We used a slightly perturbed initialization (see the paper) to encourage the STN to be optimized.
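
For readers following along, here is a minimal sketch of what a slightly perturbed initialization could look like for an affine STN such as the one in stnbhwd. It is not the scheme from the paper, which is not reproduced in this thread. The function names, the zero-weight final layer, and the noise scale of 0.01 are all assumptions made for illustration, and the sketch uses plain Python rather than Torch so that it stays self-contained.

```python
import random

# Row-major 2x3 affine matrix that leaves the input image unchanged.
IDENTITY_AFFINE = [1.0, 0.0, 0.0,
                   0.0, 1.0, 0.0]


def perturbed_affine_bias(scale=0.01, seed=None):
    """Return identity affine parameters with small uniform noise added.

    `scale` is a hypothetical noise magnitude, not a value from the paper.
    """
    rng = random.Random(seed)
    return [p + rng.uniform(-scale, scale) for p in IDENTITY_AFFINE]


def init_localization_head(num_features, scale=0.01, seed=None):
    """Initialize the last layer of a localization network (assumed design).

    Weights start at zero, so the initial output equals the bias
    regardless of the input; the bias sits close to, but not exactly
    at, the identity transform.
    """
    weights = [[0.0] * num_features for _ in range(6)]
    bias = perturbed_affine_bias(scale, seed)
    return weights, bias


def predict_theta(weights, bias, features):
    """Linear layer: theta = W * features + b, giving 6 affine parameters."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


if __name__ == "__main__":
    weights, bias = init_localization_head(num_features=8, seed=0)
    theta = predict_theta(weights, bias, [0.5] * 8)
    print(theta)  # six values close to, but not equal to, the identity
```

In a Torch setup, the same idea would amount to zeroing the weights of the final localization layer and copying a perturbed identity vector into its bias before training starts.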