tmbdev / clstm

A small C++ implementation of LSTM networks, focused on OCR.
Apache License 2.0
821 stars 224 forks

2D LSTM ? #85

seragENTp closed 8 years ago

seragENTp commented 8 years ago

What is the architecture of the 2D LSTM implemented in the library? Is there a reference for it?

tmbdev commented 8 years ago

It's a bidirectional LSTM running over the rows of the image, then over the columns. The input is usually the output from a convolutional layer.

There is a bit more info and references in this paper:

http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Byeon_Scene_Labeling_With_2015_CVPR_paper.html
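To make the "rows, then columns" structure concrete, here is a minimal sketch of that idea. It is not the clstm code: every name in it (`TinyLSTM`, `bidirectional`, `scan_rows`, `separable_2d`) is hypothetical, the LSTM is a plain cell without peepholes, and all weights are set to one constant instead of being learned. It only shows how two bidirectional 1D passes give every output pixel access to the whole image.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// All names below are hypothetical and for illustration; none come from clstm.
using Vec = std::vector<double>;
using Seq = std::vector<Vec>;    // one feature vector per position
using Image = std::vector<Seq>;  // img[row][col] is a feature vector

inline double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

inline double dot(const Vec &a, const Vec &b) {
  assert(a.size() == b.size());
  double s = 0.0;
  for (std::size_t k = 0; k < a.size(); k++) s += a[k] * b[k];
  return s;
}

// A plain 1D LSTM (no peepholes). Every weight is set to the same constant w
// so the sketch stays short; a real network would learn these values.
struct TinyLSTM {
  std::size_t ni, nh;
  std::vector<Vec> Wi, Wf, Wo, Wg;  // one row per hidden unit over [x, h, 1]

  TinyLSTM(std::size_t ni_, std::size_t nh_, double w)
      : ni(ni_), nh(nh_),
        Wi(nh_, Vec(ni_ + nh_ + 1, w)), Wf(Wi), Wo(Wi), Wg(Wi) {}

  // Scan the sequence front to back, or back to front if reverse is true.
  // The output at index t is always aligned with the input at index t.
  Seq run(const Seq &xs, bool reverse) const {
    Seq out(xs.size(), Vec(nh, 0.0));
    Vec h(nh, 0.0), c(nh, 0.0);
    for (std::size_t k = 0; k < xs.size(); k++) {
      std::size_t t = reverse ? xs.size() - 1 - k : k;
      assert(xs[t].size() == ni);
      Vec z = xs[t];  // z = [x_t, h_prev, 1]
      z.insert(z.end(), h.begin(), h.end());
      z.push_back(1.0);
      for (std::size_t j = 0; j < nh; j++) {
        double i = sigmoid(dot(Wi[j], z));    // input gate
        double f = sigmoid(dot(Wf[j], z));    // the single forget gate
        double o = sigmoid(dot(Wo[j], z));    // output gate
        double g = std::tanh(dot(Wg[j], z));  // candidate cell input
        c[j] = f * c[j] + i * g;
        out[t][j] = o * std::tanh(c[j]);
      }
      h = out[t];
    }
    return out;
  }
};

// Bidirectional pass: concatenate forward and backward outputs per position.
Seq bidirectional(const TinyLSTM &fwd, const TinyLSTM &bwd, const Seq &xs) {
  Seq a = fwd.run(xs, false), b = bwd.run(xs, true);
  for (std::size_t t = 0; t < a.size(); t++)
    a[t].insert(a[t].end(), b[t].begin(), b[t].end());
  return a;
}

Image transpose(const Image &img) {
  if (img.empty()) return {};
  Image out(img[0].size(), Seq(img.size()));
  for (std::size_t r = 0; r < img.size(); r++) {
    assert(img[r].size() == img[0].size());
    for (std::size_t c = 0; c < img[r].size(); c++) out[c][r] = img[r][c];
  }
  return out;
}

Image scan_rows(const Image &img, const TinyLSTM &fwd, const TinyLSTM &bwd) {
  Image out;
  for (const Seq &row : img) out.push_back(bidirectional(fwd, bwd, row));
  return out;
}

// Rows first, then columns (columns are scanned as rows of the transpose).
Image separable_2d(const Image &img, const TinyLSTM &hf, const TinyLSTM &hb,
                   const TinyLSTM &vf, const TinyLSTM &vb) {
  Image rows = scan_rows(img, hf, hb);              // depth hf.nh + hb.nh
  Image cols = scan_rows(transpose(rows), vf, vb);  // depth vf.nh + vb.nh
  return transpose(cols);
}
```

For a 3x4 image with one feature per pixel, constructing `TinyLSTM hf(1, 2, 0.5), hb(1, 2, 0.5), vf(4, 3, 0.5), vb(4, 3, 0.5);` and calling `separable_2d(img, hf, hb, vf, vb)` yields a 3x4 image of depth 6. The column LSTMs take 4 inputs because the row pass concatenates two hidden vectors of size 2.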

amitdo commented 8 years ago

Tom, I remember that you wrote some time ago that 2D LSTM is not better than 1D LSTM for OCR of printed text. Is that still true, and does it hold for all scripts?

tmbdev commented 8 years ago

Getting good performance out of 1D LSTM requires a good normalizer. The Ocropus normalizer works surprisingly well for some non-Latin scripts, but we really need more benchmarks to see how far that carries over.

The normalizer is a fairly tricky piece of code, so it would be nice to be able to dispense with it. I'll be experimenting with that once the basic GPU implementation is done.

ASDen commented 8 years ago

@tmbdev What you are describing, and what is implemented in CLSTM, is essentially a ReNet-style [1] LSTM. What is in the paper [2], however, is the 2D case of the classical MDLSTM [3] (by Graves et al.), which has two forget gates.


[1] https://arxiv.org/abs/1505.00393
[2] http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Byeon_Scene_Labeling_With_2015_CVPR_paper.html
[3] https://arxiv.org/abs/0705.2011
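
The distinction can be written compactly. The notation below is an assumption chosen for illustration and is not copied from any of the three papers: $c$ is the cell state, $i$ the input gate, $g$ the candidate cell input, $f$ a forget gate, and $\odot$ the elementwise product. Each 1D pass of a ReNet-style model has one predecessor and one forget gate (first line), while a 2D MDLSTM cell at position $(r, s)$ has two predecessors and a separate forget gate for each (second line):

```latex
\begin{aligned}
c_t     &= f_t \odot c_{t-1} + i_t \odot g_t \\
c_{r,s} &= f^{(1)}_{r,s} \odot c_{r-1,s} + f^{(2)}_{r,s} \odot c_{r,s-1} + i_{r,s} \odot g_{r,s}
\end{aligned}
```

In the 2D case the gates at $(r, s)$ depend on both neighbouring hidden states, $h_{r-1,s}$ and $h_{r,s-1}$, so the recurrence cannot be split into independent row and column scans.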

tmbdev commented 8 years ago

Correct, ReNet implements the same model we do, and both models are different from the original 2D LSTM. We published some additional papers in 2014, and I gave some tutorials on these kinds of multidimensional LSTMs in 2013.

morusu commented 7 years ago

Hi, where is the 2D LSTM? I cannot find the implementation. Did I miss something?

amitdo commented 7 years ago

https://github.com/tmbdev/clstm/blob/master/test-2d.cc

amitdo commented 7 years ago

... and https://github.com/tmbdev/clstm/blob/509144def09c/clstm_prefab.cc#L130