Open lunactic opened 7 years ago
I am not aware that anyone is working on this at the moment. But I think Tom outlined how this could work again in https://github.com/tmbdev/ocropy/pull/25#issuecomment-72075445 . Is this the same issue?
The problems you face when you only search in Issues and not in PRs :-D
But yes, basically it is the same problem.
If you want to use a language model together with the recognition output, you need the recognition lattices, so that the alternatives proposed by the recognizer can be combined with the probabilities from the language model.
It seems this would require quite some work, though, and I don't know how high up on your todo list something like this would be.
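To illustrate what combining the recognizer with a language model means here, below is a minimal sketch. It is not the format or the algorithm of the old `ocropus-lattices` tool. It assumes a simplified lattice with one set of candidate characters per position, each with a recognizer log-probability, and a toy bigram language model. The names `rescore`, `greedy`, `lm_weight` and `unseen`, and all the probabilities, are made up for the example.

```python
import math

def greedy(lattice):
    """Pick the recognizer's top candidate at each position (no language model)."""
    return "".join(max(column, key=column.get) for column in lattice)

def rescore(lattice, bigram_logp, lm_weight=1.0, unseen=-10.0):
    """Viterbi search over a per-position candidate lattice.

    lattice:     list of dicts, candidate character -> recognizer log-probability
    bigram_logp: dict, (previous char, char) -> language-model log-probability
    lm_weight:   hypothetical weight of the language model relative to the recognizer
    unseen:      hypothetical log-probability for bigrams missing from the model
    """
    # best[c] = (score, text) of the best path that ends in candidate c
    best = {c: (lp, c) for c, lp in lattice[0].items()}
    for column in lattice[1:]:
        new_best = {}
        for c, rec_lp in column.items():
            new_best[c] = max(
                (score + rec_lp + lm_weight * bigram_logp.get((p, c), unseen), text + c)
                for p, (score, text) in best.items()
            )
        best = new_best
    return max(best.values())[1]

# Toy data (assumed): the recognizer slightly prefers "b" over "h" in the middle.
lattice = [
    {"t": math.log(0.9), "f": math.log(0.1)},
    {"b": math.log(0.55), "h": math.log(0.45)},
    {"e": math.log(0.95), "c": math.log(0.05)},
]
bigram_logp = {
    ("t", "h"): math.log(0.3),
    ("h", "e"): math.log(0.4),
    ("t", "b"): math.log(0.001),
    ("b", "e"): math.log(0.1),
}

print(greedy(lattice))                # recognizer alone
print(rescore(lattice, bigram_logp))  # recognizer combined with the language model
```

With only the single best output string, the alternative "h" would already be lost, which is why the lattices are needed. A real lattice would also have to carry segmentation alternatives, which this sketch leaves out.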
The old code is still there: https://github.com/tmbdev/ocropy/blob/master/OLD/ocropus-lattices , but I have no idea what would be needed to adapt it, or whether one should rewrite it from scratch. It sounds nice to have such a lattice recognizer in combination with language modeling, but this is, I am afraid, not on my todo list. I don't understand enough about these neural networks to even dare to start something here ;-) Maybe this could be suitable for some student work?
Is there a plan to bring back some form of implementation of the ocropus-lattices tool?
It would be great to have the possibility to extract the recognition lattices and combine them with an additional language model to possibly improve recognition results.