githubharald / SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow.
https://towardsdatascience.com/2326a3487cd5
MIT License
1.96k stars 885 forks

LSTM layers line context vs word context? #145

sophiegonzalez3 closed 2 years ago

sophiegonzalez3 commented 2 years ago

Hello and thank you for providing all this work 😊

This is a somewhat theoretical question about reading lines. From my understanding, your model's LSTM layers take the whole line as input (up to 100 characters), so they learn context at the line level without ever learning the concept of a word. Is that right?

Do you think that feeding the word model's output into a second LSTM model, which reconstructs a “meaningful” line from (potentially defective) predicted words, would yield better results when trained on a specific domain (for instance, medical journals)?

githubharald commented 2 years ago

The only thing that differs between line and word mode is that line mode learns to predict whitespaces between words.
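
To make that concrete, here is a minimal sketch of greedy CTC decoding. It is not code from the repository. The character sets, the index layout, and the `ctc_greedy_decode` helper are all hypothetical, and the blank label is assumed to be the last class. The point it illustrates is that the space is just one more character class, so the same decoder handles both modes and only the character set changes.

```python
from itertools import groupby

# Hypothetical character sets: line mode only adds the space character.
WORD_CHARSET = "abcdefghijklmnopqrstuvwxyz"
LINE_CHARSET = " " + WORD_CHARSET


def ctc_greedy_decode(label_ids, charset):
    """Collapse repeated labels, then drop CTC blanks.

    Assumption: the blank label is the index right after the last
    character, i.e. len(charset).
    """
    blank = len(charset)
    collapsed = [label for label, _ in groupby(label_ids)]
    return "".join(charset[i] for i in collapsed if i != blank)


def to_ids(text, charset):
    """Map a string to label indices in the given character set."""
    return [charset.index(ch) for ch in text]


# Hypothetical per-time-step argmax output of a line-mode model for "hi yo":
# repeated labels and blanks are what CTC best-path output typically looks like.
blank = len(LINE_CHARSET)
h, i, space, y, o = to_ids("hi yo", LINE_CHARSET)
line_path = [h, h, blank, i, space, space, blank, y, o, o]

decoded_line = ctc_greedy_decode(line_path, LINE_CHARSET)
print(repr(decoded_line))
```

Nothing in the decoder treats the space specially. In this sketch, a word-mode model simply has no space class to predict, so it cannot emit word boundaries.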