FangShancheng / ABINet

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

Detection of upper case letters and punctuation #70

Open sjscotti opened 2 years ago

sjscotti commented 2 years ago

Hi! I’m very impressed by ABINet’s ability to recognize words. I am starting to use it to identify words in old newspapers when Tesseract cannot recognize a word accurately and returns results with low confidence values. However, I’ve noticed that ABINet ignores capitalization and punctuation. Is there a setting that would include them in the recognized text? Thanks!

FangShancheng commented 2 years ago

You should change the dictionaries (i.e., use [charset_62.txt](https://github.com/FangShancheng/ABINet/blob/main/data/charset_62.txt) rather than charset_36.txt) and retrain the models.
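
To illustrate why the charset matters, here is a minimal sketch, not ABINet's actual code. It assumes the 36-character set is lowercase letters plus digits and the 62-character set adds uppercase letters; the real files in the repo's `data/` directory may be laid out differently. `representable` is a hypothetical helper that shows which characters a model restricted to a given charset could emit at all.

```python
import string

# Assumed contents of the two charsets (hypothetical stand-ins for the
# charset_36.txt and charset_62.txt files in the repository).
CHARSET_36 = string.ascii_lowercase + string.digits
CHARSET_62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def representable(text, charset, case_sensitive):
    """Return `text` reduced to what a model limited to `charset` could output."""
    if not case_sensitive:
        # A case-insensitive setup folds everything to lowercase first.
        text = text.lower()
    # Characters outside the charset have no output class, so they are dropped.
    return "".join(ch for ch in text if ch in charset)


print(representable("Hello, World!", CHARSET_36, case_sensitive=False))  # helloworld
print(representable("Hello, World!", CHARSET_62, case_sensitive=True))   # HelloWorld
```

Under these assumptions the 62-character set restores capitalization but still has no punctuation, so recognizing punctuation would presumably need a larger custom charset and, as noted above, retraining.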