Closed rkcosmos closed 4 years ago
I didn't know EasyOCR used my project. I am very glad that you got some use out of it!
As for the dictionaries, that's great! Thank you!
@rkcosmos Can you please explain whether you made any changes to the trdg code to get it working properly for all non-Latin-based languages?
Also, which fonts did you use for each language (if any)?
It'd be really helpful if you could briefly mention that 🙂
@GokulNC you can find fonts for pretty much all languages on Google Fonts: https://fonts.google.com/
Is there a particular language that you are having trouble with?
@Belval Yes. For example, I am trying to generate data for Hindi (Devanagari script).
This is the text:
देवतात्मों नति जोहारो पोहना मालिंके
And this is the output:
My environment:
Indic fonts installed (sudo apt install fonts-indic)
RAQM installed (after installing libraqm, I had to rebuild Pillow)
Any pointers on how to fix would be great!
Edit:
I checked the source code, and it seems I had to enable the --word_split flag. It worked after that. 👍
Please mention in the README that we have to enable that for Abugida scripts (like Indic languages).
Thanks.
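For anyone wondering why splitting matters here, this is a minimal sketch using only the Python standard library. It is not trdg's actual code; the function names are hypothetical and the sample is just two Devanagari words. It shows that cutting text into single code points leaves vowel signs detached from their consonants, while splitting on whitespace keeps each word whole for the text shaper.

```python
import unicodedata

# Two Devanagari words ("नति जोहारो"), written with escapes so the
# snippet survives copy-paste through tools that mangle encodings.
SAMPLE = "\u0928\u0924\u093f \u091c\u094b\u0939\u093e\u0930\u094b"


def split_chars(text):
    """Naive per-code-point split (hypothetical helper)."""
    return list(text)


def split_words(text):
    """Whitespace split that keeps every syllable cluster intact."""
    return text.split()


def is_dependent_mark(ch):
    """True for combining marks (Mn/Mc) that need a base character."""
    return unicodedata.category(ch) in ("Mn", "Mc")


pieces = split_chars(SAMPLE)
orphans = [p for p in pieces if is_dependent_mark(p)]

print(len(pieces), "pieces,", len(orphans), "of them are bare vowel signs")
print(split_words(SAMPLE))
```

Drawing each of those pieces separately gives the renderer vowel signs with nothing to attach to, which would be consistent with the broken output described above.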
Hi @GokulNC
This does not solve the complete problem. There is an issue with the Hindi matra (मात्रा): it gets displaced in many words. See these examples:
Label: सिलिएट
Label: खरिदवाएँगीं
Compare the label with what is rendered in the image. I have tried different Devanagari fonts, but the issue still persists.
But when I try the same word with the same font on Google, it works fine (see the link).
Any idea @Belval @GokulNC why this is happening? Thanks
Update: libraqm solves this issue. 🙂
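A quick way to confirm that a Pillow build actually picked up libraqm is sketched below. It assumes a reasonably recent Pillow, where PIL.features.check accepts "raqm"; the helper name raqm_available is made up for illustration.

```python
def raqm_available():
    """Return True/False for RAQM support, or None if Pillow is missing."""
    try:
        from PIL import features
    except ImportError:
        return None
    try:
        return bool(features.check("raqm"))
    except ValueError:
        # Older Pillow releases may not recognise the "raqm" feature name.
        return False


status = raqm_available()
if status is None:
    print("Pillow is not installed")
elif status:
    print("Pillow was built with RAQM; complex text layout is available")
else:
    print("Pillow lacks RAQM; install libraqm and rebuild Pillow")
```

If this reports that RAQM is missing, displaced matras are likely to show up regardless of which Devanagari font is used.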
Hello,
First of all, thanks a lot for creating this text generator. We used it for the EasyOCR project. I think it's time we gave a small contribution back to your project. This PR contains dictionaries for 50+ languages. It's not only my effort but rather the work of a community that wants an OCR system that works for their languages.
Thanks a lot, Rakpong