Belval / TextRecognitionDataGenerator

A synthetic data generator for text recognition
MIT License
3.24k stars, 966 forks

Add support for diacritical marks and fix stroke being cropped #341

Open samx81 opened 4 months ago

samx81 commented 4 months ago

Before:

Text: 佇咧麵擔仔的kha̋ng-páng頂懸

[image: wrong_dia]

The stroke is wrongly and unattractively cut off under "麵擔仔的": [image: wrong_stroke]

After:

[image: correct_dia]

samx81 commented 4 months ago

I realize that the word_split option can also solve the diacritical mark problem; however, the stroke cropping problem is still fixed by this PR.
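For readers wondering why diacritics break when text is drawn one character at a time: a letter such as a̋ can be stored as a base letter followed by a separate combining code point, so splitting the string by code point detaches the mark from its letter. Below is a minimal sketch of grouping marks with their base letters before rendering. The helper name `group_combining_marks` is hypothetical and this is not necessarily how this PR or trdg implements it. It also assumes the input uses combining marks in decomposed form and does not cover full Unicode grapheme cluster rules.

```python
import unicodedata


def group_combining_marks(text):
    """Split text into units, keeping combining marks attached to their base letter.

    Hypothetical helper for illustration only, not part of trdg.
    """
    units = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks.
        if units and unicodedata.combining(ch) != 0:
            units[-1] += ch  # attach the mark to the preceding base character
        else:
            units.append(ch)
    return units


# "kha̋ng-páng" written with explicit combining marks:
# U+030B (double acute) and U+0301 (acute).
sample = "kha\u030bng-pa\u0301ng"

naive_units = list(sample)                     # one unit per code point
grouped_units = group_combining_marks(sample)  # marks stay with their letters
```

Drawing each element of `grouped_units` as one unit keeps the mark positioned over its letter, whereas `naive_units` would hand the renderer a bare combining mark as its own "character".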