Evezerest / PPOCRLabel

PPOCRLabel is a semi-automatic graphic annotation tool for the OCR field, with a built-in PP-OCR model that automatically detects and re-recognizes data. It is written in Python 3 and PyQt5 and supports rectangular box annotation and four-point annotation modes. Annotations can be used directly for training PP-OCR detection and recognition models.

Annotate with another language #41

Closed (congphu2511995 closed this issue 2 years ago)

congphu2511995 commented 2 years ago

Can I use another language for text annotation (for example, Vietnamese)? Thank you for this useful tool.

Evezerest commented 2 years ago

Yes. PPOCRLabel has a built-in OCR engine, paddleocr, which already supports Vietnamese recognition. You can set the lang parameter to vi here; more languages and abbreviations supported by paddleocr can be found here.
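For illustration, a minimal sketch of what setting the lang parameter to vi could look like when calling paddleocr directly. It assumes the paddleocr package is installed; the helper names `build_ocr_kwargs` and `recognize` are hypothetical and not part of PPOCRLabel, and the exact place where PPOCRLabel constructs its engine may differ between versions.

```python
def build_ocr_kwargs(lang: str = "vi") -> dict:
    """Collect constructor arguments for paddleocr.PaddleOCR.

    Hypothetical helper: only the `lang` abbreviation is set here
    ("vi" selects the Vietnamese recognition model).
    """
    return {"lang": lang}


def recognize(image_path: str, lang: str = "vi"):
    """Run the paddleocr engine on one image with the chosen language.

    The import is done lazily so this module can be loaded even when
    paddleocr is not installed. The shape of the returned result
    depends on the installed paddleocr version.
    """
    from paddleocr import PaddleOCR  # requires `pip install paddleocr`

    engine = PaddleOCR(**build_ocr_kwargs(lang))
    return engine.ocr(image_path)
```

Calling `recognize("sample.jpg")` (the file name is a placeholder) would download the Vietnamese model on first use and return the detected boxes with their recognized text.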

However, we didn't use much data when training the multi-language models. If you are interested in improving them, please let me know; we are happy to give you some advice.