PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Apache License 2.0
40.1k stars 7.42k forks

The Korean language model fails to recognize '.' because it is missing. #13147

Open GreatV opened 1 week ago

GreatV commented 1 week ago

Discussed in https://github.com/PaddlePaddle/PaddleOCR/discussions/13144

Originally posted by **junhwan-lim** June 21, 2024

The Korean language model fails to recognize '.' because the character is missing from its dictionary.

uyeongjae commented 5 days ago

The characters ',', '(' and ')' are also missing.

UserWangZz commented 2 days ago

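Hello, could you submit a PR that adds the missing characters to the dictionary? If you need an updated model, could you fine-tune the existing model on the Xinghe community (星河社区) and share it with us?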
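
For anyone preparing such a PR, here is a minimal sketch of the dictionary update, assuming the dictionary is a UTF-8 text file with one character per line. The helper name `append_missing_chars` and the file name in the usage note are hypothetical and not part of PaddleOCR. New characters are appended at the end so that existing entries keep their line positions.

```python
from pathlib import Path


def append_missing_chars(dict_path, required_chars):
    """Append characters that are absent from a one-character-per-line dictionary.

    Hypothetical helper (not a PaddleOCR API). Returns the characters that
    were added, in the order they were requested.
    """
    path = Path(dict_path)
    existing = path.read_text(encoding="utf-8").splitlines()
    seen = set(existing)

    missing = []
    for ch in required_chars:
        if ch not in seen:
            missing.append(ch)
            seen.add(ch)

    if missing:
        # Append at the end so existing entries keep their line positions.
        path.write_text("\n".join(existing + missing) + "\n", encoding="utf-8")
    return missing
```

Calling `append_missing_chars("korean_dict_extended.txt", ".,()")` on a copy of the dictionary would add only the characters that are not already present. Even with the old entries left in place, the number of output classes changes, so the recognition model still needs fine-tuning before it can predict the new characters.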

uyeongjae commented 2 days ago

I think that modifying only the dictionary, as this PR does, will affect the existing model. How do you recommend training the model to recognize a small number of missing characters?

UserWangZz commented 23 hours ago

Fine-tune the existing model with the new dictionary.
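
A rough sketch of what that fine-tuning run could look like, assuming the usual PaddleOCR layout where `tools/train.py` takes a config with `-c` and overrides with `-o`, and where `Global.pretrained_model` and `Global.character_dict_path` are the relevant config keys. The helper `build_finetune_command` and all three paths are hypothetical placeholders; substitute the Korean recognition config, the downloaded weights, and the extended dictionary you actually use.

```python
import shlex


def build_finetune_command(config_path, pretrained_model, dict_path):
    """Build a PaddleOCR training command that starts from existing weights.

    Hypothetical helper: it only assembles the argument list and does not
    run anything.
    """
    overrides = [
        f"Global.pretrained_model={pretrained_model}",
        f"Global.character_dict_path={dict_path}",
    ]
    return ["python3", "tools/train.py", "-c", config_path, "-o", *overrides]


# Placeholder paths, assumed for illustration only.
cmd = build_finetune_command(
    "path/to/korean_rec_config.yml",
    "path/to/korean_pretrained/best_accuracy",
    "path/to/korean_dict_extended.txt",
)
print(shlex.join(cmd))
```

The training data for such a run should contain samples with the newly added characters; otherwise the model has nothing to learn them from.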
