PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0
44.05k stars 7.81k forks

Where can these minority-language models be downloaded? #4415

Closed Dandelion111 closed 5 months ago

Dandelion111 commented 3 years ago
littletomatodonkey commented 3 years ago

https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.3/doc/doc_ch/models_list.md#%E5%A4%9A%E8%AF%AD%E8%A8%80%E8%AF%86%E5%88%AB%E6%A8%A1%E5%9E%8B

Dandelion111 commented 3 years ago

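Hello, Spanish is not in that list. Does that mean no pretrained Spanish model is provided? What does "supports 80 languages" mean? Were models trained for 80 separate languages?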

tink2123 commented 3 years ago

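The multilingual models are trained per language family: languages that use the same alphabet share one dictionary. "Supports 80 languages" means the models can run prediction on 80 languages. We recommend testing with the whl package: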

pip install "paddleocr>=2.0.6"
paddleocr --image_dir /your/img/path --lang=es

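If you want to train the model yourself or run prediction from source: Spanish belongs to the Latin family, and the corresponding files are:
Config file: rec_latin_lite_train.yml
Training model: https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/latin_ppocr_mobile_v2.0_rec_train.tar
Inference model: https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/latin_ppocr_mobile_v2.0_rec_infer.tar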
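To make the "one model per family, shared dictionary" idea concrete, here is a minimal Python sketch. It is not PaddleOCR's internal code: the `SCRIPT_FAMILY` table is a small hand-written assumption covering only a few language codes, `resolve_model` is a hypothetical helper, and only the two Latin model URLs come from this thread. For families whose URLs are not quoted here, it returns `None` rather than guessing.

```python
# Illustrative, partial grouping of language codes by writing system.
# This table is an assumption for the example, not PaddleOCR's own list.
SCRIPT_FAMILY = {
    "es": "latin", "it": "latin", "pt": "latin",
    "ru": "cyrillic", "uk": "cyrillic", "bg": "cyrillic",
    "ar": "arabic", "fa": "arabic",
    "hi": "devanagari", "mr": "devanagari",
}

# Model archives quoted in this thread. Only the Latin family is known here.
_BASE = "https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/"
MODEL_URLS = {
    "latin": {
        "train": _BASE + "latin_ppocr_mobile_v2.0_rec_train.tar",
        "infer": _BASE + "latin_ppocr_mobile_v2.0_rec_infer.tar",
    },
}


def resolve_model(lang, kind="infer"):
    """Return (family, url) for a language code.

    url is None when this sketch has no known archive for the family.
    Raises KeyError for language codes missing from SCRIPT_FAMILY.
    """
    if kind not in ("train", "infer"):
        raise ValueError("kind must be 'train' or 'infer'")
    family = SCRIPT_FAMILY[lang]
    urls = MODEL_URLS.get(family)
    return family, (urls[kind] if urls else None)


if __name__ == "__main__":
    # Spanish, Italian and Portuguese all resolve to the same Latin model.
    for code in ("es", "it", "pt"):
        print(code, resolve_model(code))
```

Under these assumptions, Spanish has no archive named "spanish" because it is served by the shared Latin recognition model, which is why it does not appear as its own entry in the model list.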

Dandelion111 commented 3 years ago

OK, thank you very much!