Training Set For This Model

raoyutian / PaddleOCRSharp

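PaddleOCRSharp is a .NET OCR library built by modifying and wrapping the C++ code of Baidu PaddlePaddle's PaddleOCR. It provides text recognition, text detection, and table recognition. The project is optimized for cases where small images are recognized inaccurately, and its recognition accuracy is improved over the original PaddlePaddle code. It includes an ultra-lightweight Chinese OCR whose models total only 8.6M; a single model supports recognition of mixed Chinese, English, and digits, vertical text, and long text. Multiple kinds of text detection are also supported.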

Apache License 2.0

617 stars 100 forks

Training Set For This Model #27

Closed leviethung2103 closed 1 year ago

leviethung2103 commented 1 year ago

Hello,

I am very interested in this project. Compared with the original PaddleOCR, your model seems to perform very well when detecting small objects.

Could you please share with me what you've changed in the model architecture and the training dataset you've used to train the model?

I am just curious about the techniques. Thank you in advance.

raoyutian commented 1 year ago

Please describe this in Chinese, thank you.

leviethung2103 commented 1 year ago

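Hello,

I am very interested in this project. Compared with the original PaddleOCR, I noticed that your model performs very well when detecting small objects.

Could you share what improvements you made to the model architecture and the training dataset in order to train this model?

I am curious about the techniques involved. Thank you very much in advance for your reply.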

raoyutian commented 1 year ago

My network connection is poor, and GitHub often fails to open. May I ask which country you are from? Which model are you using?