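Text recognition with PaddleOCR (ppocr) using the SVTR config file on my own dataset of ancient books: the character dictionary covers every character in the data, but after long training acc=0.04 and inference recognizes only a few text lines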

PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

https://paddlepaddle.github.io/PaddleOCR/

Apache License 2.0

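Text recognition with PaddleOCR (ppocr) using the SVTR config file on my own dataset of ancient books: the character dictionary covers every character in the data, but after long training acc=0.04 and inference recognizes only a few text lines #9082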

Status: Closed (ic1031 closed this issue 1 year ago)

tink2123 commented 1 year ago

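SVTR limits the maximum length of training samples. Check whether your training samples are too long: if they are, most of them may not be trained on effectively, or they may be compressed so heavily that feature extraction suffers. We recommend keeping each single-line sample to no more than 20 characters.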
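One way to act on this advice is to scan the recognition label file and list the samples whose text exceeds the suggested length. The sketch below is a minimal illustration, not part of PaddleOCR: it assumes each label line has the form `image_path<TAB>text`, the function name `find_long_samples` and the sample paths are hypothetical, and the 20-character threshold is taken from the recommendation above.

```python
from typing import Iterable, List, Tuple


def find_long_samples(
    label_lines: Iterable[str], max_chars: int = 20
) -> List[Tuple[str, int]]:
    """Return (image_path, text_length) for samples longer than max_chars.

    Assumes each line looks like "image_path<TAB>text"; lines that are
    empty or have no tab separator are skipped.
    """
    long_samples = []
    for line in label_lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        path, sep, text = line.partition("\t")
        if not sep:
            continue  # skip malformed lines without a tab
        if len(text) > max_chars:
            long_samples.append((path, len(text)))
    return long_samples


# Hypothetical in-memory label lines standing in for a real label file.
sample_lines = [
    "img/0001.jpg\t子曰學而時習之",   # 7 characters, within the limit
    "img/0002.jpg\t" + "天" * 25,     # 25 characters, too long
    "",
]

for path, length in find_long_samples(sample_lines):
    print(f"{path}: {length} characters")
```

With a real dataset, the same function can be given an open file object, for example `find_long_samples(open("train_label.txt", encoding="utf-8"))`, where the file name is again only a placeholder. If a large share of the samples is reported, splitting long lines into shorter crops before training may be worth trying.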

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

nissansz commented 3 months ago

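Quoting tink2123: "SVTR limits the maximum length of training samples. Check whether your training samples are too long: if they are, most of them may not be trained on effectively, or they may be compressed so heavily that feature extraction suffers. We recommend keeping each single-line sample to no more than 20 characters."

Which SVTR config file is the fastest and has the highest accuracy?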