How to improve the accuracy of the layout analysis model

shiyutang commented 1 year ago

I trained and tested on the CDLA dataset, but the results are not very good. Can anyone help me solve this?

We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no): yes

Please try not to include images in the issue.

aceissuesolver[bot] commented 1 year ago

Issue Response

✨ Introduction: Thanks for opening this issue. I am the AceIssueSolver, and I will help you with your questions!
🎯 First round response: The answer to your question is:

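Based on the information provided, the following approaches may help improve the accuracy of the layout analysis model:

Add more data augmentation methods or raise the augmentation probability: using more augmentation types, or applying them more often, increases the diversity of the training set and reduces overfitting.
Increase the L2 decay value: a larger L2 decay constrains model complexity and helps prevent overfitting.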
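
The two suggestions above can be sketched as a small config transformation. This is only an illustration: the key names (`TrainReader`, `sample_transforms`, `prob`, `OptimizerBuilder`, `regularizer`, `factor`) are assumed to mirror a PaddleDetection-style YAML config loaded into a Python dict, the helper `strengthen_regularization` is hypothetical, and the starting values are made up rather than taken from any shipped layout-analysis config.

```python
import copy


def strengthen_regularization(cfg, prob_scale=1.5, decay_scale=2.0):
    """Return a copy of cfg with higher augmentation probabilities and L2 decay.

    Hypothetical helper: cfg is assumed to be a nested dict in the style of a
    PaddleDetection YAML config. The input dict is left unmodified.
    """
    new_cfg = copy.deepcopy(cfg)

    # Each transform is assumed to be a single-key dict,
    # e.g. {"RandomFlip": {"prob": 0.5}}.
    for transform in new_cfg["TrainReader"]["sample_transforms"]:
        for params in transform.values():
            if isinstance(params, dict) and "prob" in params:
                # A probability cannot exceed 1.0, so clamp after scaling.
                params["prob"] = min(1.0, params["prob"] * prob_scale)

    # A larger L2 factor penalizes large weights more strongly.
    new_cfg["OptimizerBuilder"]["regularizer"]["factor"] *= decay_scale
    return new_cfg


# Hypothetical starting values, for illustration only.
base_cfg = {
    "TrainReader": {
        "sample_transforms": [
            {"Decode": {}},
            {"RandomFlip": {"prob": 0.5}},
            {"RandomDistort": {"prob": 0.4}},
        ]
    },
    "OptimizerBuilder": {"regularizer": {"type": "L2", "factor": 0.00004}},
}

tuned_cfg = strengthen_regularization(base_cfg)
```

Changing both knobs at once makes it hard to tell which one helped, so in practice it is probably safer to adjust one at a time and compare validation mAP after each run.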

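In addition, you can refer to PP-OCRv3, an open-source ultra-lightweight OCR system, when developing a license plate recognition system. The model reaches 99% detection accuracy and 94% recognition accuracy on the CCPD dataset, with a model size of 12.8M. Quantization training can further compress the model to 5.8M and improve inference speed.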

If the recognition model predicts one extra special character, you can try removing that character, which can significantly improve accuracy.

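You can also look at the visualization results: the general-purpose model tends to detect a whole line of text as one box, whereas the lightweight model may split one line into two segments, so a larger number of detections does not necessarily mean a better result.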

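Also, try to keep the classes balanced. If some classes have few samples, you can supplement them with synthetic data. Experiments show that characters appearing infrequently in the training set are recognized poorly, and increasing the number of iterations does not solve the problem of too few samples.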
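
As a rough way to check class balance before training, the sketch below counts annotations per class and flags the under-represented ones. It assumes the annotations have been converted to a COCO-style dict with `categories` and `annotations` lists. The helper name `find_rare_classes`, the `min_fraction` threshold, and the toy data are all hypothetical.

```python
from collections import Counter


def find_rare_classes(coco, min_fraction=0.05):
    """Return {class name: box count} for classes below min_fraction of all boxes.

    Hypothetical helper: coco is assumed to be a COCO-style dict with
    "categories" (id, name) and "annotations" (category_id) entries.
    """
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(a["category_id"] for a in coco["annotations"])
    total = sum(counts.values())

    rare = {}
    for cat_id, name in id_to_name.items():
        n = counts.get(cat_id, 0)
        # With no annotations at all, every class counts as rare.
        if total == 0 or n / total < min_fraction:
            rare[name] = n
    return rare


# Toy annotation set, for illustration only: 15 Text, 4 Table, 1 Equation.
toy_coco = {
    "categories": [
        {"id": 1, "name": "Text"},
        {"id": 2, "name": "Table"},
        {"id": 3, "name": "Equation"},
    ],
    "annotations": (
        [{"category_id": 1}] * 15
        + [{"category_id": 2}] * 4
        + [{"category_id": 3}] * 1
    ),
}

rare_classes = find_rare_classes(toy_coco, min_fraction=0.1)
```

Classes reported here are the candidates for extra real or synthetic samples. The threshold is a judgment call and should probably be chosen relative to the number of classes in the dataset.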

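If the test images are in Chinese, you can download the pretrained model for the Chinese CDLA dataset, which recognizes 10 types of document regions: text, title, table, figure, figure caption, table caption, header, footer, reference, and equation. Both the training model and the inference model can be downloaded from the layout analysis models section.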

How to use

To invoke the AceIssueSolver, add a comment using one of the following commands: @Ace-issue-solver \<QUESTION>: Pose more questions on AceIssueSolver.

tangtianyi1998 commented 10 months ago

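If the documents being tested contain both Chinese and English, would adding English annotations to the CDLA training data improve the results?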

codeman008 commented 8 months ago

My own training and test results are acceptable, so I wanted to compare them against the provided CDLA pretrained model, but I could not get it to run:

W0222 11:42:21.103763 1932985 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 11.8
W0222 11:42:21.105564 1932985 gpu_resources.cc:164] device: 0, cuDNN Version: 8.2.
Traceback (most recent call last):
  File "tools/infer.py", line 237, in <module>
    main()
  File "tools/infer.py", line 233, in main
    run(FLAGS, cfg)
  File "tools/infer.py", line 165, in run
    trainer.load_weights(cfg.weights)
  File "/home/user/zw-0219/PaddleDetection/ppdet/engine/trainer.py", line 438, in load_weights
    load_pretrain_weight(self.model, weights, ARSL_eval)
  File "/home/user/zw-0219/PaddleDetection/ppdet/utils/checkpoint.py", line 253, in load_pretrain_weight
    param_state_dict = match_state_dict(model_dict, param_state_dict)
  File "/home/user/zw-0219/PaddleDetection/ppdet/utils/checkpoint.py", line 143, in match_state_dict
    weight_keys = sorted(weight_state_dict.keys())
AttributeError: 'Tensor' object has no attribute 'keys'
