PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0
44.05k stars 7.81k forks

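Question: when fine-tuning a model, is there any problem with adding only the images that need fine-tuning and leaving out the other images that don't? #10609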

Closed nissansz closed 5 months ago

nissansz commented 1 year ago

Please provide the following information to quickly locate the problem

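I have a question about model fine-tuning: suppose I add only the images that need fine-tuning and leave out the other images that don't need adjusting.

Would that cause any problems?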

ToddBear commented 1 year ago

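You can refer to the PPOCR fine-tuning documentation: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/finetune.md

For a detection task, we recommend collecting at least 500 samples for fine-tuning;

for a recognition task, we recommend collecting at least 5,000 samples for fine-tuning.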
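As a rough illustration of the thresholds above, here is a minimal sketch that counts the samples in a label file and compares the count with the suggested minimum. It assumes the label file holds one sample per non-empty line; the names `MIN_SAMPLES`, `count_samples`, and `enough_for_finetune` are hypothetical helpers for this sketch and are not part of PaddleOCR.

```python
from pathlib import Path

# Minimum sample counts suggested in the reply above
# ("det" = detection, "rec" = recognition). The dictionary itself is
# a hypothetical helper, not a PaddleOCR setting.
MIN_SAMPLES = {"det": 500, "rec": 5000}


def count_samples(label_file):
    """Count non-empty lines, assuming one sample per line."""
    text = Path(label_file).read_text(encoding="utf-8")
    return sum(1 for line in text.splitlines() if line.strip())


def enough_for_finetune(label_file, task):
    """Return True if the label file meets the suggested minimum for `task`."""
    return count_samples(label_file) >= MIN_SAMPLES[task]
```

For example, `enough_for_finetune("rec_gt_train.txt", "rec")` (the file name is a placeholder) would return `False` for a label file with fewer than 5,000 non-empty lines.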

nissansz commented 1 year ago

Should the several million samples used previously be added to the training set as well?

Longleaves commented 1 year ago

Same question here. Has this been resolved?

ToddBear commented 1 year ago

For fine-tuning, you shouldn't need to add the earlier data to the training set.

UserWangZz commented 5 months ago

This issue has not been updated for a long time, so it is being closed for now. If the problem persists, please try reopening it.