ziFan99 opened this issue 3 weeks ago
I think the problem is that the dataset you generated does not match the blurring method of the blurry images you used for inference. I used 50,000 samples for training.
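One common way to reduce this kind of mismatch is to randomize the synthetic degradation so that the training blur covers a wider range of kernels than any single method. Below is a minimal sketch of such a pair-generation step, assuming OpenCV and NumPy; the kernel types, parameter ranges, and the noise/JPEG steps are illustrative assumptions, not the pipeline actually used for DocDiff.

```python
# A minimal sketch (not the DocDiff training pipeline) of generating blurred/
# sharp training pairs with a diverse set of synthetic degradations, so the
# training blur space is more likely to cover what appears at inference time.
# Kernel types, parameter ranges, noise, and JPEG settings are illustrative.
import cv2
import numpy as np


def random_motion_kernel(size=15):
    """Linear motion-blur kernel at a random angle, normalized to sum to 1."""
    kernel = np.zeros((size, size), dtype=np.float32)
    angle = np.deg2rad(np.random.uniform(0.0, 180.0))
    c = size // 2
    dx, dy = np.cos(angle), np.sin(angle)
    cv2.line(kernel,
             (int(c - dx * c), int(c - dy * c)),
             (int(c + dx * c), int(c + dy * c)),
             color=1.0, thickness=1)
    return kernel / kernel.sum()


def degrade(sharp_bgr):
    """Produce one synthetically degraded version of a sharp image."""
    choice = np.random.choice(["gaussian", "motion", "defocus"])
    if choice == "gaussian":
        blurred = cv2.GaussianBlur(sharp_bgr, (0, 0),
                                   sigmaX=np.random.uniform(1.0, 3.0))
    elif choice == "motion":
        blurred = cv2.filter2D(sharp_bgr, -1, random_motion_kernel())
    else:  # crude defocus-like blur via a box filter
        k = int(np.random.choice([5, 7, 9]))
        blurred = cv2.blur(sharp_bgr, (k, k))

    # Mild sensor noise plus JPEG re-compression, both common in real captures.
    noise = np.random.normal(0.0, np.random.uniform(1.0, 5.0), sharp_bgr.shape)
    blurred = np.clip(blurred.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    quality = int(np.random.uniform(60, 95))
    ok, enc = cv2.imencode(".jpg", blurred, [cv2.IMWRITE_JPEG_QUALITY, quality])
    return cv2.imdecode(enc, cv2.IMREAD_COLOR) if ok else blurred
```

The idea is simply that mixing several degradation types, plus mild noise and re-compression, makes the learned deblurring less tied to any single synthetic kernel; adding even a small set of real blurred/sharp pairs, if you can collect them, helps further.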
------------------ Original message ------------------ From: "Royalvice/DocDiff"; Sent: Monday, 21 October 2024, 11:04 AM; Subject: [Royalvice/DocDiff] The issue of poor performance in the deblurring task with a custom dataset. (Issue #40)
I would like to ask about my current task, which involves recognizing blurred shipping labels. The dataset I am using is self-generated using a blur generation algorithm on clear shipping labels. However, after training with your model, the results have not been satisfactory. I would like to know if your model is suitable for such a scenario (i.e., document blur, where details such as text and numbers on the shipping labels are missing), or could it be that my dataset is too small? Could you also share what the dataset size was in your deblurring tasks? I look forward to your reply.
Hello, you're absolutely right. I used a blur generation method to create the training dataset, but for inference, I used real blurred images. I also believe that the blur kernel of the real blurred images is different from the one used in the training dataset, which is why the model performs poorly in real-world blurred scenarios. I would like to ask if you have any solutions for this issue. Are the training datasets you used collected from real-world blurred images? How do you align them with the clear images? I look forward to your reply. Thank you!
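On the alignment question, one approach that is often used for building real blurred/sharp pairs (not necessarily what was done for DocDiff) is to capture the degraded document and then register it to the clean reference, for example with OpenCV's ECC registration. A minimal sketch, assuming an affine misalignment and OpenCV/NumPy:

```python
# A minimal sketch (my assumption, not taken from the DocDiff repo) of aligning
# a real blurred capture to its clean reference with OpenCV's ECC registration,
# so the pair can be used for supervised training. The affine motion model and
# iteration settings are illustrative.
import cv2
import numpy as np


def align_pair(clean_bgr, blurred_bgr, iterations=200, eps=1e-6):
    """Warp blurred_bgr onto clean_bgr using an ECC-estimated affine transform."""
    clean_gray = cv2.cvtColor(clean_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    blurred_gray = cv2.cvtColor(blurred_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)

    warp = np.eye(2, 3, dtype=np.float32)  # start from the identity affine
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, iterations, eps)

    # ECC maximizes the enhanced correlation coefficient between the template
    # (clean image) and the warped input (blurred image).
    _, warp = cv2.findTransformECC(clean_gray, blurred_gray, warp,
                                   cv2.MOTION_AFFINE, criteria)

    h, w = clean_bgr.shape[:2]
    aligned = cv2.warpAffine(blurred_bgr, warp, (w, h),
                             flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)
    return aligned  # pixel-aligned with clean_bgr up to the estimated affine
```

If ECC does not converge on heavily blurred inputs, starting from a coarser motion model such as cv2.MOTION_TRANSLATION, or estimating a homography from matched features first, are common fallbacks.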