ZZZHANG-jx / GCDRNet


question about training dataset. #3

Closed · cactusgame closed this issue 2 days ago

cactusgame commented 3 weeks ago

Hi, thanks for your great paper. I have a question about reproducing the work described in the paper.

The paper states:

Owing to the lightweight structure of our backbone, DR-Net takes the original source image as input without down-sampling or cropping, making our enhanced results free from splicing traces. For I_s with resolutions below 512, we resized the short side to 512 while maintaining the aspect ratio.
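(For my own reference, a minimal sketch of that resizing step as I understand it, using OpenCV; this is just my illustration, not code from the repo.)

```python
import cv2

def resize_short_side(img, target=512):
    """Resize so the shorter side equals `target`, keeping the aspect ratio.
    Only upscales when the short side is below `target`, as the paper describes."""
    h, w = img.shape[:2]
    short = min(h, w)
    if short >= target:
        return img  # resolution already sufficient; no resizing needed
    scale = target / short
    return cv2.resize(img, (round(w * scale), round(h * scale)),
                      interpolation=cv2.INTER_LINEAR)
```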

Do you mean that the pre-trained DRNet is trained only on the RealDAE dataset? You mentioned the batch size is 16 during training; without data cropping, that leaves only 600/16 ≈ 38 iterations per epoch for training DRNet.
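(Just to make the arithmetic explicit, assuming roughly 600 RealDAE training pairs:)

```python
import math

num_images = 600   # assumed number of RealDAE training pairs
batch_size = 16
iters_per_epoch = math.ceil(num_images / batch_size)
print(iters_per_epoch)  # 38
```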

Could you tell me how many epochs are needed to train the pre-trained DRNet? Currently, I find it hard to reproduce your results with only the RealDAE dataset. Or am I misunderstanding something?

ZZZHANG-jx commented 3 weeks ago

Thank you for your interest in our work!

Training directly with real images can indeed be challenging in terms of stability, as manually labeled data often carries a degree of subjectivity, leading to inconsistent annotations. We encountered similar issues in our research, which led us to adopt a progressive learning approach, initially focusing on data with more consistent annotations (such as those without colored figures).

To further enhance training stability, you might consider using synthetic data for pre-training. In our latest work, we found that training solely with synthetic data not only converges more quickly but also yields quite effective results. You can refer to the following link to construct training data based on Doc3DShade and train DRNet and GCNet: https://github.com/ZZZHANG-jx/DocRes/tree/master/data#appearance-enhancement
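As a rough illustration of the idea (the actual construction steps are on the linked DocRes page; the file names below are placeholders), one simple way to synthesize a degraded input is to composite a clean document with a Doc3DShade shading map:

```python
import cv2
import numpy as np

# Placeholder paths; the real data layout follows the DocRes instructions linked above.
clean = cv2.imread("clean_doc.png").astype(np.float32) / 255.0
shading = cv2.imread("doc3dshade_shading.png").astype(np.float32) / 255.0

# Apply the shading multiplicatively (a common image-formation assumption:
# observed = albedo * illumination), then save the degraded input.
shading = cv2.resize(shading, (clean.shape[1], clean.shape[0]))
degraded = np.clip(clean * shading, 0.0, 1.0)
cv2.imwrite("degraded_input.png", (degraded * 255).astype(np.uint8))
```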

cactusgame commented 1 week ago

Thanks for your suggestion; I'll give it a try!

cactusgame commented 1 week ago

Hi, following the document you mentioned, I synthesized 10,000 pairs of training images using Doc3DShade and trained both GCNet and DRNet for 50 epochs each. As you said, training converges very quickly, but the evaluation metrics are unsatisfactory. Could you suggest how to achieve better training results for GCDRNet?

Additionally, I understand that DocRes performs better, but I am looking for a lighter model. If you have any recommendations for lightweight models, I would be very grateful to hear them. Thank you very much.

The picture below shows the validation metrics for DRNet.

[image: DRNet validation metrics]
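(For reference, a minimal sketch of how such metrics can be computed, assuming PSNR and SSIM as reported in the paper and using scikit-image; the file names are placeholders, not my actual evaluation script.)

```python
import cv2
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder file names; replace with an enhanced output and its ground-truth pair.
pred = cv2.imread("enhanced.png")
gt = cv2.imread("ground_truth.png")

psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")
```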

cactusgame commented 1 week ago

Listing some validation results here.

[image: validation results]

cactusgame commented 1 week ago

Another issue is that some images exhibit a "dirty" effect at the edges, similar to the one shown below. Do you happen to know how to address this kind of issue?

[image: example of edge artifacts]

ZZZHANG-jx commented 1 week ago

From your results, it looks like color information is being recovered poorly, likely due to a lack of similar data. I suggest adding some PDF data similar to magazines, which contain more figures, to help the model learn color recovery better. Other suggestions:

1. First, train GCNet alone.
2. Then fix GCNet and train only DRNet (consider adding real data at this stage).
3. Finally, consider joint training of GCNet and DRNet with a reduced learning rate.

You can start with the simplest L1 loss and a single scale.
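A rough PyTorch sketch of the staged setup I mean; the modules below are stand-ins, and the actual GCNet/DRNet definitions and their coupling should follow the repo:

```python
import torch
import torch.nn as nn

# Stand-in modules for illustration only; use the repo's GCNet/DRNet definitions.
gcnet = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))
drnet = nn.Sequential(nn.Conv2d(6, 3, 3, padding=1))
l1 = nn.L1Loss()

# Stage 2: freeze GCNet and train only DRNet.
for p in gcnet.parameters():
    p.requires_grad = False
opt_dr = torch.optim.Adam(drnet.parameters(), lr=1e-4)

# Stage 3: unfreeze GCNet and train both jointly with a reduced learning rate.
for p in gcnet.parameters():
    p.requires_grad = True
opt_joint = torch.optim.Adam(
    list(gcnet.parameters()) + list(drnet.parameters()), lr=1e-5)

def joint_step(img, gt):
    """One joint-training step with a single-scale L1 loss (illustrative wiring)."""
    gc_out = gcnet(img)                        # GCNet's intermediate prediction
    out = drnet(torch.cat([img, gc_out], 1))   # DRNet refines from image + GCNet output
    loss = l1(out, gt)
    opt_joint.zero_grad()
    loss.backward()
    opt_joint.step()
    return loss.item()
```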

cactusgame commented 1 week ago

Thank you for your suggestions; they are very helpful!

I have one last question. You mentioned that "you can start with the simplest L1 loss and single scale". Is it okay to introduce multi-scale training only in the final stage (joint training of GCNet and DRNet)? And do you mean that the excessive shadows appearing at the edges, as shown in the image above, are likely due to a lack of multi-scale training?

ZZZHANG-jx commented 3 days ago

It is okay to introduce multi-scale training only in the final stage. Even with multi-scale training, the issue you mentioned could still occur, though multi-scale training might help mitigate the problem.
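If you do add multi-scale supervision, the idea is simply to apply the L1 loss at several resolutions; a rough sketch (not the exact loss used in our code):

```python
import torch.nn.functional as F

def multi_scale_l1(pred, target, scales=(1.0, 0.5, 0.25)):
    """Average the L1 loss over several downsampled resolutions of pred/target."""
    loss = 0.0
    for s in scales:
        if s == 1.0:
            p, t = pred, target
        else:
            p = F.interpolate(pred, scale_factor=s, mode='bilinear', align_corners=False)
            t = F.interpolate(target, scale_factor=s, mode='bilinear', align_corners=False)
        loss = loss + F.l1_loss(p, t)
    return loss / len(scales)
```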