ming053l / DRCT

Accepted by New Trends in Image Restoration and Enhancement workshop (NTIRE), in conjunction with CVPR 2024.
MIT License

fp16 inference? #14

Open zelenooki87 opened 5 months ago

zelenooki87 commented 5 months ago

Thank you for the wonderful project. Does the model run in fp32 or fp16 by default? Could we force Real-DRCT-GAN to use fp16 for inference?

zelenooki87 commented 4 months ago

@ming053l Just to be sure: the README lists "Real DRCT GAN SRx4. (Coming Soon)"

and, below that, "Real-DRCT-GAN_Finetuned from MSE".

Is Real DRCT GAN SRx4 fully trained, or can we expect a more advanced model? Thanks.

ming053l commented 4 months ago

@zelenooki87

Hi! Sorry for the late reply, we just came back from CVPR24.

Actually, I haven't tried the Real-DRCT-GAN yet because, as you know, last week was a busy one. After training, I just uploaded the model directly. I will conduct a simple analysis of Real-DRCT-GAN in the coming days and upload the results to the repository. However, it will take some time and won't be completed this week.

We default to using fp32, and I haven't tried fp16 inference yet.

We are currently developing DRCT-v2 (an advanced version of DRCT). The design of DRCT is relatively simple, and we aim to achieve better performance based on the same principles. We have already obtained some good initial results, so please look forward to future updates!

zelenooki87 commented 4 months ago

Thank you for the answer. I did a little research, converted the Real GAN model to fp16 ONNX, and optimized it: https://mega.nz/file/J4QQVSQA#ihEL_lxQhblpGZ3OFO-XjrCMxt0mFphIOFcRRrP3w68 I also wrote some code for inference with the Real GAN ONNX model: https://pastebin.com/ArHEgBmR

I am getting a 2.5-3x speed improvement over the original PyTorch code (RTX 3090, 24 GB VRAM). Could you add fp16 optimizations to the code, as was done in SwinIR? https://github.com/JingyunLiang/SwinIR/issues/114
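For reference, a minimal sketch of what half-precision inference can look like in PyTorch. This is not the DRCT inference script: `TinySR` and `upscale` are hypothetical names, and the tiny network only stands in for a real x4 model. The sketch assumes fp16 is used only when a CUDA device is available and falls back to fp32 on CPU.

```python
import torch
import torch.nn as nn


class TinySR(nn.Module):
    """Hypothetical stand-in for a x4 SR network; NOT the DRCT architecture."""

    def __init__(self, scale=4):
        super().__init__()
        self.body = nn.Conv2d(3, 3 * scale * scale, kernel_size=3, padding=1)
        self.up = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.up(self.body(x))


def upscale(model, img, use_fp16=True):
    """Run one forward pass, in fp16 on CUDA if requested, else in fp32."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    half = use_fp16 and device == "cuda"  # assumption: keep CPU runs in fp32

    model = model.to(device).eval()
    img = img.to(device)
    if half:
        # Cast both the weights and the input; mixing dtypes raises an error.
        model = model.half()
        img = img.half()

    with torch.no_grad():
        out = model(img)

    # Return fp32 in [0, 1] so downstream image saving code is unaffected.
    return out.float().clamp(0, 1).cpu()


if __name__ == "__main__":
    lr = torch.rand(1, 3, 16, 16)
    sr = upscale(TinySR(scale=4), lr)
    print(sr.shape, sr.dtype)
```

An alternative that keeps the weights in fp32 is to wrap the forward pass in `torch.autocast(device_type="cuda", dtype=torch.float16)`. Either way, it is worth comparing the fp16 output against the fp32 output for visible differences before relying on it.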

Aside from that, could you please tell me which technique you think is best for stitching tiles together? I tried to implement Poisson blending, but the results are not always perfect.

Thank you very much.

ming053l commented 4 months ago

@zelenooki87 Hi!

Thank you for sharing your progress and the links! I'll add FP16 optimizations in a few days. In addition, we have developed DRCT-v2, whose parameter size is almost 60% of DRCT-v1's while keeping a balance between performance and speed. We are preparing the documentation and running some experiments (cooking now...XD)

Regarding tile blending, is your goal to speed up inference and reduce artifacts as much as possible? To the best of my knowledge, Poisson blending is a common technique. If you want to reduce inference time, linear blending may be a good choice. Or you could try Gaussian blending? I am not familiar with it, so I don't think I can give you a good suggestion.
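As an illustration of the linear blending option mentioned above, here is a minimal NumPy sketch. It is not code from the DRCT repository: `ramp_mask`, `tile_starts`, and `upscale_tiled` are hypothetical names, `upscale_fn` stands for any per-tile model call, and the image is assumed to be at least one tile in each dimension. Overlapping tiles are weighted by a mask that ramps linearly toward the tile edges, then the accumulated result is divided by the accumulated weights.

```python
import numpy as np


def ramp_mask(size, overlap):
    """Weight mask of shape (size, size, 1) that ramps linearly over `overlap` pixels at each edge."""
    ramp = np.ones(size, dtype=np.float32)
    if overlap > 0:
        # Strictly positive weights, so the normalisation below never divides by zero.
        edge = np.arange(1, overlap + 1, dtype=np.float32) / (overlap + 1)
        ramp[:overlap] = edge
        ramp[-overlap:] = edge[::-1]
    return np.outer(ramp, ramp)[..., None]


def tile_starts(length, tile, step):
    """Start offsets covering [0, length); the last tile is aligned to the end."""
    starts = list(range(0, length - tile + 1, step))
    if starts[-1] + tile < length:
        starts.append(length - tile)
    return starts


def upscale_tiled(image, upscale_fn, scale, tile=64, overlap=16):
    """Upscale an (H, W, C) float image tile by tile and blend the overlaps linearly."""
    h, w, c = image.shape
    assert h >= tile and w >= tile and 0 <= 2 * overlap <= tile
    mask = ramp_mask(tile * scale, overlap * scale)
    acc = np.zeros((h * scale, w * scale, c), dtype=np.float32)
    wsum = np.zeros((h * scale, w * scale, 1), dtype=np.float32)
    step = tile - overlap
    size = tile * scale
    for y in tile_starts(h, tile, step):
        for x in tile_starts(w, tile, step):
            out = upscale_fn(image[y:y + tile, x:x + tile])
            ys, xs = y * scale, x * scale
            acc[ys:ys + size, xs:xs + size] += out * mask
            wsum[ys:ys + size, xs:xs + size] += mask
    # Weighted average: pixels covered by a single tile keep that tile's value.
    return acc / wsum
```

Swapping `ramp_mask` for a Gaussian-shaped mask would give the Gaussian variant with the same accumulate-and-normalise loop. A larger `overlap` generally hides seams better at the cost of more tiles to process.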

zelenooki87 commented 4 months ago

@ming053l How is the development of DRCT v2 progressing? When can we expect you to add fp16 inference to the code? Also, can you train the Real GAN model further? Although it's faster than Real HAT GAN and SwinIR-L GAN, the current version does not seem to have been trained on enough datasets (for example, a lot of detail is lost in the output image compared to SwinIR). For comparison, SwinIR-L (large size) was trained on DIV2K + Flickr2K + OST + WED (4744 images) + FFHQ (first 2000 images, face) + Manga109 (manga) + SCUT-CTW1500 (first 100 training images, texts). Thank you.

ming053l commented 4 months ago

@zelenooki87 Hi, we have developed DRCTv2. We are preparing the article for publication, and it may take some time.

Due to GPU limitations, we cannot train many models or run many experiments at once, so we will choose the relatively important experiments. When choosing, we will prioritize work on DRCTv2 because its potential is greater than that of DRCT. For these reasons, Real-DRCT-GAN will probably not be released in the next two weeks. Thank you for your suggestions: I personally did not have much experience training Real-SR-GAN models before, and I had not considered that the training data was not comprehensive. We will use your suggestions to fine-tune Real-DRCT-GAN in the future!

[Image: training process of DRCTv2. The blue line is version 2 and the red line is version 1.]

I will add fp16 inference in the next 2 days!

zelenooki87 commented 4 months ago

@ming053l Thank you for including my script! :) It had a minor bug causing blurry output, but I've fixed it (same link, edited pastebin - https://pastebin.com/ArHEgBmR ). The output files are now as they should be. I apologize for the oversight. Excited for DRCT v2! Best of luck with development.

zelenooki87 commented 3 months ago

@ming053l Hi. Can you give me an update on the drct v2 project? Is there any news? When can we expect new models? Thank you so much!