Phhofm opened this issue 1 year ago
Hi. Thanks for your interest in our work.
The code of this project is based on BasicSR, and losses.py comes directly from it. PerceptualLoss was commented out only for historical reasons; this does not affect how the code runs or its results. (I have since uncommented it and added the necessary files.) DAT does not use PerceptualLoss during training, but nothing prevents you from training DAT with it: the code supports it, although I have not tried it myself. For details on the implementation of all the loss functions, please refer to BasicSR, which we follow exactly.
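For reference, in BasicSR-style training YMLs a perceptual loss is typically enabled through a `perceptual_opt` block under `train:`. The values below are illustrative defaults (as used in ESRGAN-style configs), not settings the DAT authors used:

```yaml
train:
  perceptual_opt:
    type: PerceptualLoss
    layer_weights:
      'conv5_4': 1.0        # VGG feature layer(s) and their weights
    vgg_type: vgg19
    use_input_norm: true
    range_norm: false
    perceptual_weight: 1.0
    style_weight: 0
    criterion: l1
```

If `perceptual_opt` is absent from the YML, BasicSR simply skips that loss term, which is why DAT trains fine without it.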
The value of manual_seed was chosen arbitrarily. With manual_seed=10 you can fully reproduce the results of the paper. I have not tried other seed values, but I believe it is not an important parameter and has little effect on the results.
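The role of the seed can be illustrated with a small plain-Python sketch (a stand-in; the actual BasicSR training code seeds `random`, `numpy`, and `torch` together in the same spirit):

```python
import random

def set_seed(seed: int) -> None:
    # Seed the RNG; real training code would also seed numpy/torch here.
    random.seed(seed)

def sample_crops(n: int):
    # Stand-in for any stochastic training step (random crops, flips, ...).
    return [random.randint(0, 255) for _ in range(n)]

set_seed(10)
run_a = sample_crops(5)
set_seed(10)
run_b = sample_crops(5)

# Identical seeds make every "random" decision identical, hence a fully
# reproducible run; a different seed just gives another valid run.
assert run_a == run_b
```

This is why the specific value (10 or anything else) should matter little to final quality, while fixing it matters a lot for exact reproducibility.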
Regarding dataset_enlarge_ratio=1/100: you are right, the value of dataset_enlarge_ratio does not affect the result. A large dataset_enlarge_ratio only reduces file read/write time on servers with slow I/O. I have updated dataset_enlarge_ratio=1 in all training YML files.
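The mechanics behind that option are simple: the dataset is virtually repeated so that one "epoch" spans the data many times, meaning fewer epoch boundaries and dataloader restarts, while the samples themselves are unchanged. A minimal sketch of the idea (class and names are illustrative, not the actual BasicSR implementation):

```python
class EnlargedDataset:
    """Virtually repeat a dataset `enlarge_ratio` times.

    Only the reported length changes; indices wrap around, so the
    samples seen by the model (and thus the results) are unaffected.
    """

    def __init__(self, samples, enlarge_ratio=1):
        self.samples = samples
        self.enlarge_ratio = enlarge_ratio

    def __len__(self):
        return len(self.samples) * self.enlarge_ratio

    def __getitem__(self, idx):
        return self.samples[idx % len(self.samples)]

paths = ["img_0.png", "img_1.png", "img_2.png"]
ds = EnlargedDataset(paths, enlarge_ratio=100)
print(len(ds))   # 300: one "epoch" now covers the data 100 times
print(ds[5])     # wraps around: the same sample as ds[2]
```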
Thank you for your valuable questions, and if you have any other problems, please let us know. Thanks.
Thank you, I was able to finetune your official DAT_x4 model. I used AdamW with L1Loss, PerceptualLoss, ColorLoss and GANLoss, together with (a little bit of) on-the-fly JPEG compression, blur and resize degradations.
Examples: Imgsli1 (generated with the onnx file), Imgsli2 (generated with the onnx file), Imgsli (generated with the test script on the three test images in dataset/single with the pth file)
Model files (the pth file, onnx conversions, model information, and my failed attempts) can be found in this Google Drive folder if someone wants to try it out.
For convenience, the direct file links: Download pth file (~295MB) Download onnx file (~85.8MB)
PS: I also wanted to show another DAT finetune I trained (and just released) on the FFHQ (Flickr-Faces-HQ) dataset, for 4x upscaling faces:
Model Name: 4xFFHQDAT
Examples: Imgsli1 Imgsli2 Imgsli3 Imgsli4 Imgsli5 Imgsli6 Imgsli7
Download pth file (~295MB) Download fp32 onnx file (~85.8MB)
I also made a variant of it that can handle low-quality input:
Model Name: 4xFFHQLDAT
@Phhofm Thank you for sharing those onnx files! The results are looking pretty good!
@Phhofm Hi, I wonder how to convert this model from pth to onnx. Thanks!
Thank you for your work, it seems interesting :)
I just had some questions (since I want to train/finetune a model):
I would be thankful for answers :)