gfxdisp / mdf

Multi-scale discriminator feature-wise loss function
BSD 3-Clause "New" or "Revised" License

how to define the generator #1

Closed: KKKloveQbh closed this issue 3 years ago

KKKloveQbh commented 3 years ago

Dear sir, after reading the code I have a few questions. How do I train the generator mentioned in the paper, and how is that generator defined?

aamir-mustafa commented 3 years ago

Hi. Thanks for your interest in our work. The generator architecture is the same as in the SinGAN (ICCV 2019) paper.

For using our saved generators for training your own networks, you don't have to explicitly define the generator. You can straight away load the saved checkpoints and it should work.

Hope that helps
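
For illustration, here is a minimal PyTorch sketch of loading such a saved checkpoint and freezing it for use as a loss term. The file name and the assumption that the checkpoint stores the per-scale discriminator modules directly are placeholders, not the repository's exact layout:

```python
import torch

# Placeholder path; the repository ships task-specific discriminator checkpoints.
ckpt_path = "weights/Ds_SISR.pth"

# Assumption: the checkpoint stores the per-scale discriminator modules directly.
# If it stores a state_dict instead, instantiate the architecture first and call
# load_state_dict().
discriminators = torch.load(ckpt_path, map_location="cpu")

# Freeze everything: the discriminators are only used as fixed feature extractors.
for d in discriminators:
    d.eval()
    for p in d.parameters():
        p.requires_grad = False
```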

zhanghm1995 commented 2 years ago

@aamir-mustafa Hello, thank you for sharing this work.

I would like to know whether I can use this loss function directly, in place of the VGG perceptual loss, in my human face generation task, or whether I need to make some adjustments.

Looking forward to your reply.

aamir-mustafa commented 2 years ago

Hi, thanks for the message. The method as such is a loss function designed for, and targeted towards, task-specific distortions such as JPEG artefacts, SISR and denoising. However, you can use the MDF loss trained for SISR for generic tasks as well, since it is trained on multiple scales of a natural image without adding any distortions apart from upsampling. Even though the method was not proposed for the task of face generation, I would be interested in its performance on such a task. Please let me know if you have any other questions. Best, Aamir
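
As a rough illustration of the drop-in replacement being discussed, here is a hedged sketch of using the MDF loss where a VGG perceptual term was used before. The module name, class name, checkpoint path and constructor arguments are assumptions modelled on the repository's example usage; check the README for the exact API:

```python
import torch
from mdfloss import MDFLoss  # module and class names assumed; verify against the repo README

# Placeholder checkpoint of discriminators trained for SISR-style distortions.
path_disc = "weights/Ds_SISR.pth"
criterion = MDFLoss(path_disc, cuda_available=False)  # constructor arguments assumed

# Dummy stand-ins for a reference face and a generated face (N, C, H, W in [0, 1]).
ref = torch.rand(1, 3, 128, 128)
out = torch.rand(1, 3, 128, 128, requires_grad=True)

# Where the VGG perceptual term was previously computed, use the MDF term instead:
loss = criterion(ref, out)   # argument order (reference, distorted) is an assumption
loss.backward()
```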

zhanghm1995 commented 2 years ago

Hi, thank you very much for replying so quickly. I will try your MDF loss on the face generation and face image inpainting tasks. I am currently using the MultiScaleDiscriminator from the Pix2Pix repo in my task, which is built on the PatchGAN architecture, but the generated face details (such as beard and teeth) are unsatisfying.

The visualization results of MDF were so impressive that I want to adopt it in my task as well.

Thanks a lot again.

shengyenlin commented 1 year ago

Hi,

Thanks for the amazing work

I would also like to know whether the discriminator model is the same as that in SinGAN.

aamir-mustafa commented 1 year ago

Hi,

Thanks for your interest in our work. As mentioned in the paper, the discriminator structure is the same as in SinGAN, with differences in the training scheme. Since it is a task-specific loss function, the discriminator is trained differently for different types of distortions. For example, for JPEG artefact removal, at every stage of upscaling (i.e. at each of the multiple scales) we compress the seed image with JPEG compression. Hope that helps.

Best Aamir
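
To make the per-scale distortion concrete, below is a hedged sketch (not the authors' code) of preparing the JPEG-compressed seed image at each scale before it is shown to the discriminator at that scale. The number of scales, the scale factor rho and the JPEG quality are illustrative values, not the paper's settings:

```python
import io
from PIL import Image

def jpeg_compress(img: Image.Image, quality: int) -> Image.Image:
    """Round-trip an image through JPEG at the given quality."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def multi_scale_jpeg_seeds(seed: Image.Image, num_scales: int = 8,
                           rho: float = 0.75, quality: int = 30):
    """Downscale the seed image for every scale k and apply the task-specific
    distortion (here: JPEG compression) before it is used to train discriminator D_k."""
    w, h = seed.size
    seeds = []
    for k in range(num_scales):
        factor = rho ** (num_scales - 1 - k)   # coarsest scale first, full size last
        size = (max(1, round(w * factor)), max(1, round(h * factor)))
        scaled = seed.resize(size, Image.BICUBIC)
        seeds.append(jpeg_compress(scaled, quality))
    return seeds
```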

shengyenlin commented 1 year ago

Hi,

I'm really grateful for your prompt reply.

Could you explain more clearly the paragraph right before Section 4? (i.e. "A subtle but crucial aspect of our loss is that the discriminators are not applied to the scales on which they were trained. If the seed image has dimensions H_y × W_y, the training input (both seed and synthetic) to the discriminator D_k will have dimensions H_y/ρ^(K−k) × W_y/ρ^(K−k). However, the input to the discriminator during phase 2 of training will not be scaled, and it will be H_i × W_i, the size of x_i.")

I'm not sure how the original-size picture can be fed into discriminators that were trained at different scales.

Also, is it possible to directly use this loss function with other, non-CNN models (such as transformer-based CV models)?

Finally, how could I train my own discriminators for MDF on another image denoising dataset, or is the discriminator you provide already well trained enough for any denoising dataset?

Thanks in advance

aamir-mustafa commented 1 year ago

Hi,

Please note that the paper proposes two phases of training: 1. training the MDF discriminators for a particular task, and 2. using the pretrained and frozen multi-scale discriminators as feature extractors. In the second phase of training, we do not propose any method/network but rather compare the performance of our loss function against some existing methods. To this end, the discriminators act as feature extractors for a given image x_i. Since the discriminators at every scale are patch-based, an image of any dimensions can be passed through them.
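
As an illustration of why arbitrary input sizes work in phase 2, here is a hedged sketch: each frozen, fully convolutional (patch-based) discriminator is run on the full-resolution reference and output images, and a feature-wise distance is accumulated. Which layers' activations enter the loss, and the use of an L1 distance, are assumptions for this sketch rather than the repository's exact formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_features(disc: nn.Module, x: torch.Tensor):
    """Collect the activations of every Conv2d in `disc` for input `x`.
    Using all conv layers is an assumption about which features enter the loss."""
    feats, hooks = [], []
    for m in disc.modules():
        if isinstance(m, nn.Conv2d):
            hooks.append(m.register_forward_hook(lambda mod, inp, outp: feats.append(outp)))
    disc(x)
    for h in hooks:
        h.remove()
    return feats

def mdf_feature_loss(discriminators, ref, out):
    """Feature-wise distance over frozen, patch-based (fully convolutional)
    multi-scale discriminators; `ref` and `out` may have any spatial size."""
    total = out.new_zeros(())
    for d in discriminators:
        with torch.no_grad():
            feats_ref = conv_features(d, ref)   # reference features, no gradient needed
        feats_out = conv_features(d, out)       # keeps the graph so the loss backprops into `out`
        total = total + sum(F.l1_loss(a, b) for a, b in zip(feats_ref, feats_out))
    return total
```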

Regarding non-CNN-based models, I am afraid we have not tested the performance of our loss with such models in our work.

Regarding retraining the MDF loss function, you need to make sure the model is task-specific rather than one that learns the manifold of natural images (the entire idea behind the paper). This depends on what type of noise you are concerned with. If it is a generic denoising model, you can follow the training scheme of the original SinGAN paper by adding noise at every scale of training; however, you may need to increase the magnitude of the added noise. Once it is trained, you can freeze the discriminators for feature extraction.
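
A hedged sketch of that noise injection, with the noise magnitude as an illustrative placeholder rather than a recommended value:

```python
import torch

def add_training_noise(seed: torch.Tensor, sigma: float = 0.2) -> torch.Tensor:
    """Additive Gaussian noise applied to the (already downscaled) seed image at
    each training scale, SinGAN-style. `sigma` is illustrative only and is chosen
    larger than a typical SinGAN setting, per the suggestion above; tune it for your data."""
    return (seed + sigma * torch.randn_like(seed)).clamp(0.0, 1.0)
```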

Hope that answers the question.

shengyenlin commented 1 year ago

Hi,

Thanks for your detailed explanation. Your work is now clear to me.

But could you be more specific about the difference between your training scheme and that of SinGAN in the context of image denoising? You mentioned that I should increase the magnitude of the added noise, but how much additional noise should I add?

I'm trying to develop an image denoising model for 3D images and wish to retrain similar discriminators.

Thanks :)