JeongHyunJin / Pix2PixHD

Pix2PixHD codes for scientific data analysis

question: how come the results are so close to real images pixel-wise even without reconstruction loss? #5

Open jxchen01 opened 2 years ago

jxchen01 commented 2 years ago

I am testing this code on my own data. I find the generated images super realistic (so, very good in this sense), but they are very different from the real images pixel-wise. For example, in Figure 4 of your paper (Generation of High-resolution Solar Pseudo-magnetograms from Ca II K Images by Deep Learning), the generated HMI image is almost identical to the real HMI image. But when training on my own data, I can never get such pixel-wise similarity. Reading your paper in depth, along with the original pix2pixHD paper, I find that there is no reconstruction loss as in the original pix2pix paper (e.g., L_L1(G)), which encourages the generated image to be as close as possible to the real image pixel-wise. Do you have any idea how to understand this?

jxchen01 commented 2 years ago

Hmm... actually, I think the "condition" image fed to the discriminator could help with pixel-wise correspondence, as in the sketch below. But I still don't understand why my model generates very realistic images with very poor pixel-wise correspondence. Any suggestions?
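For context, in pix2pix-style models the discriminator judges the (condition, output) pair jointly rather than the output alone, which is what ties realism to the input. A minimal PyTorch sketch (tensor names and shapes are illustrative, not from this repo):

```python
import torch

# Hypothetical tensors: `condition` is the input image, `output` is either
# the real target or the generator's result (shape: N x C x H x W).
condition = torch.randn(1, 1, 256, 256)
output = torch.randn(1, 1, 256, 256)

# The conditional discriminator sees the condition concatenated channel-wise
# with the output, so it can penalize outputs that look realistic but are
# misaligned with the input.
disc_input = torch.cat([condition, output], dim=1)  # N x 2C x H x W
```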

JeongHyunJin commented 2 years ago

The Pix2PixHD model uses a Feature Matching (FM) loss, an objective function that optimizes the parameters of the generator. The discriminator consists of several convolution layers, and each layer produces a feature map from its input. The FM loss minimizes the absolute difference between the feature maps of the real and generated pairs across multiple layers of the discriminator. Based on my experience and several references, it is more effective for data with a large dynamic range than a loss derived directly from the absolute difference between the target and generated data. Please refer to https://arxiv.org/abs/2204.12068 for an explanation of the loss functions and an improved model.
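A minimal sketch of the FM loss in PyTorch, assuming a discriminator whose forward pass returns a list of intermediate feature maps (the names `discriminator`, `real_pair`, and `fake_pair` are illustrative, not the repo's actual API):

```python
import torch
import torch.nn.functional as F

def feature_matching_loss(discriminator, real_pair, fake_pair):
    # Each element of real_feats / fake_feats is the feature map produced by
    # one convolution layer of the discriminator.
    real_feats = discriminator(real_pair)
    fake_feats = discriminator(fake_pair)

    loss = 0.0
    for rf, ff in zip(real_feats, fake_feats):
        # L1 (absolute) difference between real and generated feature maps;
        # the real features are detached so gradients flow only to the generator.
        loss += F.l1_loss(ff, rf.detach())
    return loss / len(real_feats)
```

Because the features come from the pair (condition, output), matching them pushes the generated image toward the real one at each spatial scale of the discriminator, which is why pixel-wise correspondence emerges even without an explicit L1 term on the images themselves.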

And... in your case, is the overall distribution of features in your AI-generated data reconstructed well, while the individual pixel values are not reasonable? If my understanding is correct, I expect there is a problem with the normalization of your dataset. When you train and test the model, you should normalize to fixed max & min values, not the max & min values of each image. When the data pairs are normalized to different ranges, it is difficult to reconstruct exact values. A minimal sketch of such fixed-range normalization is below.
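This sketch assumes NumPy arrays; the limits VMIN/VMAX are illustrative placeholders you would replace with fixed physical limits appropriate for your own data:

```python
import numpy as np

# Fixed limits chosen once for the WHOLE dataset (illustrative values),
# not recomputed per image.
VMIN, VMAX = -3000.0, 3000.0

def normalize(img):
    # Map [VMIN, VMAX] -> [-1, 1] using the same limits for every image,
    # so pixel values remain comparable across the dataset.
    img = np.clip(img, VMIN, VMAX)
    return 2.0 * (img - VMIN) / (VMAX - VMIN) - 1.0

def denormalize(out):
    # Invert the mapping to recover physical pixel values from model output.
    return (out + 1.0) / 2.0 * (VMAX - VMIN) + VMIN
```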

Please check them out and let me know again.