phillipi / pix2pix

Image-to-image translation with conditional adversarial nets
https://phillipi.github.io/pix2pix/

Occasional appearance of persistent artifacts #52

Open Quasimondo opened 7 years ago

Quasimondo commented 7 years ago

Particularly when training for a long time, some very annoying artifacts can appear on some of the outputs. They are usually square, sit at the same location, and look almost identical across different outputs. Once they appear, they rarely go away, even with further training.

Here are some examples:

[Screenshots: five example outputs from 2017-02-16 showing the square artifact]

I wonder what causes them. A division by zero in a layer? An overflow? It would of course be great if there were a way to remove them without having to restart training from zero. And is there at least a way to identify which layer they originate from?

Assuming this is caused by a single ill-defined weight or bias in a relatively deep layer, is there a mathematical way to identify that cell and then, for example, just replace its value with zero or the mean? This artifact always has a size of 48x48 pixels while the input/output is 256x256, so I guess that by multiplying out the kernel sizes and strides of the convolutions one could figure out at which depth the accumulated footprint becomes 48x48 and then backtrack from there?
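As a rough sanity check of that idea (assuming the decoder uses 4x4 transposed convolutions with stride 2 and padding 1, which I believe the released U-Net does), the footprint of a single unit can be tabulated per decoder depth:

```lua
-- Rough footprint calculation (assumes 4x4 transposed convolutions with
-- stride 2 and padding 1 in the decoder, as in the released U-Net).
-- A block of f units projects onto 2*(f - 1) + 4 = 2*f + 2 output pixels
-- after each up-convolution.
local f = 1
for depth = 1, 7 do
  f = 2 * f + 2
  print(string.format('%d up-convolutions above the unit: ~%dx%d px', depth, f, f))
end
-- 4, 10, 22, 46, 94, 190, 382 -> a single bad value sitting four
-- up-convolutions below the output would cover roughly 46x46 pixels,
-- which is close to the ~48x48 artifact I'm seeing.
```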

Quasimondo commented 7 years ago

I've looked a bit deeper into the outputs of the convolutional layers, and whenever the artifact appears there is indeed a single value outside the typical numeric range, which shows up as a white pixel in the normalized output tensor:

[Screenshot: layer 14 output]

Since it appears in several cells, I assume the error is in one of the two convolutional layers before it, but visually it is hard to tell from those:

[Screenshot: layer 13 output]

[Screenshot: layer 12 output]
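For anyone who wants to reproduce this, the per-layer check is roughly the following (untested sketch; the checkpoint path is just an example, and a GPU-trained checkpoint may need cunn/cudnn required and converting before this works):

```lua
-- Sketch: run one forward pass and print each module's output range, to see
-- in which layer the outlier value first blows up.
require 'nn'
require 'nngraph'   -- the U-Net generator is an nngraph gModule
-- require 'cunn'; require 'cudnn'   -- likely needed for GPU-trained checkpoints

local netG = torch.load('checkpoints/mymodel/latest_net_G.t7'):float()  -- example path
local input = torch.FloatTensor(1, 3, 256, 256):uniform(-1, 1)          -- or a real sample
netG:forward(input)

for i, m in ipairs(netG:listModules()) do
  if torch.isTensor(m.output) and m.output:nElement() > 0 then
    print(string.format('%3d %-45s min %10.3f max %10.3f',
      i, torch.type(m), m.output:min(), m.output:max()))
  end
end
```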

kdmojdehi commented 7 years ago

I'm having the same issue on another dataset (single-channel output). The artifacts look like finger stains in my grayscale images; they sit at the same location in every image, but their intensity varies from image to image.

Quasimondo commented 7 years ago

I suspect these artifacts might be caused by SpatialBatchNormalization when using the default batchSize of 1. Since I started training my models with a bigger batchSize, I have not seen these artifacts anymore.
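For reference, if I read train.lua correctly, it picks up its options from environment variables with a one-line parser (in the dcgan.torch style), so a larger batch size needs no code changes. A minimal sketch of that mechanism (the defaults shown are only a subset):

```lua
-- How train.lua picks up its options (roughly): defaults in a table,
-- overridden by environment variables, so a larger batch size can be
-- requested from the shell, e.g.:
--   DATA_ROOT=./datasets/facades name=facades batchSize=4 th train.lua
local opt = { batchSize = 1, niter = 200 }   -- subset of the real defaults
for k, v in pairs(opt) do
  opt[k] = tonumber(os.getenv(k)) or os.getenv(k) or opt[k]
end
print(opt.batchSize)  -- 4 when batchSize=4 is set in the environment
```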

royal-feng commented 7 years ago

@Quasimondo Hello, I have the same problem as you, and I would like to ask another question. I have been watching errG, errD and errL1, and I cannot tell whether the GAN has converged; errG, errD and errL1 hover around 0.7, 1.2 and 0.02, and the test results are blurry. Is lambda set to 100 in order to balance errG against errL1? And how should lambda be adjusted? According to the mean errG and errL1, so that they end up at the same magnitude?

Quasimondo commented 7 years ago

Sorry, I don't know. The only thing I can say is that in my experience lambda seems to balance global against local structure. If I remember correctly, decreasing the lambda value gives you better texture details, but their locations might be off (e.g. an eye may move to an unnatural position). Overall, the lambda value has quite an influence on the final look.
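For reference (if I remember the paper correctly), the full objective is `G* = arg min_G max_D L_cGAN(G, D) + lambda * L_L1(G)` with lambda = 100 by default, so lambda only scales how strongly the L1 term pulls the output towards the ground truth. I also believe the printed errG and errL1 are the raw, unweighted GAN and L1 losses, so there is no particular reason to expect them to sit at the same magnitude.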

junyanz commented 7 years ago

I removed the batchnorm after the bottleneck. It should help address this problem.

Quasimondo commented 7 years ago

Thanks, I will try this. Do you know if there is a simple way in torch to remove (or at least disable) this batchnorm from a loaded pre-trained model that still contains it? Otherwise this solution will only work for new models that I train from scratch.
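In the meantime, this is the rough approach I would try (untested sketch; the checkpoint path is only an example, and replacing a batchnorm with nn.Identity drops its learned scale/shift, so some fine-tuning will probably be needed afterwards):

```lua
-- Untested sketch: swap SpatialBatchNormalization modules in a loaded
-- checkpoint for nn.Identity(). Note this drops the learned scale/shift of
-- the batchnorm, so the result will probably need some fine-tuning.
require 'nn'
require 'nngraph'

local netG = torch.load('checkpoints/mymodel/latest_net_G.t7')  -- example path

-- For a plain nn container, the findModules pattern from the nn docs works:
local bns, containers = netG:findModules('nn.SpatialBatchNormalization')
for i = 1, #bns do
  for j = 1, #containers[i].modules do
    if containers[i].modules[j] == bns[i] then
      containers[i].modules[j] = nn.Identity()
    end
  end
end

-- The U-Net generator is an nngraph gModule, though, where the forward pass
-- runs over graph nodes, so the reference probably also has to be swapped on
-- the node itself (and ideally only for the batchnorm after the bottleneck):
for _, node in ipairs(netG.forwardnodes or {}) do
  if node.data.module and
     torch.type(node.data.module) == 'nn.SpatialBatchNormalization' then
    node.data.module = nn.Identity()
  end
end

torch.save('checkpoints/mymodel/latest_net_G_nobn.t7', netG)
```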

royal-feng commented 7 years ago

Increasing the lambda value should be equivalent to increasing the weight of errL1, and according to the paper the L1 loss makes the fake images more blurry; is my understanding right? And the errG term contributes more texture detail in order to deceive netD?

Cristy94 commented 7 years ago

Did you manage to solve this issue? I trained a model for several days; it was getting better and better, and then suddenly artifacts started to appear (vertical, colored, 10px-wide dashed lines). Is there any way to remove them without restarting training from zero? :(

Quasimondo commented 7 years ago

No, I did not look into ways to fix them any further, since they only seem to happen when using a batchSize of 1. At least I have not run into them anymore since I increased my batch sizes.

My theoretical approach to fixing it would be to identify the "dead" weights or biases inside the affected layers and replace them with a random number or maybe the mean of the surrounding weights. The question is of course how to identify the addresses of the affected values.
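Something along these lines is what I have in mind (untested sketch; the checkpoint path is an example and the 6-sigma threshold is an arbitrary guess):

```lua
-- Sketch: look for extreme outliers in the weights/biases of a loaded model
-- and replace them with the layer mean.
require 'nn'
require 'nngraph'

local netG = torch.load('checkpoints/mymodel/latest_net_G.t7')  -- example path

for i, m in ipairs(netG:listModules()) do
  for _, name in ipairs({'weight', 'bias'}) do
    local t = m[name]
    if torch.isTensor(t) and t:nElement() > 1 then
      local mean, std = t:mean(), t:std()
      local mask = torch.abs(t - mean):gt(6 * std)  -- ByteTensor mask of outliers
      local n = mask:sum()
      if n > 0 then
        print(string.format('%3d %-40s %s: replacing %d outlier value(s)',
          i, torch.type(m), name, n))
        t:maskedFill(mask, mean)
      end
    end
  end
end

torch.save('checkpoints/mymodel/latest_net_G_patched.t7', netG)
```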

kritin2 commented 4 years ago

This was happening for me even with a batch size of 64. I am trying to convert face images into mask-wearing face images. What should I do?

mrgloom commented 3 years ago

It's interesting that we get square-shaped artifacts here.

There seems to be some related discussion in the StyleGAN2 paper (https://arxiv.org/pdf/1912.04958.pdf), where they say "Instance normalization causes water droplet like artifacts"; however, they show high activations in the activation maps, not in the weights, and their use of IN is different. In https://arxiv.org/pdf/1807.09251.pdf they say: "We also observed that changing batch normalization in the generator by instance normalization improved training stability."

Maybe we can somehow regularize / smooth the weights / activations?

> I removed the batchnorm after the bottleneck. It should help address this problem.

What is the motivation behind this? Is it because of this? https://github.com/phillipi/pix2pix/commit/b479b6b7d37f9d7e87dce7f5e627dc3bb7b4a117#commitcomment-21887022

mykeehu commented 2 years ago

Is there any solution for this problem? I use a batch size of 4 now, but if there is a more reliable solution, I would appreciate it. I am currently using this framework, and I have to use pix2pix, not another GAN (I want to replace an old trained model). I'm not a programmer; I just need to modify this.

I tried using InstanceNormalization instead of BatchNormalization in the downsample and upsample functions, but I couldn't replace it in the discriminator. Although the quality was nice, it produced quite extreme distortions in some cycles due to the BatchNormalization in the discriminator.