bitlater opened this issue 5 years ago
@bitlater I've just started training the model myself, and it does appear to be learning a bit better than yours. Have you changed anything at all, apart from turning off MongoDB usage and replacing `unsup_ir` with `unir`? In particular, did you change the batch size to make the model fit in the GPU? I recall reading that batch size influences GAN quality, with larger batch sizes being preferable.
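If GPU memory is what forced a smaller batch, one generic workaround (not something this repo does, as far as I know) is gradient accumulation, which preserves the effective batch size at the cost of different batch-norm statistics. A minimal PyTorch sketch, with stand-in model/loss/data:

```python
import torch
import torch.nn as nn

# Minimal stand-ins; the real generator, loss, and dataloader would come from the repo.
model = nn.Linear(8, 8)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loader = [(torch.randn(16, 8), torch.randn(16, 8)) for _ in range(8)]

accum_steps = 4  # effective batch size = 16 * 4 = 64

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = criterion(model(x), y) / accum_steps  # scale so accumulated grads average out
    loss.backward()                              # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```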
Hope this helps.
Hi @bitlater, @phosgene89, and @pajotarthur, I'm working with some classmates to reproduce the results of this paper as part of a class project.
We are also struggling to reproduce the results described in the paper, using the same hyperparameters it reports. Over several experiments we see what @bitlater described: MSE plateaus around 0.26–0.27 with no further improvement.
Any advice? Are there any differences between the code base you published here and the code used to produce the results in your paper? What loss behavior over epochs did you observe in your experiments? Any guidance would be greatly appreciated.
I can share the code we're using if necessary (it's currently private since it's used for a school project), but I have checked that it is nearly identical to the code in this repo. The only differences were made to get the code to run, e.g. due to some apparent changes in PyTorch/Torchvision behavior. For example, in the discriminator backward pass we had to add a `.contiguous()` call because a conv layer would no longer accept a (non-contiguous) view (more details in this thread):

```python
pred_fake = self.netD(self.fake_sample.detach().contiguous())
```
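For context, the surrounding discriminator step looks roughly like this. Only the `.contiguous()` call is the actual change; the helper signature and everything around it is our paraphrase of the repo's GAN-style update, with e.g. `gan_loss = nn.BCEWithLogitsLoss()`:

```python
import torch

def backward_D(netD, gan_loss, real_sample, fake_sample):
    # .detach() stops gradients flowing into the generator; .contiguous()
    # copies the detached view into contiguous memory, which avoids the
    # conv-layer error we hit on newer PyTorch versions.
    pred_fake = netD(fake_sample.detach().contiguous())
    loss_fake = gan_loss(pred_fake, torch.zeros_like(pred_fake))

    pred_real = netD(real_sample)
    loss_real = gan_loss(pred_real, torch.ones_like(pred_real))

    loss_d = 0.5 * (loss_fake + loss_real)
    loss_d.backward()
    return loss_d
```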
We also had to wrap the image-saving code in `torch.no_grad()`:

```python
with torch.no_grad():
    vutils.save_image(ims, path, scale_each=True, normalize=True, nrow=dl.batch_size)
```
However, we are under the impression that these changes shouldn't affect the model performance, and there aren't any other significant code changes on our side.
Here are the results on CelebA with the remove-pixel-dark corruption (which removes whole pixels):

You can see the reconstructed images are very similar to the corrupted input (the measurement):

Here are the results on CelebA with the remove-pixel-channel corruption (which removes individual pixel channels):

Again, the reconstructed images are very similar to the corrupted input (the measurement):
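To make the failure mode concrete, here is a minimal sketch of the two corruptions as we understand them (the masking probability and exact distributions in the authors' code are assumptions on our part), plus the quick check we use to confirm the network has collapsed toward reproducing the measurement:

```python
import torch

def remove_pixel_dark(img, p=0.5):
    """Zero out whole pixels: one mask shared across all channels (p is assumed)."""
    mask = (torch.rand(1, img.shape[1], img.shape[2]) > p).float()
    return img * mask

def remove_pixel_channel(img, p=0.5):
    """Zero out individual channel values: an independent mask per channel."""
    mask = (torch.rand_like(img) > p).float()
    return img * mask

# Diagnostic: if the reconstruction is much closer to the measurement than
# to the clean image, the model is copying its input rather than inpainting.
clean = torch.rand(3, 64, 64)
measurement = remove_pixel_dark(clean)
reconstruction = measurement.clone()  # placeholder for the model output
mse_to_measurement = torch.mean((reconstruction - measurement) ** 2).item()
mse_to_clean = torch.mean((reconstruction - clean) ** 2).item()
print(f"MSE vs measurement: {mse_to_measurement:.4f}, vs clean: {mse_to_clean:.4f}")
```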
We did not know the exact package versions used in the paper's analysis, so we based our `requirements.txt` on @phosgene89's fork (thank you!).
`requirements.txt`:

```
torch==1.7.0
torchvision==0.8.1
numpy==1.19.4
imageio==2.9.0
sacred==0.8.1
pymongo==3.11.0
scikit-image==0.17.2
scipy==1.5.4
pyyaml==6.0
```
@pajotarthur hi! I tried to reproduce the performance on the CelebA dataset, keeping all the parameters as you define them. Unfortunately I haven't succeeded: over up to 660 epochs of training on 1 GPU, the MSE stayed about the same the whole time (around 0.266), and the output images stayed the same too (as you can see below). So my question is: how can one reproduce your results, e.g. on CelebA images? Is something perhaps wrong with the model's default parameters? Thank you.
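For reference, I launch runs through sacred's standard override mechanism; the entry-point name and config keys below are placeholders from memory rather than verbatim from the repo, but the `with key=value` CLI syntax is sacred's own:

```python
# e.g. launched as: python main.py with dataset=celeba batch_size=32
from sacred import Experiment

ex = Experiment("unir_repro")  # experiment name is a placeholder

@ex.config
def config():
    dataset = "celeba"   # assumed config key, not verbatim from the repo
    batch_size = 32      # assumed config key

@ex.automain
def main(dataset, batch_size):
    # Real training would go here; sacred injects config values by name.
    print(f"training on {dataset} with batch size {batch_size}")
```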