wanhaozhou opened 5 years ago
Hi, please find below a review submitted by one of the reviewers:
Score: 6
Reviewer 1 comment: This report makes an effort to reproduce the main results in the paper titled "Improving Generalization and Stability of Generative Adversarial Networks" by Thanh-Tung et al. (2019).
This reproducibility work focuses mostly on understanding the theoretical contributions of the original paper, which is understandable given that the nature of the original paper is to better understand the role of 0-GP in GAN optimization. Thus, this work provides a clear introduction and review of preliminary material and of one of the main propositions in the original paper. While these sections are interesting, they are mostly summarizations and do not provide any further insight into the paper.
As the original paper contains a very detailed outline of network architectures as well as exact hyperparameter settings, it makes for easy testing for reproducibility purposes. With that said, this reproducibility work clearly lacks one of the main experiments in the original work: the mixture of 8 Gaussians. As this is a synthetic toy experiment often used in many other GAN papers, it is unclear why it was omitted in this reproducibility effort.

Further, with regard to the MNIST experiments, the results for 0-GP as proposed in the original paper are qualitatively much better than in this reproducibility work. There is no attempt to explain why this is the case, nor any ablation studies with different hyperparameter settings to better understand the cause of such differences.

Finally, this work chooses to tackle CIFAR-10 as its third image dataset, one which was not considered by the original authors. The authors explain their choice of CIFAR-10 over ImageNet as due to a lack of computational resources, which this reviewer finds to be a credible explanation. However, the number of experiments and the analysis on CIFAR-10 are again minimal. It would have been interesting to see how robust 0-GP is compared to other GANs under different choices of hyperparameters and settings. In particular, questions such as how sensitive training is to the D and G learning rates in a TTUR schedule, and to the number of D updates per G update, are interesting to consider, as these are typical design decisions when training many other GANs.
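The two ablation knobs the reviewer suggests could be sketched as follows. This is a minimal, hypothetical training-loop skeleton: the learning rates, the `n_critic` value, and the function name are illustrative assumptions, not the report's actual configuration.

```python
# Hypothetical sketch of the two knobs the review mentions: a TTUR
# schedule (separate learning rates lr_d and lr_g for D and G) and the
# number of D updates per G update (n_critic). Values are illustrative.

def train_gan_sketch(steps, lr_d=4e-4, lr_g=1e-4, n_critic=5):
    """Count how many D and G updates a given schedule performs."""
    d_updates = g_updates = 0
    for step in range(steps):
        for _ in range(n_critic):   # D is updated n_critic times (at lr_d)...
            d_updates += 1
        g_updates += 1              # ...before each single G update (at lr_g)
    return d_updates, g_updates

d_upd, g_upd = train_gan_sketch(steps=100)
print(d_upd, g_upd)  # 500 D updates vs. 100 G updates
```

An ablation along the reviewer's suggestion would sweep `lr_d / lr_g` and `n_critic` and compare sample quality and training stability across the resulting schedules.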
Confidence : 4
Score: 6
Reviewer 2 comment: The report attempts to reproduce the paper "Improving Generalization and Stability of Generative Adversarial Networks". The report includes a nice summary of (and motivation for) the original paper, indicating that the authors understood the key problem statement. While the report reproduces a subset of the results (given that the original paper is also very detailed), it would be useful to consider some more hyperparameter settings/ablations, especially for the CIFAR-10 dataset, as it was not a part of the original paper. The report is well written in general.
Confidence : 2
Score: 8
Reviewer 3 comment: TA Review
The report is well-written, with emphasis on both the theoretical and experimental parts of the reviewed paper. The authors explain the generalization problems with GANs and the various gradient penalty schemes proposed to address them. The motivation to use a gradient penalty is clearly explained. The authors also try the experiments on new datasets, which is a good way to verify the results.
While gradient penalties are well-motivated, the use of one-centred gradient penalty is not clear. The authors say it "was proposed to fulfil the requirement of Lipschitz condition instead of the originally used gradient weight clipping technique", but without any description of the weight clipping technique, this does not make sense.
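For context, the missing background is from the WGAN literature rather than from the report itself: weight clipping constrains the critic's weights to a box to enforce a Lipschitz bound, while the one-centred gradient penalty (WGAN-GP) instead penalizes deviations of the input-gradient norm from 1. A minimal NumPy sketch with a linear discriminator D(x) = w·x, whose gradient with respect to the input is exactly w, makes the contrast concrete (function names and constants are illustrative):

```python
import numpy as np

# For a linear discriminator D(x) = w @ x, the gradient of D w.r.t. the
# input x is simply w, so both Lipschitz-enforcing schemes reduce to
# operations on w itself.

def weight_clip(w, c=0.01):
    """Original WGAN trick: hard-clip every weight into [-c, c]."""
    return np.clip(w, -c, c)

def one_centred_gp(w, lam=10.0):
    """WGAN-GP penalty term: lam * (||grad_x D(x)|| - 1)^2, here ||w||."""
    return lam * (np.linalg.norm(w) - 1.0) ** 2

w = np.array([3.0, 4.0])     # ||w|| = 5, so this D is 5-Lipschitz
clipped = weight_clip(w)     # -> [0.01, 0.01]: crude, destroys scale
penalty = one_centred_gp(w)  # -> 10 * (5 - 1)^2 = 160: soft penalty
print(clipped, penalty)
```

Clipping enforces the constraint crudely by capping every weight, whereas the penalty softly pushes the gradient norm toward 1; a sentence to this effect in the report would make the quoted claim self-contained.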
From the discussion section, it seems that the conclusion is that 0-GP performs better than 0-GP sample, which is also the observation in the main paper. In that context, I do not understand Observation 2 on page 7.
In Section 3A, the points 1 and 2 can be written more formally as lemmas.
For gradient explosion, one should define the explosion property (even if trivial) before the proof.
The formulation of the optimization problem is weirdly written: define the loss function first, and then the min-max problem. In Eqns. 1 and 2, the RHS lacks the min-max part.
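For reference, the standard presentation the reviewer is asking for (from Goodfellow et al., 2014, not from the report) defines the value function first and then states the min-max problem over it:

```latex
V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
        + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right],
\qquad
\min_G \max_D \; V(D, G).
```

Writing Eqns. 1 and 2 in this two-step form would keep the loss definition and the min-max operator from being conflated on one side of the equation.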
Finally, in point 2 of the conclusion, you seem to have misunderstood the paper's point when it says 1-GP improves generalization. The statement is made in reference to Fig. 1 of the paper, and the authors mention that it is not necessarily true in higher-dimensional spaces.
The code is well-written, but modularizing it would be better.
The results are mostly reproducible.
On running an experiment for the synthetic Gaussian case, I found that 1-GP gives a smoother value surface than the others, similar to the observation from Fig. 1 in the paper. Perhaps this is due to the low dimensionality of the data?
NB: This TA review has been provided by the institution directly, and the authors have communicated with the reviewers regarding changes/updates.
Confidence : 4
Issue #84
Title: Improving Generalization and Stability of Generative Adversarial Networks
Site: https://epfml17.hotcrp.com/paper/491
Link to code: https://github.com/wanhaozhou/Machine-Learning/blob/master/project2/Improving_Generalization_And_Stability_Of_GAN.ipynb