DarthSid95 / RumiGANs

Code accompanying the NeurIPS 2020 submission "Teaching a GAN What Not to Learn."
MIT License

Questions about the paper: Teaching a GAN What Not to Learn #2

choyi0521 closed this issue 3 years ago

choyi0521 commented 3 years ago

I have a few questions about the derivations and the experiments in the paper.

Q1. p^*_g in Lemma 3.1 is less than 0 when (1+\alpha^-) p^+_d(x) - \alpha^- p^-_d(x) < 0. How should I understand the derivation under this condition?

Q2. According to the paper, the authors set \alpha^- = -0.2 for Rumi-SGAN in the MNIST experiment (positive: 1, 2, 4, 5, 7, 9; negative: 0, 2, 3, 6, 8, 9). However, the generator should produce both positive and negative samples, since p^*_g(x) > 0 on the images of all digits. Why is the desired result that Rumi-SGAN produces only the positive class?
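Spelling out the arithmetic behind this (using the expression from Lemma 3.1 as I quoted it in Q1, so I may be omitting an \alpha^+ weighting):

```latex
% With \alpha^- = -0.2, the expression quoted in Q1 becomes a fixed mixture
% of the two class densities (already normalized, since the weights sum to 1):
\begin{align*}
  p_g^*(x) = (1+\alpha^-)\,p_d^+(x) - \alpha^-\,p_d^-(x)
           = 0.8\,p_d^+(x) + 0.2\,p_d^-(x) > 0
  \quad \text{wherever either class has mass.}
\end{align*}
```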

DarthSid95 commented 3 years ago

Hey,

Sorry I took so long to reply. I had some personal stuff going on in May and sort of forgot to keep track of this place. Anyway, I realize it's been about 2 months already, but if the doubts are still there, I'll answer them here. At the very least, it might help someone out in the future.

For Q1, there are additional conditions on \alpha^+ and \alpha^- that prevent this from occurring: \alpha^+ \in [0,1] and \alpha^- \geq \alpha^+ - 1, which essentially sets \alpha^- \in [-1,0]. This makes p_g^* a convex combination of two distributions, which is always non-negative and integrates to 1.
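Writing that step out explicitly (using the expression quoted in Q1):

```latex
% For \alpha^- \in [-1, 0], write w = -\alpha^- \in [0, 1]. Then
\begin{align*}
  p_g^*(x) &= (1+\alpha^-)\,p_d^+(x) - \alpha^-\,p_d^-(x)
            = (1-w)\,p_d^+(x) + w\,p_d^-(x) \;\ge\; 0, \\
  \int p_g^*(x)\,dx &= (1-w)\int p_d^+(x)\,dx + w\int p_d^-(x)\,dx = (1-w) + w = 1.
\end{align*}
```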

As for Q2, I'm not too sure where this doubt stems from. All the images from Rumi-SGAN are in the Supplementary, and even in the few sample images there, negative-class samples do leak into the generator output (cf. Supplementary Fig. 1(b), Row 3, Col. 1 or Row 3, Col. 7 for an easy-to-identify case where a Rumi-GAN with even digits as positive has odd digits show up; for the overlapping case, I also see a stray '0' in Supp. Fig. 2(b), Row 2, Col. 9). It's possible that more instances of class overlap would be visible if you looked at many more images, but in the paper we didn't try to quantify whether the ratio of generated samples actually matches the mixing ratio of \beta^+ and \beta^-.
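If you (or anyone reading later) do want to quantify the leak, below is a rough sketch of how one could estimate the class-mixing ratio of the generator output. `generate_samples` and `classify_digits` are placeholders for whatever trained generator and pretrained MNIST classifier you have on hand; they are not functions from this repo.

```python
import numpy as np

# Digit sets from the experiment discussed above.
POSITIVE = {1, 2, 4, 5, 7, 9}
NEGATIVE = {0, 2, 3, 6, 8, 9}

def mixing_ratio(generate_samples, classify_digits, n_samples=10_000, batch=500):
    """Estimate what fraction of generated digits fall in the positive-only,
    negative-only, and overlapping digit sets.

    generate_samples(n) -> array of n generated images.
    classify_digits(images) -> array of n predicted digit labels (0-9).
    Both are placeholders for your own generator / pretrained classifier.
    """
    counts = {"positive_only": 0, "negative_only": 0, "overlap": 0}
    seen = 0
    while seen < n_samples:
        n = min(batch, n_samples - seen)
        labels = classify_digits(generate_samples(n))
        for d in np.asarray(labels).ravel():
            d = int(d)
            if d in POSITIVE and d in NEGATIVE:
                counts["overlap"] += 1
            elif d in POSITIVE:
                counts["positive_only"] += 1
            elif d in NEGATIVE:
                counts["negative_only"] += 1
        seen += n
    return {k: v / seen for k, v in counts.items()}
```

Comparing these fractions against the 0.8 / 0.2 split implied by \alpha^- = -0.2 (or against \beta^+ and \beta^-) would give a direct check, though classifier errors add some noise.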

choyi0521 commented 3 years ago

Thank you for the reply!

I'm still confused about the meaning of "not to learn" in the title. For the above case (positive: 1, 2, 4, 5, 7, 9; negative: 0, 2, 3, 6, 8, 9), the overlapping digits are 2 and 9. I wonder if a Rumi-GAN can, in some configuration, sample positive digits while excluding those overlapping ones (i.e., produce only 1, 4, 5, 7). That is what "not to learn" means as I understood it. At least under the conditions from the answer to Q1 (\alpha^- \in [-1,0]), this seems impossible.

DarthSid95 commented 3 years ago

Your understanding of "what not to learn" is absolutely correct. Before I start, a couple of points to make: Rumi-GANs are just "the idea." Rumi-SGAN and Rumi-LSGAN are the specific instances that use the original GAN's cross-entropy loss and the LSGAN's least-squares loss, respectively (and, in the Supplementary, Rumi-WGAN, which uses the WGAN-GP loss).

The thing is, and we discuss this in Sec. 3.3, Rumi-SGAN (and Rumi-WGAN as well) can't perfectly separate the two classes; the "perfect" separation only occurs when those models are trained exclusively on the target class's samples alone. These corner cases are nothing special, and just amount to typical GAN training.

However, for Rumi-LSGAN, there are certain choices of the \betas and the labels (a, b^+, b^-, c) that result in learning only p_d^+ (or only p_d^-, as the case may be). That is why all the results in the main paper pertain to Rumi-LSGAN, while the "not-so-interesting" discussions of Rumi-SGAN and Rumi-WGAN are pushed to the Supplementary. (The experiments in the Supplementary also reinforce this: Rumi-LSGAN can successfully learn only p_d^+, while Rumi-SGAN and Rumi-WGAN cannot.)
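To make the role of the labels concrete, here is a schematic of least-squares losses with split real targets, written from the standard LSGAN template using the (a, b^+, b^-, c) notation above; the exact \beta-weighted objective is in the paper and may differ in detail:

```latex
% Schematic Rumi-style least-squares losses (standard LSGAN template with
% split real targets; the paper's exact \beta-weighted form may differ):
\begin{align*}
  \min_D \;&
    \tfrac{1}{2}\,\mathbb{E}_{x \sim p_d^+}\!\big[(D(x) - b^+)^2\big]
  + \tfrac{1}{2}\,\mathbb{E}_{x \sim p_d^-}\!\big[(D(x) - b^-)^2\big]
  + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\big[(D(G(z)) - a)^2\big], \\
  \min_G \;&
    \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\big[(D(G(z)) - c)^2\big].
\end{align*}
% Intuitively, setting the fake target c near b^+ and away from b^- is what
% steers the generator toward p_d^+ alone; the precise conditions are the
% subject of Lemma 3.2, per the discussion above.
```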

So yes, your conclusion is correct, and it is also what we conclude in Sec. 3.3: we can't achieve perfect separation with the formulation in Lemma 3.1, but we can achieve it with the formulation in Lemma 3.2.

While we claim that, objectively, Rumi-LSGAN would be the best pick if we want to learn only the positive class, there could definitely be instances where a bit of mixing is acceptable. (Face-image datasets come to mind, where delta-overlapping features aren't as visually identifiable as they are in, say, MNIST.)

choyi0521 commented 3 years ago

I appreciate your sincere explanation. Now that everything has been resolved, I will close the issue.