depthfirstlearning / depthfirstlearning.com


Added Basic MNIST GAN for Saturating and Non-Saturating losses #17

Closed maxisawesome closed 6 years ago

maxisawesome commented 6 years ago

I removed the sentence saying "Let us know if you did this!" but you might wanna keep that. Lemme know if the notebook is good enough. I couldn't get the bundle to run locally, but I'm 95% sure the markdown I wrote will be fine. My b

maxisawesome commented 6 years ago

I added the DFL logo to the readme! It's a little blurry, but nbd to me. SVGs are kinda odd on GitHub, so I didn't use the SVG; I used a smaller PNG instead.

cinjon commented 6 years ago

Thanks!

cinjon commented 6 years ago

Also, would you mind uploading a copy of the colab as an ipynb? Then we could change the link to reflect it. This would be helpful because then there's always a static version.

maxisawesome commented 6 years ago

By bundle I mean I ran `bundle exec jekyll serve` to preview the page locally, but I couldn't get Jekyll to update correctly (IIRC). I added a small "Max wrote this" and "End what Max wrote" to the IPython notebook, and other than that just a few added comments in the training loop. I just put the notebook in the main directory - wasn't sure of a more specific place that was better.

cinjon commented 6 years ago

It's hard for me to comment on the ipynb directly, so I'm posting here instead:

  1. "It is responsible for creating convincing images that are good enough to fool the discriminator." --> remove "are good enough to".
  2. "so the GAN is a pretty simple neural network (there's plenty of other stuff going on to make things complex, though!). It consists only of fully connected layers." --> "so the GAN is a pretty simple neural network consisting of only fully connected layers."
  3. "good results.We use tanh" --> "good results. We use tanh"
  4. "The discriminator is also constucted of only fullly connected layers, with some dropout to help training."
    • s/constucted/constructed
    • s/fullly/fully
    • "... connected layers, with some dropout to help training." --> "... connected layers. It additionally uses dropout to help with generalization."
  5. Remove "The two networks just need to be constructed to perform their duties of generating and discriminating."
  6. In the review of the original GAN paper (the minimax objective is written out in full after this list):
    • some parts of the LaTeX are importing badly. It says "Unknown character" repeatedly.
    • while it's not a bad idea to include this review, it would probably be better if you rewrote and reworded it here instead of pasting the image.
    • remove "It is included below."
    • "It was helpful to remind myself what D(x) returns: a probability that x is a real image." --> "D(x) returns its probability that x is a real image."
    • "Intuitively, we want D(x), where x is a real image, to be high, while we want D(G(z)) to be low, where z is a noise vector, G(z) is that noise vector run through the generator, and thus a fake image." --> "Intuitively, if x is a real image and z is a noise vector, we want D(x) to be high and D(G(z)) to be low."
    • I think we can get rid of this line as well because it's a rehash of what's in the copied image. --> "Note that we are "ascending" (ie maximizing) for the discriminator updates while we are "descending" (ie minimizing) for the generator updates, consistent with the definition of the "minimax" game defined above. The summation part of the updates is simply averaging over the batch."
  7. s/"vectors of probabilities"/"probability vectors"
  8. "The first is the probabilities that a batch of real images are real (want this to be high), and the second is the probability a batch of fake images are real (want this to be low)." --> "The first are the probabilities for a batch of real images being real, and the second are the probabilities a batch of fake images are real."
  9. Add periods to the last two sentences in Discriminator Loss.
  10. Generator loss:
    • "The generator which has two options of loss function." --> "The generator has two options of loss function."
    • I think it would be better to get rid of the saturating loss as people don't use it in practice.
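For reference, the minimax objective from the original GAN paper that items 6 and 10 refer to is:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator ascends this value while the generator descends it, which is what the quoted "ascending"/"descending" line is describing.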
maxisawesome commented 6 years ago

  • some parts of the LaTeX are importing badly. It says "Unknown character" repeatedly.

I don't know... I don't see this. I added some LaTeX of the gradients. Let me know if you see that properly.

  • while it's not a bad idea to include this review, it would probably be better if you rewrote and reworded it here instead of pasting the image.

I removed the image and wrote in the gradient updates with some brief explanation.
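For reference, the per-batch updates from the paper are: ascend

```latex
\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m}
  \Big[\log D\big(x^{(i)}\big) + \log\Big(1 - D\big(G\big(z^{(i)}\big)\big)\Big)\Big]
```

for the discriminator and descend

```latex
\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m}
  \log\Big(1 - D\big(G\big(z^{(i)}\big)\big)\Big)
```

for the generator, where m is the batch size.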

  • I think we can get rid of this line as well because it's a rehash of what's in the copied image. --> "Note that we are "ascending" (ie maximizing) for the discriminator updates while we are "descending" (ie minimizing) for the generator updates, consistent with the definition of the "minimax" game defined above. The summation part of the updates is simply averaging over the batch."

Deleted the image, so I added a little bit of this back in.

  • I think it would be better to get rid of the saturating loss as people don't use it in practice.

I added a comment that says "While the saturating loss appears in the original GAN paper, it is no longer used in practice. It is included for completeness." I can also just delete it if you'd like; the original request was to demonstrate the difference between the losses, which was why I put it in there at all.
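To make the difference concrete, here's a minimal sketch of the two generator losses, assuming a PyTorch-style setup where `d_fake = D(G(z))` holds the discriminator's probabilities that a batch of fakes is real (the names here are illustrative, not taken from the notebook):

```python
import torch

def saturating_g_loss(d_fake):
    # Original GAN paper: minimize log(1 - D(G(z))).
    # Early in training, D confidently rejects the fakes (d_fake near 0),
    # so this loss flattens out and the generator's gradients vanish.
    return torch.log(1.0 - d_fake).mean()

def non_saturating_g_loss(d_fake):
    # Practical variant: maximize log D(G(z)), i.e. minimize -log D(G(z)).
    # This gives strong gradients exactly when the generator is doing badly.
    return -torch.log(d_fake).mean()
```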

cinjon commented 6 years ago

Thanks for taking care of all of those! lgtm now. (I still see the unknown-character errors here, but when I loaded it into a Jupyter display, they were gone, so it might be a GitHub issue.)