howardyclo opened this issue 6 years ago
After reading the paper "Adversarial Contrastive Estimation" (#23), which replaces the fixed noise generator in noise contrastive estimation (NCE) with a dynamic noise generator trained adversarially (GAN-style), some questions naturally arose in my mind: "How does NCE relate to GANs?" and "NCE is closely related to MLE; how about GANs?"
This paper compares MLE, NCE, and GANs, and gives several initial answers to those questions.
In conclusion, the analysis shows that GANs are not as closely related to NCE as previously believed.
Notes:
- You also need to read Notes on NCE (the last comment at #23) in order to understand this paper. Those notes are supplementary to #24.
- The gradient of NCE approaches the gradient of MLE as the number of noise samples grows (as shown in the paper "A fast and simple algorithm for training neural probabilistic language models").
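The second bullet can be checked numerically. Below is a minimal sketch (my own toy construction, not from the paper or the notes): a softmax model over a 5-symbol vocabulary, where the expected NCE gradient is computed in closed form and compared to the expected MLE gradient as the noise-sample count k grows.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 5                                   # toy vocabulary size (hypothetical)
theta = rng.normal(size=V)              # model logits
p_theta = np.exp(theta) / np.exp(theta).sum()
p_d = rng.dirichlet(np.ones(V))         # "data" distribution
q = np.full(V, 1.0 / V)                 # uniform noise distribution

# Expected MLE gradient w.r.t. the logits of a softmax model: p_d - p_theta.
g_mle = p_d - p_theta

# Expected NCE gradient: the MLE gradient reweighted per symbol by
# k*q / (p_theta + k*q), a factor that tends to 1 as k -> infinity.
diffs = {}
for k in [1, 10, 100, 10_000]:
    w = k * q / (p_theta + k * q)
    coeff = w * (p_d - p_theta)
    g_nce = coeff - p_theta * coeff.sum()   # chain rule through the softmax
    diffs[k] = np.abs(g_nce - g_mle).max()
    print(f"k={k:>6}  max |g_nce - g_mle| = {diffs[k]:.2e}")
```

The gap shrinks roughly like 1/k, matching the statement that NCE's gradient approaches the MLE gradient as the number of noise samples grows.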
Sorry for the inconsistent notation.
Note: Asymptotically consistent estimator: See https://en.wikipedia.org/wiki/Consistent_estimator
Note: See the derivation in Notes on NCE (#23).
Note: There is an error in the derivation of SCE's expected gradient: the term 1/2 E_{x~p_g} log (p_g(x)) should be 1/2 E_{x~p_g} ∂/∂θ log (p_g(x)).
See derivation in the paper.
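As an aside (my own note, not from the paper): if p_g is properly normalized, the corrected term in the note above actually vanishes by the score-function identity, since

```latex
% Score-function identity (a general fact; its role in the paper's
% derivation is my assumption):
\mathbb{E}_{x\sim p_g}\!\left[\frac{\partial}{\partial\theta}\log p_g(x)\right]
  = \int p_g(x)\,\frac{\partial}{\partial\theta}\log p_g(x)\,dx
  = \int \frac{\partial}{\partial\theta}\,p_g(x)\,dx
  = \frac{\partial}{\partial\theta}\int p_g(x)\,dx
  = \frac{\partial}{\partial\theta}\,1
  = 0
```

so the term 1/2 E_{x~p_g} ∂/∂θ log (p_g(x)) contributes zero to the expected gradient when the expectation is taken at the same θ being differentiated.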