Thanks for open-sourcing your great work. I'm slightly confused about the derivation of Eq.13 in the paper, which derives the joint distribution as:
$$p_\theta(x,y)=\frac{\exp(f_\theta(x)[y])}{Z(\theta)}$$
from the classifier $p_\theta(y|x)=\frac{\exp(f_\theta(x)[y])}{\sum_{y'}\exp(f_\theta(x)[y'])}$ and the marginal $p_\theta(x)=\frac{\exp(-E_\theta(x))}{Z(\theta)}$. At first glance, this equation seems to hold only if $\sum_{y'}\exp(f_\theta(x)[y'])=\exp(-E_\theta(x))$. However, according to [1], it seems that this identity is itself what they derive from the joint distribution, i.e., they derive the energy-based generative model from the classifier.
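To spell out my concern: if I multiply the two expressions as written, and assume the same $Z(\theta)$ appears in both the marginal and the joint (this is my reading, not something I could confirm from the paper), I get
$$p_\theta(y|x)\,p_\theta(x)=\frac{\exp(f_\theta(x)[y])}{\sum_{y'}\exp(f_\theta(x)[y'])}\cdot\frac{\exp(-E_\theta(x))}{Z(\theta)},$$
which reduces to $\frac{\exp(f_\theta(x)[y])}{Z(\theta)}$ only when $\exp(-E_\theta(x))=\sum_{y'}\exp(f_\theta(x)[y'])$, i.e. $E_\theta(x)=-\log\sum_{y'}\exp(f_\theta(x)[y'])$.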
I am a little confused about this formulation and would really appreciate it if you could provide a full derivation of the joint distribution. Thanks again for releasing your work, and thanks in advance for the clarification.
Reference
[1] Grathwohl, Will, et al. “Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One.” arXiv:1912.03263 (2019).