Closed: tiagodsilva closed this issue 3 months ago
Hi, thanks for your interest in our work! The trained neural network $p_\theta$ is approximately self-normalized. To get the true NLL in Table 1, we evaluate it based on how the model generates the data, i.e. either using the conditional network $p_\phi$ or using the normalized conditionals derived from $p_\theta$ (such as in Appendix B.2 of https://proceedings.mlr.press/v235/liu24az.html).
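For concreteness, here is a sketch of what "normalized conditionals derived from $p_\theta$" can look like, assuming a fixed left-to-right ordering over $D$ variables; the ordering and masking details are illustrative and not necessarily the exact scheme used in the paper:

$$
p_\theta(x_i = v \mid x_{<i}) = \frac{p_\theta(x_1, \dots, x_{i-1},\, v,\, \square, \dots, \square)}{\sum_{v'} p_\theta(x_1, \dots, x_{i-1},\, v',\, \square, \dots, \square)},
\qquad
\mathrm{NLL}(x) = -\sum_{i=1}^{D} \log p_\theta(x_i \mid x_{<i}).
$$

Because each conditional is normalized by the explicit sum over values, any global constant factor in $p_\theta$ cancels in the ratio, so the resulting NLL is a proper log-likelihood even when $p_\theta$ is only approximately self-normalized.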
I almost forgot about this issue, haha. But thanks for answering! The results in Table 1 are clear to me now. Very nice paper!
Thanks! Sorry I didn't get proper notification of your initial post. Hopefully it is not too late!
Hi!
Great paper with a very interesting approach to probabilistic modelling. I am trying, though, to wrap my head around how to account for the normalization constant and how you ensure that $p_{\theta}$ is a normalized distribution (so that the log-likelihood can be evaluated properly).
In the paper, you wrote as a footnote on page 4 that you can either enforce $p_{\theta}(\square, \dots, \square) = 1$ or let $Z_{\theta} = p_{\theta}(\square, \dots, \square)$ be the normalization constant. Which approach did you pursue?
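Just to spell out my reading of those two options (the notation below is my own, based on the footnote):

$$
\text{(i)}\;\; p_{\theta}(\square, \dots, \square) = 1 \;\;\text{(self-normalized by construction)},
\qquad
\text{(ii)}\;\; \log p(x) = \log p_{\theta}(x) - \log Z_{\theta}, \;\; Z_{\theta} = p_{\theta}(\square, \dots, \square).
$$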
In particular, how do you evaluate the NLL in, e.g., Table 1?
Thanks!