distillpub / post--handwriting

Four Experiments in Handwriting with a Neural Network
https://distill.pub/2016/handwriting/
Creative Commons Attribution 4.0 International

The "Variation" parameter does not actually correspond to Boltzmann temperature #6

Open samuela opened 7 years ago

samuela commented 7 years ago

According to the source code (https://github.com/distillpub/post--handwriting/blob/master/public/assets/model/model.js#L137), the variation parameter actually corresponds to applying the temperature to the mixture probabilities and multiplying all of the sigma_x and sigma_y values by the temperature. But this is not the same as adjusting a mixture of Gaussians by the temperature. This can easily be seen by considering a "mixture" of a single 1-D Gaussian: raising its density to the power 1/T and renormalizing yields a Gaussian with variance sigma^2 * T, so the standard deviation of the tempered distribution is sigma * sqrt(T), not sigma * T.
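
A quick numerical check makes this concrete (a sketch added for illustration, assuming NumPy; not part of the original comment): tempering N(0, sigma^2) with T = 4 should double the standard deviation, not quadruple it.

```python
import numpy as np

# Sketch: Boltzmann-temper a single 1-D Gaussian, p_T(x) ~ p(x)^(1/T),
# renormalize on a grid, and measure the resulting standard deviation.
T, sigma = 4.0, 1.0
x = np.linspace(-50, 50, 200_001)
dx = x[1] - x[0]

p = np.exp(-x**2 / (2 * sigma**2))   # unnormalized N(0, sigma^2) density
p_T = p ** (1.0 / T)                 # apply the temperature
p_T /= p_T.sum() * dx                # renormalize numerically

std = np.sqrt(np.sum(x**2 * p_T) * dx)
print(std)  # ~2.0 = sigma * sqrt(T), not sigma * T = 4.0
```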

colah commented 7 years ago

@hardmaru -- can you look at this?

hardmaru commented 7 years ago

That's a good point, and thanks for the catch. In fact, in the Sketch-RNN paper we changed the definition to use sigma * sqrt(\tau), but have not updated the handwriting models accordingly. A pull request has been created to resolve this issue.
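
For reference, a minimal Python sketch of the per-component tempering described above (the actual change lives in model.js; the helper and variable names here are hypothetical):

```python
import numpy as np

def temper_mixture(pi, sigma, tau):
    """Hypothetical helper: per-component tempering of a Gaussian mixture.
    pi = mixture weights, sigma = per-component std devs, tau = temperature."""
    pi_t = pi ** (1.0 / tau)
    pi_t /= pi_t.sum()              # temper and renormalize the weights
    sigma_t = sigma * np.sqrt(tau)  # scale by sqrt(tau), not tau
    return pi_t, sigma_t
```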

samuela commented 7 years ago

@hardmaru I'm not sure this quite solves the issue, though. Unless I'm very mistaken,

$$\sum_k \frac{\pi_k^{1/\tau}}{\sum_j \pi_j^{1/\tau}} \, \mathcal{N}\!\left(x \mid \mu_k,\ \tau \sigma_k^2\right)$$

is not proportional to

$$\left( \sum_k \pi_k \, \mathcal{N}\!\left(x \mid \mu_k,\ \sigma_k^2\right) \right)^{1/\tau}.$$

I grant that it's a very sensible parameter to adjust, and it's easily understood as applying a temperature to both the \pi_k and the mixture-component distributions. But it's not the same thing as the Boltzmann temperature of the full mixture distribution, as the footnote suggests.
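
One way to see the gap numerically (a toy sketch with made-up parameter values, assuming NumPy; not from the original thread): if the two expressions were proportional, the ratio of the two normalized densities would be constant in x.

```python
import numpy as np

def gauss(x, mu, sigma):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-6.0, 7.0, 130_001)
dx = x[1] - x[0]
pi = np.array([0.3, 0.7]); mu = np.array([-2.0, 3.0])
sigma = np.array([1.0, 0.5]); tau = 2.0

# Per-component tempering: temper the weights, scale sigmas by sqrt(tau).
pi_t = pi ** (1.0 / tau); pi_t /= pi_t.sum()
p_comp = sum(w * gauss(x, m, s * np.sqrt(tau)) for w, m, s in zip(pi_t, mu, sigma))

# Boltzmann tempering of the full mixture, renormalized on the grid.
p_full = sum(w * gauss(x, m, s) for w, m, s in zip(pi, mu, sigma)) ** (1.0 / tau)
p_full /= p_full.sum() * dx

ratio = p_comp / p_full
print(ratio.min(), ratio.max())  # far from a constant ratio
```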

hardmaru commented 7 years ago

The footnote "Temperature is most commonly discussed in Boltzmann distributions, but can be generalized to all probability distributions" does not imply that the temperature used here is the Boltzmann temperature of the full mixture distribution. We can consider rephrasing that sentence to avoid misunderstanding.