ericjang / gumbel-softmax

categorical variational autoencoder using the Gumbel-Softmax estimator
MIT License

Is there an empirically good temperature? #2

Closed wsjeon closed 7 years ago

wsjeon commented 7 years ago

Thank you for your interesting work! :)

I wonder whether a temperature very close to 0 (e.g., 1e-20) causes backpropagation errors in practice.

In addition, is there a particular temperature you would recommend?
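
To make the concern concrete, here is a quick NumPy illustration (values chosen arbitrarily) of what happens to the softmax at a temperature like 1e-20: dividing by the temperature blows the logits up to ~1e20, all non-argmax probabilities underflow to exactly 0, and the sample saturates to a one-hot, so the gradient through it is zero almost everywhere (or NaN without the max-subtraction trick):

```python
import numpy as np

logits = np.array([1.0, 0.5, -0.2])   # arbitrary class logits
gumbel = np.array([0.3, -1.2, 0.8])   # one fixed draw of Gumbel(0, 1) noise

for tau in [1.0, 0.5, 1e-20]:
    y = (logits + gumbel) / tau       # at tau=1e-20 these are ~1e20 in magnitude
    y = y - y.max()                   # stabilized softmax (avoids inf/NaN)
    p = np.exp(y) / np.exp(y).sum()
    print(tau, p)                     # tau=1e-20 prints an exact one-hot
```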

ericjang commented 7 years ago

Empirically, it depends on the number of classes. For K=10, we find that a fixed temperature between 0.5 and 1.0 works pretty well. You can squeeze out some additional performance by gradually annealing the temperature to 0.5 over the course of training. If you ultimately care about discrete inference, make sure you monitor validation accuracy on the quantized (i.e., hard=True) graph.
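
For anyone landing here later, a minimal NumPy sketch of the recipe described above; `tau0`, `min_temp`, and `anneal_rate` are illustrative values, not constants taken from the paper or this repo:

```python
import numpy as np

def gumbel_softmax_sample(logits, tau, hard=False, eps=1e-20):
    """Draw one Gumbel-Softmax sample; hard=True returns the quantized one-hot."""
    u = np.random.uniform(size=logits.shape)
    g = -np.log(-np.log(u + eps) + eps)      # Gumbel(0, 1) noise via inverse transform
    y = (logits + g) / tau
    y = y - y.max(axis=-1, keepdims=True)    # numerical stability
    soft = np.exp(y) / np.exp(y).sum(axis=-1, keepdims=True)
    if hard:
        # quantized sample for evaluation; in a training graph you would use the
        # straight-through trick so gradients still flow through `soft`
        return np.eye(logits.shape[-1])[soft.argmax(axis=-1)]
    return soft

# anneal tau from 1.0 down to 0.5 over training (illustrative constants)
tau0, min_temp, anneal_rate = 1.0, 0.5, 1e-4
for step in range(100000):
    tau = max(min_temp, tau0 * np.exp(-anneal_rate * step))
    # ... train with gumbel_softmax_sample(logits, tau) and periodically
    # evaluate validation accuracy with hard=True ...
```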

wsjeon commented 7 years ago

Thanks :)