kefirski / pytorch_RVAE

Recurrent Variational Autoencoder that generates sequential data implemented with pytorch
MIT License
357 stars 87 forks

Coefficient for cross entropy term. #2

Closed ruotianluo closed 7 years ago

ruotianluo commented 7 years ago

https://github.com/analvikingur/pytorch_RVAE/blob/master/model/rvae.py#L110

How do you get the coefficient 79?

kefirski commented 7 years ago

It is not a scientifically justified approach, just an engineering hack. When you optimize the ELBO, you must not average log p(x|z) over the sequence length (as is common in language modelling tasks), otherwise the model collapses. That means you have to feed the model constant-sized sequences filled with a huge number of padding tokens. I experimented with various approaches that would let me train the VAE on sequences of different lengths, e.g. averaging the whole ELBO together with the KL divergence, but found that adding the coefficient 79 was the most effective.

In general, I was trying to scale the NLL, so I grid-searched coefficients close to the max sequence length, which is 83 (or something close to it, I have already forgotten the exact value).
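To make the scaling concrete, here is a minimal sketch, not the code from `rvae.py`. It assumes the cross entropy term is a per-token average over the padded sequence; the function names, the toy NLL values, and the `kld_weight` parameter are hypothetical and only serve as illustration.

```python
def scaled_vae_loss(token_nlls, kld, nll_coef=79.0, kld_weight=1.0):
    """Per-token mean NLL multiplied by a fixed coefficient, plus weighted KL.

    token_nlls: per-token negative log-likelihoods of one padded sequence
                (toy values here, assumed to include padding positions).
    nll_coef:   the hand-tuned coefficient discussed in this thread.
    """
    mean_nll = sum(token_nlls) / len(token_nlls)
    return nll_coef * mean_nll + kld_weight * kld


def summed_vae_loss(token_nlls, kld, kld_weight=1.0):
    """Reference form: NLL summed over the sequence, not averaged."""
    return sum(token_nlls) + kld_weight * kld


MAX_LEN = 83                    # max sequence length mentioned above (approximate)
token_nlls = [2.0] * MAX_LEN    # toy per-token NLL values
kld = 5.0                       # toy KL divergence value

print(scaled_vae_loss(token_nlls, kld))                 # averaged NLL scaled by 79
print(summed_vae_loss(token_nlls, kld))                 # NLL summed over 83 positions
print(scaled_vae_loss(token_nlls, kld, nll_coef=83.0))  # matches the summed form
```

Under these assumptions, a coefficient equal to the padded length reproduces the summed NLL exactly, so 79 behaves like a slightly down-weighted sum: the reconstruction term stays on the scale of a whole sequence rather than a single token.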

ruotianluo commented 7 years ago

Thank you very much.