casperkaae / parmesan

Variational and semi-supervised neural network toppings for Lasagne

address NaN issues #32

Closed thjashin closed 8 years ago

thjashin commented 8 years ago

I ran into several NaN issues while using the parmesan utilities. These are the fixes. Note that in T.clip I use 1e-6 instead of 1e-8, because float32 cannot distinguish 1.0 from 1.0 - 1e-8.
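As a quick sanity check of the float32 claim (a NumPy/Theano sketch, not the actual patch):

```python
import numpy as np
import theano.tensor as T

# float32 spacing just below 1.0 is 2**-24 ~= 6e-8, so subtracting 1e-8
# rounds straight back to 1.0, and log(1 - p) can still hit log(0):
print(np.float32(1.0) - np.float32(1e-8) == np.float32(1.0))  # True
print(np.float32(1.0) - np.float32(1e-6) == np.float32(1.0))  # False

# Clipping with 1e-6 keeps both log(p) and log(1 - p) finite in float32:
p = T.matrix('p')                     # Bernoulli means, e.g. decoder outputs
p_safe = T.clip(p, 1e-6, 1.0 - 1e-6)
```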

thjashin commented 8 years ago

Do you mean like this?

casperkaae commented 8 years ago

Yes! Just with eps=0.0 as the default. That way we keep the default behaviour as it currently is.
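For concreteness, a sketch of the resulting interface (names and body are approximate, not necessarily the exact parmesan code):

```python
import theano.tensor as T

def log_bernoulli(x, p, eps=0.0):
    """Bernoulli log-density: x*log(p) + (1 - x)*log(1 - p).

    eps=0.0 preserves the old, unclipped behaviour; pass e.g. eps=1e-6
    (recommended for float32) to guard against log(0) -> -inf/NaN.
    """
    p = T.clip(p, eps, 1.0 - eps)
    return -T.nnet.binary_crossentropy(p, x)
```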

thjashin commented 8 years ago

done

wuaalb commented 8 years ago

Maybe it would be a good idea to document the values you had hardcoded before as recommendations (for float32) in the docstring? Especially that the Bernoulli likelihood needs 1e-6 is perhaps not completely obvious.

I think for VAEs I've also seen the Gaussian inference network use a softplus with a small offset for the variance output, which I guess accomplishes something similar.
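A minimal sketch of that parameterization (layer sizes and the 1e-5 offset are illustrative assumptions, not values from this thread):

```python
import theano.tensor as T
from lasagne.layers import InputLayer, DenseLayer

# Variance/std head that can never reach exactly zero: softplus(x) + offset.
def softplus_offset(x, offset=1e-5):
    return T.nnet.softplus(x) + offset

l_in = InputLayer((None, 784))
l_hid = DenseLayer(l_in, num_units=200, nonlinearity=T.tanh)
# sigma >= offset > 0 by construction, so log(sigma) and 1/sigma stay finite
l_sigma = DenseLayer(l_hid, num_units=20, nonlinearity=softplus_offset)
```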

casperkaae commented 8 years ago

@wuaalb, @thjashin, btw, if you'd prefer eps=1e-6 as the default value, then I'm also open to that solution?

wuaalb commented 8 years ago

I'm not really sure what's best.

Btw, the iw_vae.py and iw_vae_normflow.py examples already include the 1e-6 clipping for the Bernoulli log-likelihood, so those should be modified as well if default clipping is added.

thjashin commented 8 years ago

Actually I'd prefer 1e-6 as the default value, but that might change the behaviour of the current examples, although I think that's unlikely in practice.

casperkaae commented 8 years ago

After a bit of thought, I think we should keep eps=0.0 as the default, so the "correct" value is computed by default.

Thanks for the help.

wuaalb commented 8 years ago

I noticed Lasagne tends to use epsilon rather than eps (likewise std vs. sd), FWIW.