torch / optim

A numeric optimization package for Torch.

Reverted to zero mean squared values init #130

Closed · iassael closed this 8 years ago

iassael commented 8 years ago

Reverted to zero initialisation of the mean squared values state.m.

Since the gradients get divided by this moving average, it might seem sensible to initialise it with ones, as that avoids taking large steps at the beginning.

However, this change resulted in a substantial decrease in the performance of many of my projects, and I'm sure it is breaking others as well.
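For context, here is a minimal sketch of the RMSProp step under discussion (illustrative only, not the exact optim.rmsprop source); the commented-out line shows the ones init that was reverted:

```lua
require 'torch'

-- Minimal sketch of the rmsprop update: m is the leaky moving average of
-- squared gradients, and the step divides the gradient by sqrt(m) + eps.
local function rmsprop_step(x, dfdx, state, lr, alpha, epsilon)
   state.m = state.m or torch.Tensor():typeAs(x):resizeAs(dfdx):fill(0)   -- zero init (the reverted default)
   -- state.m = state.m or torch.Tensor():typeAs(x):resizeAs(dfdx):fill(1) -- ones init (the change that hurt performance)
   state.m:mul(alpha):addcmul(1 - alpha, dfdx, dfdx)   -- m = alpha*m + (1-alpha)*g^2
   -- with zero init, sqrt(m) is about sqrt(1-alpha)*|g| on the first step,
   -- so the first update is roughly lr/sqrt(1-alpha) per parameter;
   -- with ones init the first update is roughly lr*g instead.
   x:addcdiv(-lr, dfdx, torch.sqrt(state.m):add(epsilon))
   return x
end

-- one step on dummy parameters/gradients
local x, g = torch.randn(5), torch.randn(5)
rmsprop_step(x, g, {}, 1e-2, 0.99, 1e-8)
```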

soumith commented 8 years ago

i'll patch in a configurable value that defaults to 0.

soumith commented 8 years ago

@iassael @andreaskoepf see https://github.com/torch/optim/commit/0154acd51b80f97f5f52752c0a4f5af68d48b03f
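A rough sketch of what a configurable initialisation could look like; the config key name `initialMean` is only a placeholder here, the actual option name and default are in the linked commit:

```lua
require 'torch'

-- Illustrative only: threading a configurable initial value through the
-- state.m initialisation. The key name 'initialMean' is an assumption;
-- see the linked commit for the actual option name and default.
local function init_rmsprop_state(x, dfdx, config, state)
   config = config or {}
   state = state or {}
   local mfill = config.initialMean or 0   -- default 0 keeps the old zero init
   if not state.m then
      state.m = torch.Tensor():typeAs(x):resizeAs(dfdx):fill(mfill)
      state.tmp = torch.Tensor():typeAs(x):resizeAs(dfdx)
   end
   return state
end

-- e.g. init_rmsprop_state(x, dfdx, {initialMean = 1}, state) reproduces the ones init
```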

iassael commented 8 years ago

Faster than his shadow. Thank you.

andreaskoepf commented 8 years ago

@soumith a configurable initial mean is very reasonable. But I think the main "problem" behind the slow adaptation towards the true mean is the default alpha of 0.99. As discussed before, I think 0.9 would be a more reasonable default, but changing that would probably also break existing projects. I have switched from rmsprop to Adam in most of my projects. @iassael do you have some models where rmsprop performs better than Adam?
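As a rough illustration of the alpha point: if the squared gradient were constant at some value v, the moving average after t steps would be (1 - alpha^t) * v, so the number of steps needed to reach 90% of v is log(0.1) / log(alpha):

```lua
-- Back-of-the-envelope: steps for state.m to reach 90% of a constant
-- squared-gradient value, i.e. t = log(0.1) / log(alpha).
print(math.log(0.1) / math.log(0.99))  -- ~229 steps with alpha = 0.99
print(math.log(0.1) / math.log(0.9))   -- ~22 steps with alpha = 0.9
```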

iassael commented 8 years ago

@andreaskoepf in several deep RL projects we have seen much better performance using RMSProp, but I'm sure that with a problem-dependent parameter search the difference could be evened out.