dgriff777 / rl_a3c_pytorch

A3C LSTM Atari with Pytorch plus A3G design
Apache License 2.0
563 stars 119 forks

eps for Adam #29

Open jsuit opened 6 years ago

jsuit commented 6 years ago

Is there a reason why the default for eps in the Adam optimizer is so high? Currently, it is 1e-3 [line 104 in shared_optim.py]. Usually, it's around 1e-08. Just wanted to see whether this was done intentionally (e.g., it works better than a lower value) or not.

dgriff777 commented 5 years ago

The epsilon value 1e-3 is actually often my default choice for the Adam optimizer, and I find it helps with stability. Although 1e-08 is often listed as the default for Adam, it's not a strongly suggested best choice, and it's commonly known not to be the best choice in many cases. In my experience it has never been the best choice in my various use cases.
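
For anyone wondering why a larger eps might help with stability, here is a rough sketch of the first Adam update for a single scalar gradient. It uses the standard Adam update rule rather than the code in shared_optim.py, and the function name `adam_first_step` and the learning rate of 1e-4 are assumptions picked only for illustration.

```python
import math


def adam_first_step(grad, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Size of the first Adam update for one scalar gradient.

    Standard Adam rule: step = lr * m_hat / (sqrt(v_hat) + eps).
    The lr value is an illustrative assumption, not taken from the repo.
    """
    m = (1 - beta1) * grad            # first-moment estimate after one step
    v = (1 - beta2) * grad * grad     # second-moment estimate after one step
    m_hat = m / (1 - beta1)           # bias correction, roughly grad
    v_hat = v / (1 - beta2)           # bias correction, roughly grad ** 2
    return lr * m_hat / (math.sqrt(v_hat) + eps)


# Compare the two eps values for a large, a small, and a tiny gradient.
for grad in (1.0, 1e-3, 1e-6):
    small_eps = adam_first_step(grad, eps=1e-8)
    large_eps = adam_first_step(grad, eps=1e-3)
    print(f"grad={grad:g}  eps=1e-8: {small_eps:.3e}  eps=1e-3: {large_eps:.3e}")
```

With eps=1e-8 the step stays close to lr even when the gradient is tiny, because the gradient is divided by its own magnitude. With eps=1e-3 the step shrinks once the gradient falls well below eps, so near-zero (possibly noisy) gradients move the weights much less. That is one plausible reading of the stability effect, not a claim about what happens in this particular training setup.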

ppwwyyxx commented 5 years ago

Also, in https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer:

The default value of 1e-8 for epsilon might not be a good default in general. For example, when training an Inception network on ImageNet a current good choice is 1.0 or 0.1.