Reccomendation for the `eps_root` setting for differentiating through the Adam optimizer

google-deepmind / optax

Optax is a gradient processing and optimization library for JAX.

https://optax.readthedocs.io

Apache License 2.0

1.65k stars 181 forks source link

Reccomendation for the `eps_root` setting for differentiating through the Adam optimizer #882

Closed itk22 closed 6 months ago

itk22 commented 6 months ago

Dear Optax team, I am working on implementing Model-Agnostic Meta-Learning in my project, and I noticed that setting the inner loop optimizer to the default Adam optimizer in Optax results in nan values in the meta-gradients. This is covered well in the documentation, which mentions that the eps_root should be set to a small constant to avoid dividing by zero when rescaling. Could you please recommend a good default value for eps_root in a meta-learning scenario?

holounic commented 6 months ago

Hi Igor, thanks for reaching out! 1e-8 might be a suitable choice for this case.

itk22 commented 6 months ago

Thank you, I will keep on using this for my experiments then :)