I'm going through and updating the optimizers, related to #306 and #482, and wanted to list the changes I've been making, along with some questions. See https://github.com/njtierney/greta/tree/swap-optimizers for the current pull request. Note that this is based off of #482, as I'm trying to make each PR a little bit more modularised and easier to review. I'm not sure of the best way to display those changes on GitHub.

- gradient_descent() gains the arguments momentum and nesterov. See the Python docs (and the sketch after this list).
- The adadelta optimizer in {greta} has a default of rho = 1, but in the Python documentation this is 0.95 - should we change the default value?
- The adagrad optimizer in {greta} has a default learning_rate of 0.8, but in the TF docs it is 0.0001.
- AdagradDAOptimizer is not made available in TensorFlow 2.0, so I'm not sure if we should get rid of it?
- The adam optimizer's learning rate is 0.001 in the TF docs, not 0.1 as in {greta}.
- ftrl uses learning_rate = 0.001, not 1 (see the docs).
- rms_prop uses a learning rate of 0.1 in {greta}, but in the TF docs it uses 0.001.
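For reference, here is a minimal sketch of the TF2 constructors these map onto (I'm assuming the tf.keras.optimizers classes here). The learning-rate values just restate the documentation figures quoted above, and the momentum/nesterov settings are purely illustrative, so none of this is a decision about what greta's defaults should be:

```python
import tensorflow as tf

# TF2 keras optimizer constructors corresponding to the greta wrappers listed above.
# Learning rates are the documentation values quoted in the list; momentum and
# nesterov for SGD are illustrative only.
sgd      = tf.keras.optimizers.SGD(momentum=0.9, nesterov=True)    # gradient_descent
adadelta = tf.keras.optimizers.Adadelta(rho=0.95)                  # adadelta
adagrad  = tf.keras.optimizers.Adagrad(learning_rate=0.0001)       # adagrad
adam     = tf.keras.optimizers.Adam(learning_rate=0.001)           # adam
ftrl     = tf.keras.optimizers.Ftrl(learning_rate=0.001)           # ftrl
rmsprop  = tf.keras.optimizers.RMSprop(learning_rate=0.001)        # rms_prop

# There is no AdagradDA class in tf.keras.optimizers, which is why
# AdagradDAOptimizer may have to be dropped.
```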
Should we remove the following optimizers, since they aren't supported in TF2?

I'm not confident that the new interface to bfgs and nelder_mead will work, but currently I cannot test this, as we still need to make fundamental changes to the internals of greta to work with eager mode.
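In case it helps once we can test: assuming the new interface goes through TensorFlow Probability (that is my assumption, and the toy objective, starting values, and tolerance below are made up for illustration, not what greta's opt() would pass), the TF2-era calls would look roughly like this:

```python
import tensorflow as tf
import tensorflow_probability as tfp

# Toy objective: minimise sum((x - 2)^2). bfgs_minimize wants a function that
# returns both the value and the gradient, which value_and_gradient provides.
def objective(x):
    return tf.reduce_sum((x - 2.0) ** 2)

start = tf.zeros(3, dtype=tf.float64)

bfgs_result = tfp.optimizer.bfgs_minimize(
    lambda x: tfp.math.value_and_gradient(objective, x),
    initial_position=start,
    tolerance=1e-8,
)

nm_result = tfp.optimizer.nelder_mead_minimize(
    objective,
    initial_vertex=start,
)

print(bfgs_result.converged.numpy(), bfgs_result.position.numpy())
print(nm_result.converged.numpy(), nm_result.position.numpy())
```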
It looks like there is a new interface to optimizers in TF 2.9?
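If that is referring to the rewritten Keras optimizer classes (my assumption), they appear in TF 2.9 under an experimental namespace, e.g.:

```python
import tensorflow as tf

# Assumption: the "new interface" means the rewritten optimizer base class that
# TF 2.9 exposes under tf.keras.optimizers.experimental.
opt = tf.keras.optimizers.experimental.Adam(learning_rate=0.001)
```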
This has been resolved as of https://github.com/njtierney/greta/commit/eef698f23effbc97595d61c3377cdfc1c8686de1