Description
Hi there,
I've been working on a project still on TF 1.4 (oh dear, I know! Turning this into working multi-GPU code is perhaps the last thing I'll do with it before moving to another framework --_--" (yes, Trax, for instance :D)), and was glad to be able to import the AdaFactor optimizer from T2T. I'm now considering experimenting with multistep Adam, and I noticed there are two versions, multistep_with_adamoptimizer.py and multistep_optimizer.py: one multistep-Adamizes the superclass tf.train.Optimizer, while the other 'only' multisteps tf.train.AdamOptimizer. Is there any difference between the two, and which one would you recommend I use?

Thanks tons in advance!
Jeremie
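For reference, my mental model of what both files implement is gradient accumulation: sum gradients over n mini-batches, then apply a single Adam update, emulating an n-times-larger batch. Here's a minimal self-contained sketch of that pattern as I understand it (my own illustration with made-up names, not the T2T code):

import tensorflow as tf  # TF 1.x API

def build_multistep_train_op(loss, n_steps=4, learning_rate=1e-3):
    """Accumulate gradients over n_steps batches, then apply one Adam update."""
    opt = tf.train.AdamOptimizer(learning_rate)
    tvars = tf.trainable_variables()
    grads = tf.gradients(loss, tvars)

    # One non-trainable accumulator per variable, plus a step counter.
    accums = [tf.Variable(tf.zeros_like(v), trainable=False) for v in tvars]
    counter = tf.Variable(0, trainable=False, dtype=tf.int32)

    accum_ops = [a.assign_add(g) for a, g in zip(accums, grads)]
    with tf.control_dependencies(accum_ops):
        new_count = counter.assign_add(1)

    def apply_and_reset():
        # Average the accumulated gradients and take a single Adam step.
        avg_grads_and_vars = [(a / float(n_steps), v)
                              for a, v in zip(accums, tvars)]
        apply_op = opt.apply_gradients(avg_grads_and_vars)
        with tf.control_dependencies([apply_op]):
            reset_ops = ([a.assign(tf.zeros_like(a)) for a in accums]
                         + [counter.assign(0)])
        with tf.control_dependencies(reset_ops):
            return tf.constant(True)

    def accumulate_only():
        return tf.constant(False)

    # Apply Adam only on every n_steps-th call; otherwise just accumulate.
    return tf.cond(tf.equal(new_count, n_steps), apply_and_reset, accumulate_only)

(You would sess.run(train_op) every batch, with an actual parameter update landing on every n_steps-th run.) My question is whether the two T2T files differ from this, or from each other, in any way that matters in practice.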
Environment information
TF 1.4, to work with V100s. Ubuntu 18.04.