tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

MultiStep Optimizer #1863

Closed jchwenger closed 4 years ago

jchwenger commented 4 years ago

Description

Hi there,

I've been working on a project still in TF 1.4 (oh dear, I know! An attempt to turn this into working multi-gpu code, perhaps the last thing I'll do with this before moving to another framework --_--", (yes, Trax for instance :D)), and was glad to be able to import the Adafactor optimizer from T2T. I'm now considering experimenting with the multistep Adam, and I noticed there are two versions, multistep_with_adamoptimizer.py and multistep_optimizer.py: one implements multistep Adam directly on top of the tf.train.Optimizer superclass, while the other 'only' adds the multistep logic to tf.train.AdamOptimizer. Is there any difference between the two, and which one would you recommend I use?
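For context, the behaviour I'm after is plain gradient accumulation, something like this sketch (TF 1.x graph mode; I'm assuming the `n` constructor argument from multistep_optimizer.py is the number of mini-batches accumulated before each Adam update, and the toy regression model is just for illustration):

```python
import numpy as np
import tensorflow as tf
from tensor2tensor.utils.multistep_optimizer import MultistepAdamOptimizer

# Toy linear-regression graph, purely illustrative.
x = tf.placeholder(tf.float32, shape=[None, 10])
y = tf.placeholder(tf.float32, shape=[None, 1])
w = tf.get_variable("w", shape=[10, 1])
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

# Behaves like tf.train.AdamOptimizer, except (assuming n works as above)
# an Adam update is applied only every n calls to the train op; in between,
# gradients are just accumulated.
opt = MultistepAdamOptimizer(learning_rate=1e-3, n=4)
train_op = opt.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Each call accumulates one batch's gradients; the weights should only
    # change on every 4th call.
    for step in range(8):
        sess.run(train_op, feed_dict={x: np.random.randn(32, 10),
                                      y: np.random.randn(32, 1)})
```

If that's the right mental model, then the multistep variant should be a drop-in replacement wherever a tf.train.Optimizer is expected, which is what makes it attractive for the multi-gpu setup above.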

Thanks tons in advance! Jeremie

...

Environment information

TF 1.4, running on V100s. Ubuntu 18.04.

jchwenger commented 4 years ago

Hi again, I hadn't actually seen this, which explains it all.