apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Adding support for AdaBelief optimizer #19950

Open brightening-eyes opened 3 years ago

brightening-eyes commented 3 years ago

Description

This optimizer is similar to Adam, but with some differences: it adapts the step size to the "belief" in the current gradient direction, tracking an EMA of the squared deviation of the gradient from its own EMA instead of Adam's EMA of the squared gradient. A one-step sketch follows.
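
For illustration, here is a minimal NumPy sketch of one AdaBelief step, following the update rule in the paper (the variable names are ours, not from any MXNet code); the only change relative to Adam is the `s` update:

```python
import numpy as np

def adabelief_step(w, g, m, s, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaBelief update. Adam would use g**2 in place of (g - m)**2."""
    m = beta1 * m + (1 - beta1) * g                   # EMA of the gradient
    s = beta2 * s + (1 - beta2) * (g - m) ** 2 + eps  # EMA of squared deviation from that EMA
    m_hat = m / (1 - beta1 ** t)                      # bias correction, step t >= 1
    s_hat = s / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(s_hat) + eps)
    return w, m, s
```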

References

Here are references to the paper and its GitHub repository.

TristonC commented 3 years ago

@brightening-eyes It seems like a small change from the Adam optimizer; the tests will be most of the work. Would you like to make a contribution?

brightening-eyes commented 3 years ago

@TristonC I would, but I don't know how optimizers are implemented in MXNet (I don't know about create_state, etc.). If anyone tells me how to go about it, I would be more than glad to do it.

TristonC commented 3 years ago

Can it just be done in the Python file first?

brightening-eyes commented 3 years ago

Yes, it can, as far as I know: you subclass the Optimizer class. But I don't know how to handle the states and the optimizer parameters.
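
For reference, a minimal sketch of such a subclass against MXNet 1.x's imperative NDArray API is below. `create_state` and `update` are the real `Optimizer` hooks; the body is an untested illustration of the AdaBelief rule, not the code that eventually landed:

```python
import mxnet as mx

@mx.optimizer.Optimizer.register
class AdaBelief(mx.optimizer.Optimizer):
    def __init__(self, beta1=0.9, beta2=0.999, epsilon=1e-8, **kwargs):
        super().__init__(**kwargs)
        self.beta1, self.beta2, self.epsilon = beta1, beta2, epsilon

    def create_state(self, index, weight):
        # Per-parameter state: EMA of the gradient (m) and of the
        # squared deviation of the gradient from m (s).
        return (mx.nd.zeros(weight.shape, weight.context, dtype=weight.dtype),  # m
                mx.nd.zeros(weight.shape, weight.context, dtype=weight.dtype))  # s

    def update(self, index, weight, grad, state):
        self._update_count(index)
        lr, wd = self._get_lr(index), self._get_wd(index)
        t = self._index_update_count[index]

        m, s = state
        grad = grad * self.rescale_grad + wd * weight
        m[:] = self.beta1 * m + (1 - self.beta1) * grad
        s[:] = self.beta2 * s + (1 - self.beta2) * mx.nd.square(grad - m) + self.epsilon
        weight[:] -= (lr * (m / (1 - self.beta1 ** t))
                      / (mx.nd.sqrt(s / (1 - self.beta2 ** t)) + self.epsilon))
```

Once registered, the optimizer can be selected by its lowercase name, e.g. `mx.gluon.Trainer(net.collect_params(), 'adabelief', {'learning_rate': 1e-3})`.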

khaotik commented 3 years ago

@TristonC Hello, first-time contributor to MXNet here. I'm working on a PR for this. Here's my branch. It's mostly modeled on src/operator/contrib/adamw*. I'm still working on test cases.

Question: shall I send the PR as is, or perform some code refactoring first? I found a rather high amount of code duplication while working on this PR; the same goes for Adam-like optimizers such as STES (#18486). It might be a good idea to generalize the common parts of Adam-like optimizers (one possible shape is sketched below).
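
To make that concrete, one hedged illustration of what a shared piece could look like (the helper name and the split are assumptions, not existing MXNet code):

```python
import mxnet as mx

def ema_moments(grad, m, s, beta1, beta2, centered=False, eps=0.0):
    # Shared moment update for Adam-like optimizers: centered=False gives
    # Adam's s = EMA(g**2); centered=True gives AdaBelief's s = EMA((g - m)**2) + eps.
    m[:] = beta1 * m + (1 - beta1) * grad
    dev = grad - m if centered else grad
    s[:] = beta2 * s + (1 - beta2) * mx.nd.square(dev) + eps
    return m, s
```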

TristonC commented 3 years ago

@khaotik Thanks for contributing to the MXNet community.

For your question: either way works for your first PR. We will have someone take a look at the change to see whether the refactoring needs to be done now.

@sandeep-krishnamurthy Do you have someone to help with the code review?