brightening-eyes opened 3 years ago
@brightening-eyes It seems a small change from Adam optimizer, while the test part is most of the work. Would you like to make a contribution?
@TristonC I would, but I don't know how optimizers are implemented in MXNet (I don't know about `create_state`, etc.). If anyone tells me how to go about it, I would be more than glad to do it.
Can it just be done in the Python file first?
Yes, it can, as far as I know. You should subclass the optimizer, but I don't know how to handle the states and optimizer parameters.
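For anyone picking this up, the subclassing pattern being discussed can be sketched roughly as follows. This is a minimal illustration, assuming the MXNet 1.x `Optimizer` interface (`create_state` returns per-parameter state arrays, `update` applies one step); plain NumPy arrays stand in for `mx.nd.NDArray` here so the sketch runs standalone, and the class name is made up for illustration:

```python
import numpy as np

class AdamLikeOptimizer:
    """Sketch of an Adam-like optimizer in the create_state/update style
    that MXNet's mx.optimizer.Optimizer subclasses use. NumPy arrays
    stand in for NDArrays; this is not the real MXNet base class."""

    def __init__(self, learning_rate=1e-3, beta1=0.9, beta2=0.999, epsilon=1e-8):
        self.lr = learning_rate
        self.beta1 = beta1
        self.beta2 = beta2
        self.epsilon = epsilon

    def create_state(self, index, weight):
        # Adam-style state: one (mean, variance) pair per parameter array.
        return (np.zeros_like(weight), np.zeros_like(weight))

    def update(self, index, weight, grad, state):
        mean, var = state
        # Exponential moving averages of the gradient and its square.
        mean[:] = self.beta1 * mean + (1 - self.beta1) * grad
        var[:] = self.beta2 * var + (1 - self.beta2) * grad ** 2
        # In-place weight update, as MXNet optimizers do on NDArrays.
        weight[:] -= self.lr * mean / (np.sqrt(var) + self.epsilon)
```

Usage follows the same shape as MXNet's updater loop: call `create_state` once per parameter, then `update` each step with that parameter's gradient and state.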
@TristonC Hello, first-time contributor to MXNet here. I'm working on a PR for this. Here's my branch. It's mostly a copy of src/operator/contrib/adamw*. I'm still working on test cases.
Question: shall I send the PR as is, or perform some code refactoring first? I found a rather high amount of code duplication while working on this PR. The same goes for Adam-like optimizers such as STES (#18486). It might be a good idea to generalize the common parts of Adam-like optimizers.
@khaotik Thanks for contributing to the MXNet community.
For your question, either way works for your first PR. We will have someone take a look at the change to see whether the refactoring needs to be done now.
@sandeep-krishnamurthy Do you have someone to help with the code review?
Description
This optimizer is similar to Adam, but with some differences, such as adapting the step size to the update direction.
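The description above doesn't spell out the exact rule, so as an illustration only: one published optimizer in this family is AdaBelief, which keeps Adam's structure but replaces the second moment of the raw gradient with the second moment of `(g - m)`, so the step grows when the gradient deviates from its running mean and shrinks when it agrees with it. The sketch below is an assumption about the intended rule, not a statement of what this issue's optimizer does:

```python
import numpy as np

def adam_like_step(w, g, m, s, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One AdaBelief-style update step (illustrative, not this issue's
    exact rule). Arrays w, m, s are modified in place."""
    m[:] = b1 * m + (1 - b1) * g
    # Adam would track g**2 here; tracking (g - m)**2 instead adapts the
    # step size to how much the gradient agrees with the update direction.
    s[:] = b2 * s + (1 - b2) * (g - m) ** 2
    w[:] -= lr * m / (np.sqrt(s) + eps)
    return w
```

The one-line difference from Adam (the `(g - m) ** 2` term) is what makes the "adapting the step size in update direction" behavior possible.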
References
Here is the reference to the paper and its GitHub repository.