modern-fortran / neural-fortran

A parallel framework for deep learning

Added RMSProp Optimizer subroutine #144

Closed: Spnetic-5 closed this pull request 1 year ago

Spnetic-5 commented 1 year ago

Solves #136. This pull request adds an implementation of the RMSprop optimizer subroutine to the existing quadratic example.

Approach:

Spnetic-5 commented 1 year ago

Apologies for the late reply, @milancurcic.

milancurcic commented 1 year ago

@Spnetic-5 I mostly rewrote the subroutine so that it now compiles and converges. It's not using mini-batching yet; for simplicity, the update is currently applied after the entire batch of forward and backward passes.
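For context, the RMSprop rule in question keeps a running average of squared gradients and scales each step by its square root. Below is a minimal sketch of that per-epoch update, assuming the gradients for the whole batch have already been accumulated; the module, subroutine, and argument names are illustrative, not the actual neural-fortran code.

```fortran
! Minimal sketch of the RMSprop update applied once per epoch, after the
! gradients for the entire batch have been accumulated. All names here
! are illustrative, not the PR's actual API.
module rmsprop_sketch_m
  implicit none
contains
  subroutine rmsprop_update(weights, gradients, rms, learning_rate, beta, epsilon)
    real, intent(in out) :: weights(:)   ! layer parameters, updated in place
    real, intent(in) :: gradients(:)     ! gradients accumulated over the full batch
    real, intent(in out) :: rms(:)       ! running average of squared gradients
    real, intent(in) :: learning_rate, beta, epsilon
    ! Decay the moving average of squared gradients, then take a step
    ! scaled by its square root so each parameter adapts its own rate.
    rms = beta * rms + (1 - beta) * gradients**2
    weights = weights - learning_rate * gradients / (sqrt(rms) + epsilon)
  end subroutine rmsprop_update
end module rmsprop_sketch_m
```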

I understand that this PR was challenging; it took me a while to find the right approach. In your most recent commit, you made some changes and wrote "made suggested corrections", which made it sound like the PR was good to go. However, the example was not even compiling at that stage. Whenever you struggle with an implementation, please write a comment in the PR explaining where you got stuck and whether you need help, rather than leaving only a short commit message.

Also, please study the implementation in this PR. It introduces a new derived type that tracks a moving average of gradients across epochs, separately for each layer. We are likely to reuse this approach for other optimizers that need moving-average logic.
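As a rough illustration of that derived-type idea, the per-layer state could look like the sketch below; the type and component names are hypothetical, not the ones actually introduced in the PR.

```fortran
! Hedged sketch of the derived-type approach: one record per layer holds
! the moving averages, allocated once so they persist across epochs.
! Type and component names are illustrative, not the PR's actual ones.
module rms_state_m
  implicit none

  type :: rms_layer_state
    real, allocatable :: dw(:)  ! moving average of squared weight gradients
    real, allocatable :: db(:)  ! moving average of squared bias gradients
  end type rms_layer_state

end module rms_state_m
```

An array of this type, one element per layer, would be allocated before the training loop begins, which is what lets the averages carry over from one epoch to the next.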

Spnetic-5 commented 1 year ago

I apologize for the confusion caused by my commit message; it was not my intention to imply that the code was ready to go. The code was compiling and running well on my PC. I'll make sure to provide detailed comments in the pull request in the future.

Thanks for the changes, I'll study those.

milancurcic commented 1 year ago

Thank you, @Spnetic-5, and no worries. I apologize for jumping the gun and finishing the implementation in this PR.

Going forward, would you like to take a shot at continuing the work in #139, or would you prefer to implement another optimizer in the quadratic fit example program? Recall that once we implement #139 for SGD, the new optimizers in quadratic will serve as prototype implementations for porting into the library.

Spnetic-5 commented 1 year ago

Thank you, @milancurcic. I would like to work on #137 first. Once we have completed that, we can move on to #139 and then additional new optimizers.