thegodone closed this issue 3 days ago
I can't access the link since it is behind a paywall, but looking at your code above, I think you need to adjust the step, not the new parameter. You probably need to update it as follows:
# Step 3: Core Adam update
updated_param = super().apply_single(gradient, parameter, state)
# Rescale the Adam *step* by the entropy-adjusted learning rate,
# rather than rescaling the parameter itself.
delta = updated_param - parameter
...
return parameter + delta * adjusted_lr
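For reference, here is a fuller sketch of how the whole apply_single could look under that scheme. Since I can't read the article, the entropy term and the adjusted_lr formula below are assumptions on my part, not the article's actual method:

```python
import mlx.core as mx
import mlx.optimizers as optim


class EntropyAdam(optim.Adam):
    """Adam variant that rescales each step by a gradient-entropy factor.

    Hypothetical sketch: the entropy term and the adjusted learning
    rate below are assumptions, since the source article is paywalled.
    """

    def __init__(self, learning_rate=1e-3, entropy_weight=0.1, **kwargs):
        super().__init__(learning_rate=learning_rate, **kwargs)
        self.entropy_weight = entropy_weight

    def apply_single(self, gradient, parameter, state):
        # Entropy of the normalized absolute gradients (assumed form).
        p = mx.abs(gradient) / (mx.sum(mx.abs(gradient)) + 1e-12)
        entropy = -mx.sum(p * mx.log(p + 1e-12))

        # Entropy-adjusted learning-rate scale, capped at 1.0.
        # mx.minimum (not Python's min) keeps the graph lazily compilable.
        adjusted_lr = mx.minimum(1.0 / (1.0 + self.entropy_weight * entropy), 1.0)

        # Core Adam update; rescale the *step*, not the parameter.
        updated_param = super().apply_single(gradient, parameter, state)
        delta = updated_param - parameter
        return parameter + delta * adjusted_lr
```

The key point is the last three lines: the entropy factor scales the delta produced by the base Adam update, so the moment estimates held in state stay untouched.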
In addition, I don't think this is necessarily an MLX bug, so I will close this issue. If you have reason to believe there is an MLX bug preventing convergence, feel free to reopen it.
Thank you, it converges now (it is not very good, but it works!). Thanks again!
Describe the bug
Trying to implement an entropy Adam variant (source here: https://pub.aimind.so/enhancing-adam-with-gradient-entropy-for-optimizer-regularization-c1f05248c980) returns an error.

To Reproduce
Include code snippet
I fixed the error: min had to be replaced with mx.minimum for pure eval compilation. However, this optimizer modification does not converge at all; the loss stays at its initial value no matter how many epochs run. Can someone help?
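For anyone hitting the same compilation error, a minimal illustration of that fix (the array here is just a stand-in):

```python
import mlx.core as mx

g = mx.random.normal((4,))

# Python's built-in min() tries to collapse the comparison to a concrete
# bool, which raises on a multi-element array (or forces an eager eval):
# capped = min(g, 1.0)

# mx.minimum is the element-wise op that stays in the lazy graph:
capped = mx.minimum(g, 1.0)
print(capped)
```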