FluxML / Optimisers.jl

Optimisers.jl defines many standard optimisers and utilities for learning loops.
https://fluxml.ai/Optimisers.jl
MIT License

Use `eltype(x)` everywhere, ignore `typeof(η)` #151

Closed: mcabbott closed this 1 year ago

mcabbott commented 1 year ago

The first commit should fix #150 by working entirely in Float32. It chooses to do this by always following the momentum from setup, not the x, but perhaps that's messy? Changed to eltype(x) now.
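For context, a minimal sketch of the `eltype(x)` convention under discussion, using a simplified `Descent`-like rule; the names `apply!` and `eta` follow Optimisers.jl, but this snippet is illustrative rather than the PR's actual diff:

```julia
struct Descent
    eta::Float64                 # hyperparameter stored without a type parameter
end

# The update follows the parameter array's eltype, ignoring typeof(eta):
function apply!(o::Descent, state, x, dx)
    eta = convert(float(eltype(x)), o.eta)   # a Float64 eta becomes Float32 here
    return state, eta .* dx
end

x = zeros(Float32, 3)
_, update = apply!(Descent(0.1), nothing, x, ones(Float32, 3))
eltype(update)   # Float32: the Float64 learning rate no longer promotes the update
```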

Since the type of η is now ignored, perhaps we should just remove all the type parameters? Then e.g. Adam(0) == Adam(0.0), which fixes #119 and closes #120. It also closes #86 (redundant).
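As a toy illustration of the equality point (hypothetical struct names, not the package's own):

```julia
struct RuleTyped{T}
    eta::T              # old style: the literal's type leaks into the rule's type
end
struct RulePlain
    eta::Float64        # new style: any number is converted to Float64
end

RuleTyped(0) == RuleTyped(0.0)  # false: RuleTyped{Int} and RuleTyped{Float64} differ
RulePlain(0) == RulePlain(0.0)  # true: both hold eta == 0.0
```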

Not all rules are changed yet, RFC?

CarloLucibello commented 1 year ago

> we should just remove all the type parameters?

Definitely, and we also get nicer printing of the opt_state as a bonus.

ToucheSir commented 1 year ago

If using the eltype of the momentum is too restrictive, we could always change to some promoted eltype of the various parts of the input and state (e.g. the momentum, parameter, and gradient eltypes). Otherwise 👍 from me though, nice to see that this works!
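A sketch of what that alternative could look like (illustrative only; the PR as written follows eltype(x)):

```julia
# Promote across parameter, gradient, and state instead of using eltype(x) alone:
x, dx, momentum = zeros(Float32, 3), ones(Float64, 3), zeros(Float32, 3)
T = promote_type(eltype(x), eltype(dx), eltype(momentum))   # Float64 here
eta = convert(float(T), 0.1)    # hyperparameters would follow the promoted type
```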

darsnack commented 1 year ago

We removed Descent{T}(…) type constructors, so unfortunately, yes.

mcabbott commented 1 year ago

In fact Descent{T} was the one I left, since IIRC someone had a paper doing "gradient descent" with a complex learning rate.
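A hedged sketch of why the type parameter matters there, using a hypothetical stand-in struct rather than the real Descent:

```julia
struct DescentT{T}      # hypothetical; mirrors keeping Descent{T} parameterised
    eta::T
end
rule = DescentT(0.1 + 0.2im)   # eta::ComplexF64, impossible with a fixed eta::Float64
```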

I doubt anyone types that, but the concern is things like BSON, right?

mcabbott commented 1 year ago

I'll merge this now; we can think about whether anything else should be rolled into a breaking release.

CarloLucibello commented 1 year ago

Tagging a new breaking release now, since this fixes a lot of problems and I don't see any other breaking changes on the horizon.