FluxML / Optimisers.jl

Optimisers.jl defines many standard optimisers and utilities for learning loops.
https://fluxml.ai/Optimisers.jl
MIT License

convenience constructors #11

Closed CarloLucibello closed 2 weeks ago

CarloLucibello commented 3 years ago

It could be convenient to have an optimizer constructor that also performs initialization:

opt, optstate = Nesterov(params(m), lr=0.1)  

instead of

opt = Nesterov(0.1)
optstate = Optimisers.init(opt, params(m))

darsnack commented 3 years ago

Seems like a good idea to me. If this is something we want, can I suggest kwargs again for the internal parameters like LR? Not necessary to achieve this, but if we have non-parameter positional arguments, I think the interface is cleaner with kwargs for parameters.

mcabbott commented 2 years ago

With #30 this will read

optstate = Optimisers.setup(Nesterov(0.1), model)
for _ in 1:N
    grad = ...
    optstate, model = Optimisers.update!(optstate, model, grad)
end

and there is no further need to hold onto the optimiser struct alone.
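For reference, a minimal runnable sketch of that loop. The toy model (a NamedTuple of arrays), the quadratic `loss`, and the hand-written `toy_gradient` are illustrative assumptions standing in for a real model and an AD call; only `Optimisers.setup`, `Optimisers.update!` and `Nesterov` come from the package.

```julia
using Optimisers

# Hypothetical toy "model": a NamedTuple of parameter arrays.
model = (w = [1.0, -2.0], b = [0.5])

# Illustrative loss and its hand-written gradient (stand-in for an AD call).
loss(m) = sum(abs2, m.w) + sum(abs2, m.b)
toy_gradient(m) = (w = 2 .* m.w, b = 2 .* m.b)

# Record the loss first: update! may mutate the parameter arrays in place.
initial_loss = loss(model)

function train(model; steps = 100)
    # The state tree carries the rule itself, so the rule is not kept separately.
    optstate = Optimisers.setup(Nesterov(0.1), model)
    for _ in 1:steps
        grad = toy_gradient(model)
        optstate, model = Optimisers.update!(optstate, model, grad)
    end
    return model, optstate
end

trained, optstate = train(model)
```

Because `update!` is allowed to reuse the arrays it is given, the sketch always rebinds both return values rather than relying on the old `model` and `optstate`.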

Do we want anything else here? These are possible but perhaps confusing:

optstate = Nesterov(model, 0.1)

optstate = Nesterov(model; eta = 0.1)

Edit: also possible are:

optstate = Nesterov(0.1)(model)

optstate = Nesterov(rho=0.8)(model)

CarloLucibello commented 2 weeks ago

I think things are good as they are right now.