Open qzhu2017 opened 4 years ago

Although I know most people use ADAM or SGD to optimize the weights of a NN, for some small NN architectures an optimization method with line search (e.g., LBFGS) would be far more efficient. I wonder if the Knet developers would be interested in implementing LBFGS?
Optim.jl may already have this.
Yes, I know. Can I use it with Knet? I tried, but it is not obvious how. To be more specific, to optimize a function f(x), Optim.jl needs both the objective value and its gradient g. Both x and g need to be converted to a sort of 1D array. The main trouble, however, is g: my student and I tried to transform g into something Optim.jl can accept, but we failed. I am not sure if anyone has experience with this.
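A minimal sketch of the flattening half of the problem (not Knet or Optim.jl API; `ws`, `flatten`, and `unflatten!` are hypothetical helpers): pack a vector of weight arrays into one 1D vector for the optimizer, and copy an updated 1D iterate back into the arrays.

```julia
# Sketch: flatten a collection of weight arrays into a single 1D vector
# for Optim.jl, and copy an updated 1D vector back into the arrays.
# `ws` is a hypothetical Vector of parameter arrays.

flatten(ws) = reduce(vcat, vec.(ws))     # concatenate all weights into one vector

function unflatten!(ws, v)
    i = 1
    for w in ws
        n = length(w)
        copyto!(w, reshape(view(v, i:i+n-1), size(w)))  # restore original shape
        i += n
    end
    return ws
end

# Usage: ws = [randn(3, 3), randn(3)]; v = flatten(ws); unflatten!(ws, v)
```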
```julia
x = Param(Array(...))   # wrap the weights in a Param so AutoGrad tracks them
J = @diff f(x)          # run the differentiable forward pass
g = grad(J, x)          # extract the gradient from the tape
```

This should give you a `g` with exactly the same type/shape as `x`. See `@doc AutoGrad`.
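For completeness, a hedged sketch of wiring this into Optim.jl through `Optim.only_fg!` (that wrapper, `optimize`, and `LBFGS` are Optim.jl API; `Param`, `@diff`, `grad`, and `value` are AutoGrad as above; the `loss` function, `fg!`, and the shapes are made up for illustration, and a multi-array model would additionally need flattening helpers like the ones sketched earlier):

```julia
using Optim, AutoGrad

x = Param(zeros(5))                  # Param so AutoGrad can track gradients

loss(x) = sum(abs2.(x .- 1))         # toy objective; minimum at x == 1

# Optim.jl's fg! convention: write the gradient into G (if requested)
# and return the objective value (if requested).
function fg!(F, G, v)
    copyto!(value(x), v)             # load Optim's 1D iterate into the Param
    J = @diff loss(x)                # forward pass + tape
    if G !== nothing
        copyto!(G, vec(grad(J, x)))  # gradient has the same shape as x
    end
    return F === nothing ? nothing : value(J)
end

result = optimize(Optim.only_fg!(fg!), zeros(5), LBFGS())
```

Optim.jl's LBFGS then handles the line search, and `Optim.minimizer(result)` returns the optimized 1D vector, which can be copied back into the model's parameters.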