JuliaSmoothOptimizers / FluxNLPModels.jl


Speed up the grad #28

Open farhadrclass opened 8 months ago

farhadrclass commented 8 months ago

I noticed we can improve our grad and objgrad methods using the following code:

# Calculate the gradient of the objective
# with respect to the parameters within the model:
grads = Flux.gradient(model) do m
    result = m(input)
    loss(result, label)
end
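
For reference, the same pattern with Flux.withgradient returns the loss value together with the gradient in a single call. A minimal sketch, assuming the same model, input, label, and loss as above:

# withgradient evaluates the loss and its gradient in one pass
loss_val, grads = Flux.withgradient(model) do m
    result = m(input)
    loss(result, label)
end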

Currently we have:

function NLPModels.objgrad!(
  nlp::AbstractFluxNLPModel{T, S},
  w::AbstractVector{V},
  g::AbstractVector{V},
) where {T, S, V}
  @lencheck nlp.meta.nvar w g
  x, y = nlp.current_training_minibatch

  if (eltype(nlp.w) != V)  # check whether the element types match; convert if not
    update_type!(nlp, w)
    g = V.(g)
    if eltype(x) != V
      x = V.(x)
    end
  end

  increment!(nlp, :neval_obj)
  increment!(nlp, :neval_grad)
  set_vars!(nlp, w)

  f_w = nlp.loss_f(nlp.chain(x), y)
  g .= gradient(w_g -> local_loss(nlp, x, y, w_g), w)[1]

  return f_w, g
end

We could use Flux.withgradient, which computes the objective value and the gradient in a single call.
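
A possible sketch of how objgrad! could look with Flux.withgradient (untested; it assumes local_loss keeps its current signature):

function NLPModels.objgrad!(
  nlp::AbstractFluxNLPModel{T, S},
  w::AbstractVector{V},
  g::AbstractVector{V},
) where {T, S, V}
  @lencheck nlp.meta.nvar w g
  x, y = nlp.current_training_minibatch

  if (eltype(nlp.w) != V)  # check whether the element types match; convert if not
    update_type!(nlp, w)
    g = V.(g)
    if eltype(x) != V
      x = V.(x)
    end
  end

  increment!(nlp, :neval_obj)
  increment!(nlp, :neval_grad)
  set_vars!(nlp, w)

  # withgradient returns both the objective value and the gradient,
  # so the separate nlp.loss_f evaluation is no longer needed
  f_w, grads = Flux.withgradient(w_g -> local_loss(nlp, x, y, w_g), w)
  g .= grads[1]

  return f_w, g
end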

This way we won't need the separate local objective call.

reference: https://fluxml.ai/Flux.jl/stable/training/zygote/#Zygote.withgradient-Tuple{Any,%20Vararg{Any}}

Note: I will be doing this in a PR soon, but I thought it would be nice to have a guide here just in case.

farhadrclass commented 8 months ago

@tmigot what do you think of this?

tmigot commented 8 months ago

This looks interesting indeed