JuliaNLSolvers / Optim.jl

Optimization functions for Julia

Use zeros instead of fill in Adam #1075

Closed · pkofod closed this 8 months ago

pkofod commented 8 months ago

I'd have to do the same in AdaMax
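For context, the issue is how Adam's state buffers are allocated. A minimal illustration (not the actual diff) of why the allocation matters for GPU arrays, with `m`/`v` standing in for the first/second moment buffers:

```julia
# Illustrative sketch only, not the PR's change.
using CUDA

x = CUDA.zeros(Float32, 8)

m = zero(x)                # CuArray{Float32}: same element type and device as x
v = fill!(similar(x), 0)   # equivalent, also device-preserving

m_cpu = fill(0.0f0, size(x))  # plain Array{Float32} on the CPU; mixing it with
                              # x in later updates breaks GPU execution
```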

codecov[bot] commented 8 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (b0ba898) 84.90% compared to head (93a438c) 84.90%.

:exclamation: Current head 93a438c differs from pull request most recent head 409ddf9. Consider uploading reports for the commit 409ddf9 to get more accurate results

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##           master    #1075   +/-   ##
=======================================
  Coverage   84.90%   84.90%
=======================================
  Files          46       46
  Lines        3492     3492
=======================================
  Hits         2965     2965
  Misses        527      527
```

:umbrella: View full report in Codecov by Sentry.

roflmaostc commented 8 months ago

Errors in line 83 of adam.jl:

Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU, and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.

    error(::String)@error.jl:35
    assertscalar(::String)@GPUArraysCore.jl:103
    getindex(::CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, ::Int64)@indexing.jl:48
    update_state!(::NLSolversBase.OnceDifferentiable{Float32, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}}, ::Optim.AdamState{CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, Float32, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, Int64}, ::Optim.Adam{Float64, Optim.Flat})@adam.jl:83
    optimize(::NLSolversBase.OnceDifferentiable{Float32, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}}, ::CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, ::Optim.Adam{Float64, Optim.Flat}, ::Optim.Options{Float64, Nothing}, ::Optim.AdamState{CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, Float32, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, Int64})@optimize.jl:54
    optimize(::NLSolversBase.OnceDifferentiable{Float32, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}}, ::CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, ::Optim.Adam{Float64, Optim.Flat}, ::Optim.Options{Float64, Nothing}, ::Optim.AdamState{CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, Float32, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, Int64})@optimize.jl:36[inlined]
    var"#optimize#91"(::Bool, ::Symbol, ::typeof(Optim.optimize), ::NLSolversBase.InplaceObjective{Nothing, SwissVAMyKnife.var"#fg!#19"{SwissVAMyKnife.var"#L_VAM#18"{SwissVAMyKnife.var"#fwd2#12"{SwissVAMyKnife.var"#AS_abs2#11"{WaveOpticsPropagation.AngularSpectrum3{CUDA.CuArray{ComplexF32, 3, CUDA.Mem.DeviceBuffer}, Float32, CUDA.CUFFT.cCuFFTPlan{ComplexF32, -1, true, 3}}, Int64}, StepRangeLen{Float32, Float64, Float64, Int64}}, CUDA.CuArray{Bool, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Bool, 3, CUDA.Mem.DeviceBuffer}, Tuple{Float32, Float32}, SwissVAMyKnife.var"#loss_f2#17"}}, Nothing, Nothing, Nothing}, ::CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, ::Optim.Adam{Float64, Optim.Flat}, ::Optim.Options{Float64, Nothing})@interface.jl:143
    optimize(::NLSolversBase.InplaceObjective{Nothing, SwissVAMyKnife.var"#fg!#19"{SwissVAMyKnife.var"#L_VAM#18"{SwissVAMyKnife.var"#fwd2#12"{SwissVAMyKnife.var"#AS_abs2#11"{WaveOpticsPropagation.AngularSpectrum3{CUDA.CuArray{ComplexF32, 3, CUDA.Mem.DeviceBuffer}, Float32, CUDA.CUFFT.cCuFFTPlan{ComplexF32, -1, true, 3}}, Int64}, StepRangeLen{Float32, Float64, Float64, Int64}}, CUDA.CuArray{Bool, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Bool, 3, CUDA.Mem.DeviceBuffer}, Tuple{Float32, Float32}, SwissVAMyKnife.var"#loss_f2#17"}}, Nothing, Nothing, Nothing}, ::CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, ::Optim.Adam{Float64, Optim.Flat}, ::Optim.Options{Float64, Nothing})@interface.jl:139
    optimize_patterns(::CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, ::SwissVAMyKnife.WaveOptics{CUDA.CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, Float32, StepRangeLen{Float32, Float64, Float64, Int64}, Nothing}, ::SwissVAMyKnife.GradientBased{Optim.Adam{Float64, Optim.Flat}, Int64, typeof(abs2), Symbol, Float32})@optimization.jl:110
    macro expansion@[Local: 35](http://localhost:1234/edit?id=be680f46-beb6-11ee-191d-d736708bb810#)[inlined]
    macro expansion@[Local: 621](http://localhost:1234/edit?id=be680f46-beb6-11ee-191d-d736708bb810#)[inlined]
    top-level scope@[Local: 1](http://localhost:1234/edit?id=be680f46-beb6-11ee-191d-d736708bb810#)[inlined]
pkofod commented 8 months ago

> Errors in line 83 of adam.jl

Ah yes, the isnan handling. That can also be rewritten.
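A rough sketch of the kind of rewrite meant here (illustrative, not the actual adam.jl code): per-element checks that index the array one scalar at a time trip the error above, while whole-array reductions and broadcasts stay on the device.

```julia
using CUDA

g = CUDA.rand(Float32, 16)

# Scalar indexing: errors on a CuArray unless wrapped in CUDA.@allowscalar.
# any(isnan(g[i]) for i in eachindex(g))

# Device-friendly reduction over the whole array:
any(isnan, g)

# Replacing NaNs element-wise can likewise be expressed as a broadcast:
g .= ifelse.(isnan.(g), 0.0f0, g)
```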

roflmaostc commented 8 months ago

Seems to work, thanks!

```julia
(@main) pkg> st
Status `~/.julia/environments/main/Project.toml`
  [052768ef] CUDA v5.2.0
  [429524aa] Optim v1.9.0 `https://github.com/JuliaNLSolvers/Optim.jl#pkofod-patch-11`
  [e88e6eb3] Zygote v0.6.69

julia> using CUDA, Optim, Zygote

julia> y = CUDA.rand(10, 10)
10×10 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
 0.989607  0.929248  0.38291   0.71276     0.35737   0.0329809  0.0459679  0.473346  0.689134   0.895997
 0.541529  0.219795  0.186824  0.786805    0.286709  0.981285   0.553156   0.800617  0.364326   0.21197
 0.423807  0.880008  0.963067  0.538527    0.557977  0.76693    0.819093   0.601645  0.0315003  0.265176
 0.47718   0.604049  0.517216  0.945743    0.714332  0.518855   0.241564   0.618935  0.32788    0.322779
 0.905727  0.858503  0.779593  0.110529    0.815396  0.613965   0.0746752  0.288452  0.347681   0.517876
 0.548341  0.164758  0.608109  0.00389199  0.172974  0.405213   0.805441   0.12773   0.601058   0.4747
 0.544496  0.761227  0.354748  0.670576    0.646628  0.745235   0.60533    0.850058  0.408118   0.489108
 0.792463  0.674849  0.178368  0.965902    0.627943  0.212537   0.239867   0.635791  0.254728   0.25109
 0.216118  0.408056  0.939535  0.887924    0.244255  0.223407   0.427071   0.842454  0.580088   0.345002
 0.789739  0.110236  0.701894  0.794049    0.721866  0.14752    0.342845   0.825494  0.200069   0.410702

julia> f(x) = sum(abs2, y .- sin.(x))
f (generic function with 1 method)

julia> g!(y, x) = y .= Zygote.gradient(f,  x)[1]
g! (generic function with 1 method)

julia> optimize(f, g!, CUDA.zeros((10,10)), Adam())
 * Status: failure (reached maximum number of iterations)

 * Candidate solution
    Final objective value:     5.761456e-01

 * Found with
    Algorithm:     Adam

 * Convergence measures
    |x - x'|               = 8.47e-01 ≰ 0.0e+00
    |x - x'|/|x'|          = 1.00e+00 ≰ 0.0e+00
    |f(x) - f(x')|         = NaN ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = NaN ≰ 0.0e+00
    |g(x)|                 = 3.18e-01 ≰ 1.0e-08

 * Work counters
    Seconds run:   6  (vs limit Inf)
    Iterations:    10000
    f(x) calls:    10001
    ∇f(x) calls:   10001
```
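For reference, the stack trace above goes through `NLSolversBase.InplaceObjective`, i.e. a combined objective-and-gradient callback. A rough sketch of that pattern with Zygote on the GPU (the `fg!` below is illustrative, not the code from the trace):

```julia
using CUDA, Optim, Zygote

y = CUDA.rand(Float32, 10, 10)
f(x) = sum(abs2, y .- sin.(x))

# Compute the value and/or gradient in a single Zygote pass.
function fg!(F, G, x)
    if G !== nothing
        val, back = Zygote.pullback(f, x)
        G .= back(one(val))[1]
        F !== nothing && return val
    end
    F !== nothing && return f(x)
    return nothing
end

optimize(Optim.only_fg!(fg!), CUDA.zeros(Float32, 10, 10), Adam(),
         Optim.Options(iterations = 1_000))
```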
pkofod commented 8 months ago

Hurray, thanks for testing!

roflmaostc commented 8 months ago

Thanks for adding so quickly!