FluxML / Zygote.jl

21st century AD
https://fluxml.ai/Zygote.jl/

`Ref` and broadcasting issue #1500

Open isentropic opened 4 months ago

isentropic commented 4 months ago
julia> gradient(x -> evalpoly(x, (1,2,3.5)), 2.0)
(16.0,)

julia> gradient(x -> evalpoly(x, [1,2,3.5]), 2.0)
(16.0,)

julia> gradient(x -> sum(evalpoly.(x, Ref((1,2,3.5)))), [2.0])
([16.0],)

julia> gradient(x -> sum(evalpoly.(x, Ref([1,2,3]))), [2.0])
([14.0],)

julia> gradient(x -> sum(evalpoly.(x, Ref([1,2,3.5]))), [2.0])
ERROR: BoundsError: attempt to access Float64 at index [2]
Stacktrace:
  [1] getindex
    @ ./number.jl:98 [inlined]
  [2] #1349
    @ ~/.julia/packages/ChainRules/Gw0tZ/src/rulesets/Base/array.jl:53 [inlined]
  [3] ntuple
    @ ./ntuple.jl:19 [inlined]
  [4] vect_pullback
    @ ~/.julia/packages/ChainRules/Gw0tZ/src/rulesets/Base/array.jl:53 [inlined]
  [5] (::Zygote.ZBack{ChainRules.var"#vect_pullback#1350"{3, Tuple{…}}})(dy::Float64)
    @ Zygote ~/.julia/packages/Zygote/jxHJc/src/compiler/chainrules.jl:211

Additionally, with the function defined like so, the gradient works on CPU arrays but fails on CuArrays:

function f(t, x)
    sum(evalpoly.(x, (t,)))
end

julia> gradient(f, rand(5), rand(5))
([5.0, 2.713017060264003, 1.5857856544224804, 0.9930874986731469, 0.660749431626682], [0.6343429588463574, 0.9682506289354968, 1.6624374807955673, 1.0891653882116072, 0.9456849442872972])

julia> using CUDA
julia> f(CUDA.rand(5), CUDA.rand(5))
4.7828465f0

julia> gradient(f, rand(5), rand(5))
([5.0, 3.249928710862126, 2.174744587473789, 1.4846015761873623, 1.026958124254971], [1.21979017866533, 1.4078727321217683, 0.4826223482672483, 1.3151328356715242, 1.3661971907888482])

julia> gradient(f, CUDA.rand(5), CUDA.rand(5))
ERROR: CuArray only supports element types that are allocated inline.
Real is not allocated inline

I'm not sure where to report the CUDA bug.

mcabbott commented 4 months ago

Perhaps worth noting that you get a different error if the coefficients are an argument, so that `vect` and `vect_pullback` are not involved:

julia> using Zygote

julia> gradient((x,c) -> sum(evalpoly.(x, Ref(c))), [2.0], [1,2,3])
([14.0], [1.0, 2.0, 4.0])

julia> gradient((x,c) -> sum(evalpoly.(x, Ref(c))), [2.0], [1,2,3.5])
ERROR: DimensionMismatch: array with ndims(x) == 1 >  0 cannot have dx::Number
Stacktrace:
 [1] (::ChainRulesCore.ProjectTo{AbstractArray, @NamedTuple{…}})(dx::Float64)
   @ ChainRulesCore ~/.julia/packages/ChainRulesCore/6DiyF/src/projection.jl:255
 [2] _project
   @ ~/.julia/packages/Zygote/jxHJc/src/compiler/chainrules.jl:189 [inlined]
 [3] map
   @ ./tuple.jl:383 [inlined]
 [4] _project_all(x::Tuple{Vector{Float64}, Vector{Float64}}, dx::Tuple{Vector{Float64}, Float64})
   @ Zygote ~/.julia/packages/Zygote/jxHJc/src/compiler/interface.jl:118
 [5] gradient(::Function, ::Vector{Float64}, ::Vararg{Any})
   @ Zygote ~/.julia/packages/Zygote/jxHJc/src/compiler/interface.jl:149
 [6] top-level scope
   @ REPL[334]:1
Some type information was truncated. Use `show(err)` to see complete types.

julia> Zygote.pullback((x,c) -> sum(evalpoly.(x, Ref(c))), [2.0], [1,2,3.5])  # this avoids projection of final answer
(19.0, Zygote.var"#75#76"{Zygote.Pullback...

julia> ans[2](1.0)
([16.0], 0.0)
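
For comparison, here is a minimal sketch of what the gradients should be, derived by hand in plain Base Julia with no AD package involved. The helper names `dpoly_dx` and `dpoly_dc` are made up for this illustration and are not part of Zygote or ChainRules. Since the derivative of `evalpoly(x, c)` with respect to `c[k]` is `x^(k-1)`, the gradient for `c` does not depend on the coefficient values, so it should match the `[1.0, 2.0, 4.0]` from the `[1,2,3]` case rather than the scalar `0.0` returned by the pullback.

```julia
# Hand-derived reference gradients for p(x) = c[1] + c[2]*x + c[3]*x^2 + ...
# Illustrative helpers only; assumes length(c) >= 2.

# d/dx of evalpoly(x, c): differentiate the polynomial term by term.
dpoly_dx(x, c) = sum((k - 1) * c[k] * x^(k - 2) for k in 2:length(c))

# d/dc[k] of sum(evalpoly.(xs, Ref(c))) is the sum of x^(k-1) over all points.
dpoly_dc(xs, c) = [sum(x^(k - 1) for x in xs) for k in 1:length(c)]

xs = [2.0]
c = [1, 2, 3.5]

dx = [dpoly_dx(x, c) for x in xs]   # gradient with respect to xs
dc = dpoly_dc(xs, c)                # gradient with respect to c

println(dx)
println(dc)
```

With these inputs `dx` agrees with the `[16.0]` that Zygote reports, while `dc` is a length-3 vector, which is the shape the projection step was expecting.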