denizyuret / AutoGrad.jl

Julia port of the Python autograd package.

bug in gradient with broadcast #103

Closed CarloLucibello closed 5 years ago

CarloLucibello commented 5 years ago

I was trying to port some of my Knet code to Julia 1.0 and ran into the following very serious bug. Here is an MWE:

julia> grad(x -> sum(exp.(x)))(1.)
2.718281828459045

julia> grad(x -> sum(exp.(x)))([1.])
1-element Array{Float64,1}:
 2.718281828459045

# Everything fine till now.
# The problem is with user defined functions

julia> f(x)=exp(x)
f (generic function with 1 method)

# this is fine
julia> grad(x -> f(x))(1.)
2.718281828459045

# THIS RETURNS NOTHING!!!
julia> grad(x -> sum(f.(x)))([1.])
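
A possible workaround (a sketch, assuming AutoGrad's `@primitive` macro accepts the `f(x),dy,y` form shown in its docs) is to register the user-defined function as a primitive with an explicit gradient rule, so the broadcast goes through the known-primitive path instead of the broken generic one:

```julia
using AutoGrad

f(x) = exp(x)

# Since f(x) = exp(x), df/dx = exp(x) = y, so the pullback multiplies the
# incoming gradient dy elementwise by the saved output y.
@primitive f(x),dy,y (dy .* y)

grad(x -> sum(f.(x)))([1.0])  # should now behave like the exp.(x) case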
CarloLucibello commented 5 years ago

I see this issue has also been reported in #101. This is a big impairment for me, because I use a lot of complicated custom operators in my research code. @denizyuret do you have a solution for this, or any hint about the problem (what's wrong in the broadcasting pipeline)?

denizyuret commented 5 years ago

The problem is that I haven't figured out how to define gradients for arbitrary broadcast operations in AutoGrad or Knet. For primitives (e.g. sin/cos), I define both the scalar and broadcasted gradients with the @primitive macro in AutoGrad and call specific kernels for KnetArray. @ekinakyurek thinks this used to work in Julia 0.6, but we need to figure out how to make it work in Julia 1.0.
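
For context on why arbitrary broadcasts are hard to intercept: Julia 1.0 changed broadcast lowering so that `f.(x)` no longer calls `broadcast(f, x)` directly but instead builds a lazy container that is then materialized. A tracing AD therefore has to hook this pipeline for every `f`, not just functions it has gradient rules for. Roughly (this is standard Julia 1.0 lowering, not AutoGrad-specific code):

```julia
f(x) = exp(x)
x = [1.0]

# f.(x) lowers approximately to the following two calls:
bc = Base.Broadcast.broadcasted(f, x)  # lazy Broadcasted container
y  = Base.Broadcast.materialize(bc)    # fused elementwise evaluation
```

In Julia 0.6 a single `broadcast(f, x)` method was the interception point, which is consistent with the observation that this used to work there.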

denizyuret commented 5 years ago

Solved in latest master; released in v1.1.2.