denizyuret / AutoGrad.jl

Julia port of the Python autograd package.
Other
169 stars 26 forks source link

broadcast error for integer power #80

Closed CarloLucibello closed 6 years ago

CarloLucibello commented 6 years ago

On master and julia 0.7

julia> grad(x->sum(x.^2))([1,2,3])
┌ Warning: broadcast will default to iterating over its arguments in the future. Wrap arguments of
│ type `x::Rec{Array{Int64,1}}` with `Ref(x)` to ensure they broadcast as "scalar" elements.
│   caller = ip:0x0
└ @ Core :-1
ERROR: DimensionMismatch("Cannot multiply two vectors")
Stacktrace:
 [1] *(::Array{Int64,1}, ::Array{Int64,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v0.7/LinearAlgebra/src/deprecated.jl:566
 [2] power_by_squaring(::Array{Int64,1}, ::Int64) at ./intfuncs.jl:192
 [3] ^(::Array{Int64,1}, ::Int64) at ./deprecated.jl:55
 [4] (::getfield(AutoGrad, Symbol("##rfun#7#9")){typeof(^)})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Rec{Array{Int64,1}}, ::Vararg{Any,N} where N) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:133
 [5] rfun at /home/carlo/.julia/dev/AutoGrad/src/core.jl:130 [inlined]
 [6] ^ at ./none:0 [inlined]
 [7] macro expansion at ./none:0 [inlined]
 [8] literal_pow at ./none:0 [inlined]
 [9] _broadcast_getindex_evalf at ./broadcast.jl:574 [inlined]
 [10] _broadcast_getindex at ./broadcast.jl:547 [inlined]
 [11] getindex at ./broadcast.jl:507 [inlined]
 [12] copy at ./broadcast.jl:734 [inlined]
 [13] materialize at ./broadcast.jl:724 [inlined]
 [14] (::getfield(Main, Symbol("##15#16")))(::Rec{Array{Int64,1}}) at ./REPL[11]:1
 [15] (::getfield(AutoGrad, Symbol("##gradfun#1#2")){getfield(Main, Symbol("##15#16")),Int64})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Int64,1}) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:95
 [16] (::getfield(AutoGrad, Symbol("#gradfun#3")){getfield(AutoGrad, Symbol("##gradfun#1#2")){getfield(Main, Symbol("##15#16")),Int64}})(::Array{Int64,1}) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:39
 [17] top-level scope at none:0

The temporary workaround would be to use a float exponent

julia> grad(x->sum(x.^2.0))([1,2,3])
3-element Array{Float64,1}:
 2.0
 4.0
 6.0
CarloLucibello commented 6 years ago

In src/base.jl we have

@primitive ^(x1,x2),dy,y  unbroadcast(x1,dxndx(x1,x2,dy))  unbroadcast(x2,dy.*y.*log.(x1))
@primitive ^(x1,x2::Integer),dy,y  unbroadcast(x1,dxndx(x1,x2,dy))  unbroadcast(x2,dy.*y.*log.(x1)) # ambiguity fix
dxndx(x1,x2,dy)=(if x2==0; zero(dy); elseif x2==1; dy; elseif x2==2; 2 .* x1 .* dy; else; dy.*x2.*x1.^(x2 .- 1); end) # optimize common cases
denizyuret commented 6 years ago

In addition to integer powers, matrix powers are also broken, e.g. rand(3,3)^2.0

CarloLucibello commented 6 years ago

In addition to integer powers, matrix powers are also broken, e.g. rand(3,3)^2.0

Matrix powers are working for me

julia> r=rand(2,2)
2×2 Array{Float64,2}:
 0.783298  0.554936
 0.904212  0.51457 

julia> grad(x->sum(x^2.0))(r)
2×2 Array{Float64,2}:
 1.5666   1.10987
 1.80842  1.02914

julia> grad(x->sum(x^2))(r)
2×2 Array{Float64,2}:
 1.5666   1.10987
 1.80842  1.02914
CarloLucibello commented 6 years ago

I see what you mean now, matrix powers are treated as elements wise powers. Continuing the example above we have

julia> grad(x->sum(x.^2))(r)
┌ Warning: broadcast will default to iterating over its arguments in the future. Wrap arguments of
│ type `x::Rec{Array{Float64,2}}` with `Ref(x)` to ensure they broadcast as "scalar" elements.
│   caller = ip:0x0
└ @ Core :-1
2×2 Array{Float64,2}:
 1.5666   1.10987
 1.80842  1.02914

julia> grad(x->sum(x.^2.0))(r)
2×2 Array{Float64,2}:
 1.5666   1.10987
 1.80842  1.02914
CarloLucibello commented 6 years ago

Also non-integer matrix powers are broken. This is a very serious bug.

julia> r^3.1
2×2 Array{Float64,2}:
 1.49742  1.16269
 2.06055  1.60003

julia> r.^3.1
2×2 Array{Float64,2}:
 0.35567   0.136366
 0.803742  0.432063

julia> grad(x->sum(x^3.1))(r)
2×2 Array{Float64,2}:
 1.53898  0.803888
 2.67353  1.7558  

julia> grad(x->sum(x.^3.1))(r)
2×2 Array{Float64,2}:
 1.53898  0.803888
 2.67353  1.7558  
denizyuret commented 6 years ago

Latest master fixes this (matrix powers still missing), please test and close.

julia> grad(x->sum(x.^2))([1,2,3])
grad(x->sum(x.^2))([1,2,3])
3-element Array{Int64,1}:
 2
 4
 6

julia> grad(x->sum(x.^2.0))([1,2,3])
grad(x->sum(x.^2.0))([1,2,3])
3-element Array{Float64,1}:
 2.0
 4.0
 6.0

julia> r = [1. 2.; 3. 4.]
r = [1. 2.; 3. 4.]
2×2 Array{Float64,2}:
 1.0  2.0
 3.0  4.0

julia> grad(x->sum(x^2.0))(r)
grad(x->sum(x^2.0))(r)
ERROR: Derivatives of real matrix powers not defined.

julia> grad(x->sum(x^2))(r)
grad(x->sum(x^2))(r)
ERROR: Derivatives of integer matrix powers not defined.

julia> grad(x->sum(x.^2))(r)
grad(x->sum(x.^2))(r)
2×2 Array{Float64,2}:
 2.0  4.0
 6.0  8.0

julia> grad(x->sum(x.^2))(r)
grad(x->sum(x.^2))(r)
2×2 Array{Float64,2}:
 2.0  4.0
 6.0  8.0

julia> grad(x->sum(x.^2.0))(r)
grad(x->sum(x.^2.0))(r)
2×2 Array{Float64,2}:
 2.0  4.0
 6.0  8.0

julia> r^3.1
r^3.1
2×2 Array{Complex{Float64},2}:
 43.7868-0.0109935im   63.8808+0.00502871im
 95.8212+0.00754306im  139.608-0.0034504im

julia> r.^3.1
r.^3.1
2×2 Array{Float64,2}:
  1.0      8.57419
 30.1353  73.5167

julia> grad(x->sum(x^3.1))(r)
grad(x->sum(x^3.1))(r)
ERROR: Derivatives of real matrix powers not defined.

julia> grad(x->sum(x.^3.1))(r)
grad(x->sum(x.^3.1))(r)
2×2 Array{Float64,2}:
  3.1     13.29
 31.1398  56.9754
CarloLucibello commented 6 years ago

NIce, tanks, this is working for me now. I'll open a different issue for matrix powers