FluxML / Tracker.jl

Flux's ex AD
Other
51 stars 37 forks source link

* Methods for LinearAlgebra.Adjoint and Tracked Arrays are Ambiguous #12

Open jessebett opened 6 years ago

jessebett commented 6 years ago
using Flux
p = param(2.)
f(x) = x.^2 + p.^2

[1.,1.,1.]' * f([1.,1.,1.])
# MethodError: *(::LinearAlgebra.Adjoint{Float64,Array{Float64,1}}, ::TrackedArray{…,Array{Float64,1}}) is ambiguous
JoshChristie commented 6 years ago

Might need to broadcast so it would be 1.,1.,1.]' .* f([1.,1.,1.]). Otherwise, does this change help?

jessebett commented 6 years ago

@JoshChristie that is a different operation:

f([1.,1.,1.]) # Tracked 3-array
Flux.data(f([1.,1.,1.])) # 3-array
[1.,1.,1.]'*Flux.data(f([1.,1.,1.])) # Float
[1.,1.,1.]'.*Flux.data(f([1.,1.,1.])) # 3x3 Array
cbeny commented 5 years ago

This yields the same error:

using Flux
ones(2,2)' * param(ones(2))

whereas the following works

collect(ones(2,2)') * param(ones(2))
kylejbrown17 commented 5 years ago

I encountered a similar error:

using Flux, LinearAlgebra

A = ones(2)
B = TrackedArray(ones(2))

transpose(A)*B # throws weird error
ERROR: MethodError: *(::Transpose{Float64,Array{Float64,1}}, ::TrackedArray{…,Array{Float64,2}}) is ambiguous. Candidates:
  *(x::AbstractArray{T,2} where T, y::TrackedArray{T,2,A} where A where T) in Flux.Tracker at /home/user/.julia/packages/Flux/jsf3Y/src/tracker/array.jl:319
  *(x::Transpose{T,#s571} where #s571<:(AbstractArray{T,1} where T) where T, A::AbstractArray{T,2} where T) in LinearAlgebra at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/LinearAlgebra/src/matmul.jl:123
Possible fix, define
  *(::Transpose{T,#s571} where #s571<:(AbstractArray{T,1} where T) where T, ::TrackedArray{T,2,A} where A where T)
jessebett commented 5 years ago

This is still a problem @MikeInnes:

using Flux

x = param(randn(3))
v= [1., 0., 0.]

v'*x # Method Error
collect(v')*x # Correct. 

Flux still isn't workign well with LinearAlgebra.Adjoint.

DoktorMike commented 5 years ago

I think I have the same issue the code I'm running is this:

using Flux.Tracker
t(μ, logσ) = μ .+ exp.(logσ) .* randn(length(μ))
m(x, θ) = x'*θ[1:2] .+ θ[3]
λ = (param([0, 0, 0]), param([-1, -1, -1]))
x = [0 0 1 1
     0 1 0 1]
y = [0 1 1 1]
ŷ = m(x, t(λ[1], λ[2]))

results in

ERROR: MethodError: *(::LinearAlgebra.Adjoint{Int64,Array{Int64,2}}, ::TrackedArray{…,Array{Float64,1}}) is ambiguous. Candidates:
  *(x::AbstractArray{T,2} where T, y::TrackedArray{T,1,A} where A where T) in Flux.Tracker at /home/michael/.julia/packages/Flux/8XpDt/src/tracker/lib/array.jl:354
  *(adjA::LinearAlgebra.Adjoint{#s623,#s622} where #s622<:AbstractArray{T,2} where #s623, x::AbstractArray{S,1}) where {T, S} in LinearAlgebra at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/matmul.jl:103
Possible fix, define
  *(::LinearAlgebra.Adjoint{#s623,#s622} where #s622<:AbstractArray{T,2} where #s623, ::TrackedArray{S,1,A} where A)
Stacktrace:
 [1] m(::Array{Int64,2}, ::TrackedArray{…,Array{Float64,1}}) at ./REPL[108]:1
 [2] top-level scope at none:0
kmhaeren commented 5 years ago

I'm getting the same error, is there some sort of fix for this issue?

cbeny commented 5 years ago

For now I'm just using collect() when possible, or converting my matrices using TrackedArray() if they are to be passed to a function that will transpose them.

torfjelde commented 5 years ago

EDIT: Seems to be fixed already #37!

Cross-posting from https://github.com/FluxML/Flux.jl/issues/743#issuecomment-527858241 as I didn't realize the issue had been moved over here:

I've attempted a fix, though I'm not too familiar with Tracker yet:

using Tracker, LinearAlgebra

@Tracker.grad function Base.:*(A::Transpose{T, <:AbstractArray{T, 2}}, B::TrackedArray{T, 1}) where {T<:Real}
    return A * Tracker.data(B), Δ -> (nothing, A' * Δ)
end

function Base.:*(A::Transpose{T, <:AbstractArray{T, 2}}, B::TrackedArray{T, 1}) where {T<:Real}
    return Tracker.track(*, A, B)
end

A quick test to ensure I didn't brainfart:

using Random
Random.seed!(1)
A_ = transpose(randn(5, 5))
B_ = randn(5)

A = A_
B = param(B_)
Tracker.back!(sum(A * B), 1.0)
Δ_1 = Tracker.grad(B)

A = param(A_)
B = param(B_)
Tracker.back!(sum(A * B), 1.0)
Δ_2 = Tracker.grad(B)

Δ_1 == Δ_2  # => true

And the example in question:

julia> W = randn(2);

julia> x = randn(2, 7);

julia> transpose(x) * W  # works fine
7-element Array{Float64,1}:
 -0.380285723607731 
  0.6477072367113623
 -0.6557939840691311
  1.2304523024481149
 -0.6836381715827047
 -0.3471918618075099
  0.9095337791636993

julia> W = param(W);

julia> transpose(x) * W  # now also works fine
Tracked 7-element Array{Float64,1}:
 -0.380285723607731 
  0.6477072367113623
 -0.6557939840691311
  1.2304523024481149
 -0.6836381715827047
 -0.3471918618075099
  0.9095337791636993

I can make a PR if this approach seems good. We'll probably have to do the same for re-ordering of the arguments and for Adjoint as I think it faces the same issue.