JuliaSparse / SparseArrays.jl

SparseArrays.jl is a Julia stdlib
https://sparsearrays.juliasparse.org/
Other
90 stars 52 forks source link

`vcat`/`hcat`/`hvcat`-related method ambiguities #431

Open jishnub opened 1 year ago

jishnub commented 1 year ago

The type-pirated vcat methods introduced here are leading to method ambiguities on Julia v1.10, of the form:

special vcat: Error During Test at /home/jishnu/Dropbox/JuliaPackages/InfiniteArrays.jl/test/runtests.jl:721
  Test threw exception
  Expression: [randn(2, 2); Zeros(∞, 2)] isa Vcat{<:Any, 2, <:Tuple{Array, Zeros}}
  MethodError: vcat(::Matrix{Float64}, ::Zeros{Float64, 2, Tuple{OneToInf{Int64}, Base.OneTo{Int64}}}) is ambiguous.

  Candidates:
    vcat(a::AbstractMatrix, b::FillArrays.AbstractFill{<:Any, 2, <:Tuple{OneToInf, Base.OneTo}})
      @ InfiniteArrays ~/Dropbox/JuliaPackages/InfiniteArrays.jl/src/InfiniteArrays.jl:121
    vcat(X::Union{Number, AbstractVecOrMat{<:Number}}...)
      @ SparseArrays ~/packages/julias/julia-1.10/share/julia/stdlib/v1.10/SparseArrays/src/sparsevector.jl:1229

This test passes on v1.9.3, but it fails on v1.10.0-beta2 because of the new method added here.

prbzrg commented 10 months ago

there are a lot of reported issues about this problem. github search: https://github.com/search?q=vcat+ambiguous+language%3AJulia&type=issues

dlfivefifty commented 9 months ago

It looks like Julia itself needs something like ConcatStyle a la BroadcastStyle.

LazyArrays.jl and InfiniteArrays.jl would really benefit from such a thing too.

aplavin commented 9 months ago

Should this be fixed soon, possibly removing the method/making it more specific not to cause ambiguity?

Sparse arrays themselves aren't used too often, but SparseArrays.jl is a dependency of huge amount of Julia code because Statistics.jl depends on it. The

  vcat(X1::Union{Number, AbstractVecOrMat{<:Number}}, X::Union{Number, AbstractVecOrMat{<:Number}}...)

method makes any

vcat(::MyArray{T}, ::MyArray{T}) where {T}

ambiguous for the case of MyArray{<:Number}. Surely we wouldn't want every array type to define

vcat(::MyArray{<:Number}, ::MyArray{<:Number})

specifically, in addition to eltype-agnostic vcat!

Recent julia repo issues: https://github.com/JuliaLang/julia/issues/52386 https://github.com/JuliaLang/julia/issues/52263

KristofferC commented 9 months ago

Should this be fixed soon, possibly removing the method/making it more specific not to cause ambiguity?

If you can fix this in a good way doing this it would be helpful.

To me, it just seems that the whole vcat overloading story is a complete mess and there is no quick fix for this. I guess that packages will just have to define the extra method, for now, to fix the ambiguity as a short-term solution.

aplavin commented 4 months ago

In case others find this issue and want a simple reliable workaround – here it is:

Base.delete_method.(filter(
    m -> nameof(m.module) == :SparseArrays && m.sig == Tuple{typeof(vcat), Union{Number,AbstractVecOrMat{<:Number}}, Vararg{Union{Number,AbstractVecOrMat{<:Number}}}},
    methods(vcat)
))
Base.delete_method.(filter(
    m -> nameof(m.module) == :SparseArrays && m.sig == Tuple{typeof(hcat), Union{Number,AbstractVecOrMat{<:Number}}, Vararg{Union{Number,AbstractVecOrMat{<:Number}}}},
    methods(hcat)
))

Put this code into your script/code anywhere after package imports. It deletes those pirating methods defined in SparseArrays, getting rid of such ambiguities. At the same time, it doesn't interfere with vcat/hcat on actual sparse arrays – they continue returning sparse arrays:

```julia julia> vcat(sparse([0,1]), sparse([0,1])) 4-element SparseVector{Int64, Int64} with 2 stored entries: [2] = 1 [4] = 1 ```
LilithHafner commented 4 months ago

Concretely,

julia> struct MyArray{N, T} <: AbstractArray{N, T}
           data::Array{N, T}
       end

julia> Base.IndexStyle(::Type{<:MyArray}) = IndexLinear()

julia> Base.setindex(x::MyArray, v, i) = (Base.setindex(x.data, v, i); x)

julia> Base.getindex(x::MyArray, i) = x.data[i]

julia> Base.size(x::MyArray) = size(x.data)

julia> Base.vcat(x::AbstractMatrix, y::MyArray) = MyArray(vcat(x, y.data))

julia> vcat(MyArray(rand(2,2)), MyArray(rand(2,2)))
4×2 MyArray{Float64, 2}:
 0.244229  0.612965
 0.743984  0.46191
 0.321956  0.62292
 0.937136  0.0216649

julia> using SparseArrays

julia> vcat(MyArray(rand(2,2)), MyArray(rand(2,2)))
ERROR: MethodError: vcat(::MyArray{Float64, 2}, ::MyArray{Float64, 2}) is ambiguous.

Candidates:
  vcat(x::AbstractMatrix, y::MyArray)
    @ Main REPL[6]:1
  vcat(X1::Union{Number, AbstractVecOrMat{<:Number}}, X::Union{Number, AbstractVecOrMat{<:Number}}...)
    @ SparseArrays ~/.julia/juliaup/julia-1.10.3+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/SparseArrays/src/sparsevector.jl:1235

Possible fix, define
  vcat(::AbstractMatrix{<:Number}, ::Union{MyArray{<:Number, 1}, MyArray{<:Number, 2}})

Stacktrace:
 [1] top-level scope
   @ REPL[9]:1

However, note that MyArray is more specific than Union{Number, AbstractVecOrMat{<:Number}, so this will not occur if the package only defines vcat(::MyArray, ::MyArray). Additionally, if package A defines vcat(::AbstractArray, ::AArray) and package B defines vcat(::BArray, ::AbstractArray), then there will be ambiguities, so in general one most be ambiguity-aware when defining methods which one does not fully own, unless there is a canonical parameter to dispatch on (single dispatch is easier to disambiguate than multiple dispatch).

julia> Base.morespecific(MyArray, Union{Number, AbstractVecOrMat{<:Number}})
true
aplavin commented 4 months ago

if package A defines vcat(::AbstractArray, ::AArray) and package B defines vcat(::BArray, ::AbstractArray), then there will be ambiguities

That's totally reasonable and expected: in such a scenario, it is indeed not clear how to vcat(BArray, AArray). And one wouldn't really be surprised if SparseArrays defined vcat(SparseArray, AbstractArray). The issue with current SparseArrays vcat definition is that it breaks code that doesn't use any sparse arrays whatsoever.

LilithHafner commented 4 months ago

This effects vcat, hcat, and hvcat, but not cat.

[same intro as above]

julia> Base.hvcat(rows::Tuple{Vararg{Int64}}, x::AbstractMatrix, y::MyArray) = MyArray(hvcat(rows, x, y.data))

julia> hvcat((1,1), MyArray(rand(2,2)), MyArray(rand(2,2)))
4×2 MyArray{Float64, 2}:
 0.989625  0.272548
 0.577782  0.730749
 0.884814  0.436665
 0.202579  0.490402

julia> using SparseArrays

julia> hvcat((1,1), MyArray(rand(2,2)), MyArray(rand(2,2)))
ERROR: MethodError: hvcat(::Tuple{Int64, Int64}, ::MyArray{Float64, 2}, ::MyArray{Float64, 2}) is ambiguous.

Candidates:
  hvcat(rows::Tuple{Vararg{Int64}}, x::AbstractMatrix, y::MyArray)
    @ Main REPL[6]:1
  hvcat(rows::Tuple{Vararg{Int64}}, X1::Union{Number, AbstractVecOrMat{<:Number}}, X::Union{Number, AbstractVecOrMat{<:Number}}...)
    @ SparseArrays ~/.julia/juliaup/julia-1.11.0-beta1+0.aarch64.linux.gnu/share/julia/stdlib/v1.11/SparseArrays/src/sparsevector.jl:1279

Possible fix, define
  hvcat(::Tuple{Vararg{Int64}}, ::AbstractMatrix{<:Number}, ::Union{MyArray{<:Number, 1}, MyArray{<:Number, 2}})

Stacktrace:
 [1] top-level scope
   @ REPL[10]:1
LilithHafner commented 4 months ago

At the same time, it doesn't interfere with vcat/hcat on actual sparse arrays – they continue returning sparse arrays:

It does change the behavior when concatenating non-homogeneous arrays:

julia> using SparseArrays

julia> vcat([0 1; 1 0], sparse([0 1; 1 0]))
4×2 SparseMatrixCSC{Int64, Int64} with 4 stored entries:
 ⋅  1
 1  ⋅
 ⋅  1
 1  ⋅

julia> Base.delete_method.(filter(
           m -> nameof(m.module) == :SparseArrays && m.sig == Tuple{typeof(vcat), Union{Number,AbstractVecOrMat{<:Number}}, Vararg{Union{Number,AbstractVecOrMat{<:Number}}}},
           methods(vcat)
       ))
1-element Vector{Nothing}:
 nothing

julia> vcat([0 1; 1 0], sparse([0 1; 1 0]))
4×2 Matrix{Int64}:
 0  1
 1  0
 0  1
 1  0