JuliaSIMD / StrideArrays.jl

Library supporting the ArrayInterface.jl strided array interface.
MIT License
54 stars 9 forks source link

Error when broadcasting across singleton dimension #75

Open johnbcoughlin opened 1 year ago

johnbcoughlin commented 1 year ago

Using the latest version, I get garbage when broadcasting across a singleton dimension:

pkg> st StrideArrays
...
  [d1fa6d79] StrideArrays v0.1.25
julia> f = @StrideArray rand(3, 3);

julia> v = LinRange(-5, 0, 3) |> collect;

julia> ^C

julia> f .* reshape(v, (1, :))
3×3 StrideArraysCore.StaticStrideArray{Float64, 2, (1, 2), Tuple{StaticInt{3}, StaticInt{3}}, Tuple{Nothing, Nothing}, Tuple{StaticInt{1}, StaticInt{1}}, 9} with indices static(1):static(3)×static(1):static(3):
 -0.416778  -0.925133      0.0
 -0.864525   0.0           6.65141e-315
  0.0        1.05309e-314  6.31801e-314

julia> f .* reshape(v, (1, :)) == Array(f) .* reshape(v, (1, :))
false

Note the numerical garbage in the trailing rows, which seems to come from uninitialized memory. This also occurs when broadcasting in-place; the destination will end up with stuff from uninitialized memory.

I think I have enabled boundschecking, via

StrideArraysCore.boundscheck() = true
chriselrod commented 1 year ago

This is a LoopVectorization broadcasting issue. When contiguous axis are dynamically broadcasted, the behavior is undefined.

That is, when an axis that would normally be contiguous in memory is of size 1 but that this is not known through the type system, while the corresponding axis in other arrays is not of size 1, this results in undefined behavior.

A PR to LV to fix this would be welcome. It should be relatively straight forward, e.g. in vmaterialize!, check if this is the case, and if so fall back to some slow method. You could also generate code for an optimized case where linear indexing works.

You could look to the source of FastBroadcast.jl for ideas.

cortner commented 1 year ago

I've run into this as well. How do I need to restrict the versions of Loopvectorisation and Stridearrays to avoid this?

Or at least I think my problem is the same??

using StrideArrays, LinearAlgebra
_A = zeros(10,10)
A = PtrArray(_A)
A[:, :] .= randn(10,10)
b = randn(10)
d = sum( b[j] * A[1,j] for j = 1:10 )
@show        dot(A[1, :], b) ≈ d   # false
@show dot((@view A[1,:]), b) ≈ d   # false
@show      sum(A[1,:] .* b ) ≈ d   # false
a1 = collect(@view A[1,:])
@show          sum(a1 .* b ) ≈ d   # true
chriselrod commented 1 year ago

Or at least I think my problem is the same??

No, actually. Fixed with LoopVectorization 0.12.163 + StrideArraysCore 0.4.17.

cortner commented 1 year ago

fantastic, thank you!