Open johnbcoughlin opened 1 year ago
This is a LoopVectorization broadcasting issue. When contiguous axis are dynamically broadcasted, the behavior is undefined.
That is, when an axis that would normally be contiguous in memory is of size 1
but that this is not known through the type system, while the corresponding axis in other arrays is not of size 1
, this results in undefined behavior.
A PR to LV to fix this would be welcome.
It should be relatively straight forward, e.g. in vmaterialize!
, check if this is the case, and if so fall back to some slow method.
You could also generate code for an optimized case where linear indexing works.
You could look to the source of FastBroadcast.jl for ideas.
I've run into this as well. How do I need to restrict the versions of Loopvectorisation and Stridearrays to avoid this?
Or at least I think my problem is the same??
using StrideArrays, LinearAlgebra
_A = zeros(10,10)
A = PtrArray(_A)
A[:, :] .= randn(10,10)
b = randn(10)
d = sum( b[j] * A[1,j] for j = 1:10 )
@show dot(A[1, :], b) ≈ d # false
@show dot((@view A[1,:]), b) ≈ d # false
@show sum(A[1,:] .* b ) ≈ d # false
a1 = collect(@view A[1,:])
@show sum(a1 .* b ) ≈ d # true
Or at least I think my problem is the same??
No, actually. Fixed with LoopVectorization 0.12.163 + StrideArraysCore 0.4.17.
fantastic, thank you!
Using the latest version, I get garbage when broadcasting across a singleton dimension:
Note the numerical garbage in the trailing rows, which seems to come from uninitialized memory. This also occurs when broadcasting in-place; the destination will end up with stuff from uninitialized memory.
I think I have enabled boundschecking, via