Shimuuar / vecvec

0 stars 1 forks source link

Comparison of "slice/stride" implementation in various libraries. #28

Open dmalkr opened 7 months ago

dmalkr commented 7 months ago

Numpy

Numpy uses strides property for each np-array object. Strides are represented by tuples: each tuple element for each dimension. Stride is the byte offset to next item (item size described by dtype). For example, we can easily construct a 5-byte stride for a 4-byte int32 array (and yes, array elements can be un-aligned in Numpy).

References:

BTW, Numpy uses copy-to-temp-array technique to copy (np.copyto()) overlapped arrays. And it doesn't try to do those tricky transpositions like in #5. I suspect this is because the stride may not be a multiple of the element size. Implementation copyto uses array_copyto, which uses PyArray_AssignArray: https://github.com/numpy/numpy/blob/main/numpy/_core/src/multiarray/array_assign_array.c#L305

Julia

Strides in Julia implemented BLAS/LAPACK compatible:

https://docs.julialang.org/en/v1/manual/arrays/#Implementation

An array is "strided" if it is stored in memory with well-defined spacings (strides) between its elements. A strided array with a supported element type may be passed to an external (non-Julia) library like BLAS or LAPACK by simply passing its pointer and the stride for each dimension. The stride(A, d) is the distance between elements along dimension d.

So nothing new here.

And as for #5, Julia simply uses memmove for copying contiguous arrays (https://github.com/JuliaLang/julia/blob/master/base/array.jl#L273), and element-wise copying in the case of strided arrays, views, or other non-contiguous arrays (https://github.com/JuliaLang/julia/blob/master/base/multidimensional.jl#L1154).

So we can show that overwriting elements as described here https://github.com/Shimuuar/vecvec/blob/71219aacf6aed2a844e388cb5e7dad5e46633bd5/vecvec-lapack/Vecvec/LAPACK/Internal/Vector/Mutable.hs#L222-L239 can occurs:

julia> a = [0,1,2,3,4,5,6,7,8,9,10]
julia> src = @view a[4:8]
5-element view(::Vector{Int64}, 4:8) with eltype Int64:
 3
 4
 5
 6
 7
julia> tgt=@view a[2:2:10]
5-element view(::Vector{Int64}, 2:2:10) with eltype Int64:
 1
 3
 5
 7
 9
julia> copyto!(tgt, src)
5-element view(::Vector{Int64}, 2:2:10) with eltype Int64:
 3
 4
 5
 6
 7

julia> a
11-element Vector{Int64}:
  0
  3
  2
  4
  4
  5
  6
  6 <--- must be "7"!
  8
  7
 10

C++, GCC implementation

So I tried copying with slice_array from std. It has the same problem as Julia above, elements copied from left to right, at least in GCC std library (but it seems it may be different implementation in other std libraries). The operator=() from slice_array simply uses __valarray_copy(), which copies elements from left to right: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/valarray_array.h#L218-L223 . Testing:

#include <valarray>
#include <iostream>

void
print(char const* remark, std::valarray<int> const& v = {})
{
    std::cout << remark;
    if (v.size() != 0)
    {
        std::cout << ':';
        for (int e : v)
            std::cout << ' ' << e;
    }
    std::cout << '\n';
}

int
main() {
    std::valarray<int> vec{0,1,2,3,4,5,6,7,8,9,10};
    print("original", vec);
    size_t sz = 5;
    std::slice_array<int> source = vec[std::slice(3, sz, 1)];
    print("slice (3, 5, 1)", source);
    std::slice_array<int> destination = vec[std::slice(1, sz, 2)];
    print("slice (1, 5, 2)", destination);
    destination = source;
    print("result  ", vec);
    // original                 0 1 2 3 4 5 6 7 8 9 10
    // source                         3 4 5 6 7
    // target                     1   3   5   7   9
    // modified elements        _ 3 _ 4 _ 5 _ 6 _ 7 _
    // expected                 0 3 2 4 4 5 6 6 8 7 10
    std::valarray<int> expected{0,3,2,4,4,5,6,6,8,7,10};
    print("expected", expected);

}

output:

% g++ -std=c++20 -O3 test.cpp -o test && ./test  
original: 0 1 2 3 4 5 6 7 8 9 10
slice (3, 5, 1): 3 4 5 6 7
slice (1, 5, 2): 1 3 5 7 9
result  : 0 3 2 4 4 5 6 6 8 6 10
expected: 0 3 2 4 4 5 6 6 8 7 10
diff here:                  ^
Shimuuar commented 7 months ago

That's quite neat trick. It allows to make array of fields of structs out of array of structs. Consider for example: struct complex double foo; double bar;}. So you can turn of struct to array referencing it's field foo. And that won't be possible with Vec.

It won't work for us however. BLAS API expects stride with stride as multiple of element size. And we build around BLAS and don't have a lot C structs to work with

dmalkr commented 7 months ago

@Shimuuar Some Julia's nuts&bolts extracted :)

dmalkr commented 4 months ago

@Shimuuar Some C++ stuff :)