Suzhou-Tongyuan / jnumpy

Writing Python C extensions in Julia within 5 minutes.
MIT License

Check if numpy could automatically set contiguous flag when passing wrong information #56

Open songjhaha opened 2 years ago

songjhaha commented 2 years ago

Actually, we don't always set the right contiguous flags when constructing a PyArrayInterface. For example, a Vector should be both C- and F-contiguous when converted to an ndarray, but we set is_c_style to false when constructing its DynamicArray. After casting it to an ndarray, numpy seems to reset this flag to true, which is correct. I'm not sure whether numpy performs this check in every situation, so this needs to be checked carefully.

a = rand(3)
ad = CPython.DynamicArray(a)
ad.is_c_style # false

pya = py_cast(Py, a)
pya.flags
# Py(  C_CONTIGUOUS : True
#   F_CONTIGUOUS : True
# ...

# another example with subarray
b = @view rand(3,3)[2:3, :]
bd = CPython.DynamicArray(b)
bd.is_c_style # false
bd.is_f_style # false
# reconstruct a DynamicArray with deliberately wrong contiguous flags
bd_wrong = CPython.DynamicArray(bd.arr, bd.eltype, bd.shape, bd.strides, bd.ptr, bd.ndim, bd.itemsize, bd.typekind, true, true, false, bd.perm)
pyb = py_coerce(Py, bd_wrong)
pyb.flags
# Py(  C_CONTIGUOUS : False
#   F_CONTIGUOUS : False
# ...

songjhaha commented 2 years ago

If it's true that numpy sets the contiguous flags automatically, the next question is: could we just wrap every StridedArray without a copy?

normalized_x = if x isa StridedArray
    x
else
    collect(x)
end

It seems numpy works correctly as long as the data pointer, strides, shape and ndim are correct, so it should support every StridedArray. But let me check the mechanism behind numpy first.
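
To illustrate why pointer, shape and strides should be enough, here is a minimal Python sketch (standard library only, not jnumpy or numpy code). It reads the elements of a strided view straight out of a column-major buffer without copying. The helper name `element_offsets` and the 3x3 Float64 layout are assumptions chosen to mirror the `rand(3,3)[2:3, :]` view above.

```python
import struct
from array import array
from itertools import product


def element_offsets(shape, strides, base=0):
    """Byte offset of every element, visiting indices in row-major order."""
    return [
        base + sum(i * s for i, s in zip(index, strides))
        for index in product(*(range(n) for n in shape))
    ]


# Assumed layout: a 3x3 Float64 matrix stored column-major, so the element
# at (row, col) holds the value row + 3 * col.
buf = array("d", [float(v) for v in range(9)])

# Rows 2:3 (1-based) of that matrix: the data pointer starts one element
# (8 bytes) into the buffer, the shape is (2, 3), the byte strides are (8, 24).
offsets = element_offsets((2, 3), (8, 24), base=8)

# Read each element directly from the parent buffer, no copy of the data.
values = [struct.unpack_from("d", buf, off)[0] for off in offsets]
```

Every offset stays inside the 72-byte parent buffer, which is the property a consumer needs in order to wrap the view in place.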

songjhaha commented 2 years ago

From the numpy C-API documentation:

The NPY_ARRAY_ALIGNED, NPY_ARRAY_C_CONTIGUOUS, and NPY_ARRAY_F_CONTIGUOUS flags can actually be determined from the other parameters

https://numpy.org/doc/stable/reference/c-api/types-and-structures.html#c.PyArrayInterface.flags

numpy can update the flags with: https://github.com/numpy/numpy/blob/4a9b7145eb1a0731821eb8912c2474352cadc682/numpy/core/src/multiarray/flagsobject.c#L62

A simple rule for checking contiguity is given at: https://github.com/numpy/numpy/blob/4a9b7145eb1a0731821eb8912c2474352cadc682/numpy/core/src/multiarray/flagsobject.c#L91-L115
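
For reference, a rough Python sketch of that kind of rule, as I read it. This is a paraphrase under assumptions, not a transcription of numpy's C code: the function name `contiguity_flags` is hypothetical, and strides are taken in bytes. Walking the dimensions from last to first checks C order, and from first to last checks F order. Dimensions of length 1 are skipped because their stride is never used.

```python
def contiguity_flags(shape, strides, itemsize):
    """Return (c_contiguous, f_contiguous) derived from shape and byte strides."""
    # An array with a zero-length dimension holds no elements, so it is
    # treated as contiguous in both orders.
    if any(dim == 0 for dim in shape):
        return True, True

    def matches(order):
        expected = itemsize
        for i in order:
            if shape[i] != 1:  # the stride of a length-1 dimension is irrelevant
                if strides[i] != expected:
                    return False
                expected *= shape[i]
        return True

    ndim = len(shape)
    c_contiguous = matches(range(ndim - 1, -1, -1))  # last axis varies fastest
    f_contiguous = matches(range(ndim))              # first axis varies fastest
    return c_contiguous, f_contiguous


# A 3-element Float64 vector: contiguous in both orders.
vector_flags = contiguity_flags((3,), (8,), 8)

# Rows 2:3 of a column-major 3x3 Float64 matrix: contiguous in neither order.
view_flags = contiguity_flags((2, 3), (8, 24), 8)

# The full column-major 3x3 matrix: F-contiguous only.
matrix_flags = contiguity_flags((3, 3), (8, 24), 8)
```

These results agree with the flags numpy reported in the two examples at the top of this issue, which suggests the flags we pass do not have to be trusted when shape and strides are right.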