Closed · hpkfft closed this 3 months ago
I'd like to suggest that nb::ndarray store strides with finer granularity. Two ideas immediately come to mind: measuring strides in bytes, or measuring them in units of sizeof(T) for nb::ndarray<std::complex<T>>. There may be some plausible justification for the latter option if staying closer to the current behavior is desired.
The C++ standard guarantees, for an object z of type std::complex<T>, that reinterpret_cast<T(&)[2]>(z)[0] is the real part of z and reinterpret_cast<T(&)[2]>(z)[1] is the imaginary part of z. Also, a pointer to an array of complex numbers can be reinterpreted as T*, and the resulting real-valued array pointer can be indexed as one would expect: even indices for the real parts and odd indices for the imaginary parts. I guess the point is that complex numbers get special treatment.
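The same interleaved layout is visible from Python with numpy: viewing a complex64 array as float32 exposes the real parts at even indices and the imaginary parts at odd indices. (A small illustrative sketch, not from the original thread.)

```python
import numpy as np

# A complex64 array occupies interleaved float32 storage:
# [re0, im0, re1, im1, ...], mirroring the C++ T[2] layout guarantee.
z = np.array([1 + 2j, 3 + 4j], dtype=np.complex64)
f = z.view(np.float32)  # reinterpret the same bytes as float32

assert f.tolist() == [1.0, 2.0, 3.0, 4.0]
assert np.all(f[0::2] == z.real)  # even indices: real parts
assert np.all(f[1::2] == z.imag)  # odd indices: imaginary parts
```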
```cpp
static_assert(sizeof(uint8_t) == 1);  static_assert(alignof(uint8_t) == 1);
static_assert(sizeof(uint32_t) == 4); static_assert(alignof(uint32_t) == 4);
static_assert(sizeof(float) == 4);    static_assert(alignof(float) == 4);
static_assert(sizeof(std::complex<float>) == 8);
```
AFAIK this is not possible because nanobind centers around DLPack. The ability for nb::ndarray
to talk to NumPy via the buffer protocol is really just there to support older NumPy versions, since DLPack is still a relatively new feature.
In any case, for DLPack, all of these quantities are relative to the itemsize
. So I think this request is not compatible with the design of the library.
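Concretely, DLPack describes strides in element counts, while numpy and the Python buffer protocol use bytes; converting between the two divides by itemsize, which fails when a byte stride is not a whole number of elements. An illustrative numpy sketch:

```python
import numpy as np

a = np.zeros((2, 7), dtype=np.float32)
s = a[:, 0:6].view(np.complex64)   # complex view of a sliced array

# numpy strides are in bytes; DLPack-style strides are in itemsize units.
byte_strides = s.strides            # row stride is 28 bytes
itemsize = s.itemsize               # 8 bytes for complex64

# 28 bytes is NOT a multiple of 8, so the row stride cannot be
# expressed as a whole number of complex64 elements.
assert byte_strides[0] % itemsize != 0
```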
Yes, I see.
```python
>>> import numpy as np
>>> a = np.array([[1, 2, 3, 4, 5, 6, np.NAN],
...               [8, 0, 0, 0, 0, 0, np.NAN]], dtype=np.float32)
>>> s = a[:, 0:6]  # slice
>>> v = s.view(np.complex64)
>>> v.__dlpack__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BufferError: DLPack only supports strides which are a multiple of itemsize.
```
Would it be good for nanobind to throw an exception in the buffer protocol path if strides are not a multiple of itemsize? This would avoid silent data corruption if the stride (in bytes) cannot be correctly converted to itemsize units.
If you like, I can work on a PR for this. I'm a hardware/assembly/C++ guy who is new to python, so your careful code review and any suggestions you have would be welcome.
> Would it be good for nanobind to throw an exception in the buffer protocol path if strides are not a multiple of itemsize? This would avoid silent data corruption if the stride (in bytes) cannot be correctly converted to itemsize units.
Absolutely—if this currently leads to corruption, then it should be fixed. I'm happy to review a PR if you make one.
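On the Python side, a caller can also sidestep the problem by checking for byte strides that are not a multiple of itemsize and copying the array before handing it off. A hypothetical helper, not part of nanobind:

```python
import numpy as np

def dlpack_safe(a: np.ndarray) -> np.ndarray:
    """Return `a` unchanged if every stride is a whole number of
    elements; otherwise return a contiguous copy.  (Hypothetical
    helper for illustration only.)"""
    if all(s % a.itemsize == 0 for s in a.strides):
        return a
    return np.ascontiguousarray(a)

a = np.zeros((2, 7), dtype=np.float32)
v = a[:, 0:6].view(np.complex64)     # row stride of 28 bytes, itemsize 8

safe = dlpack_safe(v)
# The copy is contiguous, so every stride is now a multiple of itemsize.
assert all(s % safe.itemsize == 0 for s in safe.strides)
```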
Closed via #489
Problem description
Numpy allows a new view of an array with the same data, which can cause a reinterpretation of the bytes in memory. In particular, an array with 2N real values per row can be reinterpreted as an array with N complex values per row. [This is useful for FFTs of real data, as the result is complex-valued, and since the Fourier transform is reversible, there is not more "data content" in the output than in the input. So, numpy's views do the right thing.]
The issue arises when the stride from one row to the next is an odd number of real-valued elements. Numpy measures strides in bytes, so a view does not change the stride and everything just works. In nanobind, strides are measured in units of itemsize, so an nb::ndarray cannot correctly represent a stride that is not an integral multiple of the size of the dtype.

Reproducible example code

Python (interactive session):