Closed QuLogic closed 2 weeks ago
I changed the test to return a SimpleNamespace
of the information instead of a pybind11-wrapped struct thing. This way, I can release the buffer and not leak stuff.
I still don't know what's wrong with PyPy and GraalPy; I can't get a proper backtrace out of PyPy, and no idea how to install GraalPy.
I'll have a look at the GraalPy failure
Ok, so failure is on test-buffers.py:112
. What happens is that the struct.unpack_from
call requests a buffer with PyBUF_SIMPLE
, but you give it a buffer that has ndim = 2
, but shape = NULL
. GraalPy internally constructs a memoryview of the buffer and memoryview constructor assumes that if ndim > 1
then shape
is filled.
You'd run into the same problem on CPython if you do PyObject_GetBuffer(obj, &buffer, PyBUF_SIMPLE)
and then do PyMemoryView_FromBuffer(buffer)
. It's going to dereference shape
without checking it for NULL
at https://github.com/python/cpython/blob/main/Objects/memoryobject.c#L572.
In conclusion, setting ndim > 1
without PyBUF_ND
flag is invalid. The doc isn't clear on that, but apparently all 3 implementations have that assumption. I think it makes sense if you think about the semantics. If the caller didn't specify they handle multidimensional buffers, you should't give them one, you should make the buffer one-dimensional. So I think you should keep setting ndim = 1
unless PyBUF_ND
flag is set. memoryview
itself does the same, multidimensional memoryview would return one-dimensional buffer from PyObject_GetBuffer(mv, &buffer, PyBUF_SIMPLE)
or error our if it's not C contiguous.
I have taken a look at this, and read through the docs too. Whilst it clearly states that ndim
must always be set and valid it would seem that the value is incorrectly >1 in certain situations in several implementations. Reading around I also don't see the value in supporting a value of ndim
!= 1 unless it is PyBUF_ND
or higher - what does that mean practically?
It looks like restoring the setting of ndim
to 1 by default and moving the setting otherwise to the PyBUF_ND
would yield the correct behavior even if it is a little at odds with the documentation. Otherwise it would be good to see examples of PyBUF_SIMPLE
with a value other than ndim = 1
.
@msimacek Thanks for the investigation. Reading through your explanation, and the docs again, I think I understand the confusion better now.
From the docs, we have:
The following fields are not influenced by flags and must always be filled in with the correct values: obj, buf, len, itemsize, ndim.
which seems to suggest the values should match the buffer as it exists. So I interpreted the flags
to mean "I only need to know this information", and it seems like (other than ndim
) that's how pybind11_getbuffer
is implemented.
But then looking back at the previous section:
Buffers are usually obtained by sending a buffer request to an exporting object via PyObject_GetBuffer(). Since the complexity of the logical structure of the memory can vary drastically, the consumer uses the flags argument to specify the exact buffer type it can handle.
which indicates that flags
is more of a collaboration, i.e., the caller is saying "please give me this type of buffer, or less, but no more complex than that."
So in that sense, the "request-independent fields" should always be correct, but that term is not specified. I read that to be correct "with regards to the underlying buffer", but it is actually correct "as the buffer is exposed", which are not necessarily the same things. For example, a 2D (M, N)-shaped buffer can be exported as an (M*N) 1D buffer if it's contiguous. But then the "request-independent" in the section title is a bit of a misnomer, as ndim
does depend on the flags.
What I think that means though is that there is a different underlying bug, because pybind11_getbuffer
does not verify that the buffer is contiguous if flags are < PyBUF_STRIDES
, but simply exposes the buffer as-is.
So in that sense, the "request-independent fields" should always be correct, but that term is not specified. I read that to be correct "with regards to the underlying buffer", but it is actually correct "as the buffer is exposed", which are not necessarily the same things. For example, a 2D (M, N)-shaped buffer can be exported as an (M*N) 1D buffer if it's contiguous. But then the "request-independent" in the section title is a bit of a misnomer, as
ndim
does depend on the flags.
If shape, strides and suboffsets are all null how would the shape be represented in such a scenario? Doesn't PyBUF_SIMPLE
imply contiguous with no field to represent shape in this case?
But then the "request-independent" in the section title is a bit of a misnomer, as
ndim
does depend on the flags.
Documentation is usually the least reliable part of a software distribution. It was written by humans that probably don't want to spend their time writing documentation and may have omissions and inconsistencies. I'm looking at documentation as useful to get the big picture, but easily misleading when it comes to details. If that happens, it's important to not take the documentation too literally, and instead focus on what works best for the ecosystem.
In this particular case: PyBUF_SIMPLE
goes with ndim = 1
. I wouldn't look any further than that. Unless someone comes back here later and presents an actual problem.
For now, let's just fix the obvious.
If shape, strides and suboffsets are all null how would the shape be represented in such a scenario?
It won't be, but that's my point: pybind11 should raise an error if the memory block doesn't fit the requirements of the caller, and it doesn't do so right now.
Doesn't
PyBUF_SIMPLE
imply contiguous with no field to represent shape in this case?
It implies contiguity of the memory block, but not necessarily the shape of the buffer itself. If the buffer can be reduced to a contiguous 1D buffer, then it is allowed.
Here's an example from NumPy using struct.unpack_from
(which @msimacek noted above uses PyBUF_SIMPLE
):
>>> a = np.full((3, 5), 30.0) # Allocate 2D buffer at once.
>>> a.flags.c_contiguous
True
>>> a.size
15
>>> a.shape
(3, 5)
>>> a.strides
(40, 8)
so we have a 2D buffer with a C-contiguous block of memory. In this case, struct.unpack_from
works:
>>> struct.unpack_from(f'{a.size}d', a)
(30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0)
If we then take a slice and make the block discontiguous:
>>> b = a[::2, ::2]
>>> b.size
6
>>> b.shape
(2, 3)
>>> b.strides
(80, 16)
>>> b.flags.c_contiguous
False
>>> b.flags.f_contiguous
False
>>> b.flags.contiguous
False
then struct.unpack_from
no longer works:
>>> struct.unpack_from(f'{b.size}d', b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: ndarray is not C-contiguous
Unless someone comes back here later and presents an actual problem.
We can replicate something like the above in pybind11:
class Block {
public:
Block(py::size_t height, py::size_t width) : m_height(height), m_width(width) {
// Allocate a buffer that is 10 times bigger in each direction.
m_data.resize(height * 10 * width * 10);
// Just something to see what's found in memory.
auto count = 0;
for (auto &&val : m_data) {
val = count;
count++;
}
}
double *get_buffer() {
return m_data.data();
}
py::size_t height() const { return m_height; }
py::size_t width() const { return m_width; }
private:
py::size_t m_height;
py::size_t m_width;
std::vector<double> m_data;
};
PYBIND11_MODULE(test, m, py::mod_gil_not_used())
{
py::class_<Block>(m, "Block", py::is_final(), py::buffer_protocol())
.def(py::init<py::size_t, py::size_t>())
.def_buffer([](Block &self) -> py::buffer_info {
std::vector<py::size_t> shape { self.height(), self.width() };
std::vector<py::size_t> strides { self.width()*10 * 10*sizeof(double), 10*sizeof(double) };
return py::buffer_info(self.get_buffer(), shape, strides);
});
}
If we ask NumPy, which obeys strides
:
>>> b = Block(3, 5)
>>> np.array(b)
array([[ 0., 10., 20., 30., 40.],
[ 500., 510., 520., 530., 540.],
[1000., 1010., 1020., 1030., 1040.]])
If we ask struct.unpack_from
(which asks for a simple buffer):
>>> struct.unpack_from('15d', b)
(0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0)
These values are completely wrong, and pybind11 should raise like NumPy.
Please apply Marcus' suggestions.
(Please do not --amend and do not force push.)
In a follow-on commit please implement what you have in mind, ideally with matching test for numpy behavior.
Get Outlook for iOShttps://aka.ms/o0ukef
From: Elliott Sales de Andrade @.> Sent: Friday, November 1, 2024 8:06:45 PM To: pybind/pybind11 @.> Cc: Ralf Grosse Kunstleve @.>; Comment @.> Subject: Re: [pybind/pybind11] Fix buffer protocol implementation (PR #5407)
If shape, strides and suboffsets are all null how would the shape be represented in such a scenario?
It won't be, but that's my point: pybind11 should raise an error if the memory block doesn't fit the requirements of the caller, and it doesn't do so right now.
Doesn't PyBUF_SIMPLE imply contiguous with no field to represent shape in this case?
It implies contiguity of the memory block, but not necessarily the shape of the buffer itself. If the buffer can be reduced to a contiguous 1D buffer, then it is allowed.
Here's an example from NumPy using struct.unpack_from (which @msimacekhttps://github.com/msimacek noted above uses PyBUF_SIMPLE):
a = np.full((3, 5), 30.0) # Allocate 2D buffer at once. a.flags.c_contiguous True a.size 15 a.shape (3, 5) a.strides (40, 8)
so we have a 2D buffer with a C-contiguous block of memory. In this case, struct.unpack_from works:
struct.unpack_from(f'{a.size}d', a) (30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0)
If we then take a slice and make the block discontiguous:
b = a[::2, ::2] b.size 6 b.shape (2, 3) b.strides (80, 16) b.flags.c_contiguous False b.flags.f_contiguous False b.flags.contiguous False
then struct.unpack_from no longer works:
struct.unpack_from(f'{b.size}d', b) Traceback (most recent call last): File "
", line 1, in ValueError: ndarray is not C-contiguous
Unless someone comes back here later and presents an actual problem.
We can replicate something like the above in pybind11:
class Block { public: Block(py::size_t height, py::size_t width) : m_height(height), m_width(width) { // Allocate a buffer that is 10 times bigger in each direction. m_data.resize(height 10 width * 10); // Just something to see what's found in memory. auto count = 0; for (auto &&val : m_data) { val = count; count++; } }
double *get_buffer() {
return m_data.data();
}
py::size_t height() const { return m_height; }
py::size_t width() const { return m_width; }
private:
py::size_t m_height;
py::size_t m_width;
std::vector
PYBIND11_MODULE(test, m, py::mod_gil_notused())
{
py::class
If we ask NumPy, which obeys strides:
b = Block(3, 5) np.array(b) array([[ 0., 10., 20., 30., 40.], [ 500., 510., 520., 530., 540.], [1000., 1010., 1020., 1030., 1040.]])
If we ask struct.unpack_from (which asks for a simple buffer):
struct.unpack_from('15d', b) (0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0)
These values are completely wrong, and pybind11 should raise like NumPy.
— Reply to this email directly, view it on GitHubhttps://github.com/pybind/pybind11/pull/5407#issuecomment-2452825075, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAFUZAGDTYD6DB56GMKLBH3Z6Q6ULAVCNFSM6AAAAABPYCSMVKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINJSHAZDKMBXGU. You are receiving this because you commented.Message ID: @.***>
I applied the changes separately, but the last commit basically flips the implementation around. Instead of starting minimal and adding more information based on the flags, it starts with all information and pares it back to match the request from the caller. If the underlying buffer cannot be exposed in a simplified form, then an error is raised.
I added tests for C/Fortran/any contiguities, and that the strides mapped into NumPy correctly.
To compare with NumPy, you can swap:
Matrix(5, 4)
with np.empty((5, 4), dtype=np.float32)
FortranMatrix(5, 4)
with np.empty((5, 4), dtype=np.float32, order='F')
DiscontiguousMatrix(5, 4, 2, 3)
with np.empty((5*2, 4*3), dtype=np.float32)[::2, ::3]
Also replace mat.rows()
with 5 and mat.cols()
with 4 (maybe I should do that so we're always comparing with 'known' values instead of internals?)Then running the tests show two differences:
PyBUF_SIMPLE
, NumPy returns ndim=0
, which I guess is truer in spirit, and we could do here too.ValueError
instead of BufferError
, which is maybe a bug on their part, but close enough to the same.This looks great, and I really appreciate you finding examples @QuLogic that illustrate why the simpler solution is not appropriate. I agree with suggestions from @rwgk, this is a great enhancement and it will really help with corner cases I had not initially considered.
Description
According to the buffer protocol,
ndim
is a required field [1], and should always be set correctly. Additionally,shape
should be set if flags includesPyBUF_ND
or higher [2]. The current implementation only set those fields if flags wasPyBUF_STRIDES
.Note, the tests currently leak, since they don't callI changed it to aPyBuffer_Release
, as figuring out how to make a custom deleter work there is not something I've understood. Should I just immediately convert to aSimpleNamespace
or similar instead or wrapping the struct?SimpleNamespace
.[1] https://docs.python.org/3/c-api/buffer.html#request-independent-fields [2] https://docs.python.org/3/c-api/buffer.html#shape-strides-suboffsets
Suggested changelog entry:
PS, conventional commits are only mentioned in the PR template; they should be mentioned in https://github.com/pybind/pybind11/blob/af67e87393b0f867ccffc2702885eea12de063fc/.github/CONTRIBUTING.md so one knows that before committing...