Closed jakirkham closed 4 months ago
Looks like this code might need some tweaks if we proceed further
Edit: Changes below rewrite this to use a memoryview instead
I have a question: Does pybind11/numpy.h depend on the NumPy header or the NumPy library?
A few things to unpack here
NumPy is atypical in its setup. When building against NumPy, one only #includes the NumPy header. There is not a NumPy library that one links against in the typical sense
However, the symbols that the NumPy header names live in the Python shared objects that the NumPy package ships. Those symbols get loaded when calling import numpy (there is a similar operation that NumPy supplies for use in C APIs). So this is how the symbols get resolved at runtime
Regardless, from a developer's perspective, building against NumPy always means using the header and the libraries. There isn't a way to pick just one or the other
Using pybind11 for NumPy support is not unique in this regard
If this change is about handling NumPy 2, would upgrading the pybind11 library to the latest version help? It looks like pybind11 already handles the NumPy 2 case - https://github.com/pybind/pybind11/blob/master/include/pybind11/numpy.h#L187
Yes, it is true that pybind11 2.12.0 ships with NumPy 2 support. Building against that would be sufficient for NumPy 1 & 2 support (without other changes)
That said, there are relatively few cases where the NumPy API is strictly needed, especially since the introduction of the Python Buffer Protocol. Many use cases (ours in cuCIM is one of these) simply need a way to access the underlying memory buffer of Python objects (NumPy arrays or otherwise). In these cases, it is better to use the Python Buffer Protocol directly (as this code change does), which works not only with NumPy arrays but with any object that supports the protocol. As a result, this simplifies our dependencies. This approach is also more flexible and interoperable with other libraries
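To illustrate the point about the Buffer Protocol, here is a minimal Python sketch (not the cuCIM change itself, which is in C++). It uses only the built-in memoryview to read and write the underlying memory of objects that export a buffer, with no NumPy import. The helper names describe_buffer, sum_bytes, and zero_fill are hypothetical and exist only for this example.

```python
import array


def describe_buffer(obj):
    """Return (format, itemsize, nbytes, shape) for any buffer-exporting object."""
    view = memoryview(obj)  # raises TypeError if obj does not export a buffer
    return view.format, view.itemsize, view.nbytes, view.shape


def sum_bytes(obj):
    """Sum the raw bytes of a C-contiguous buffer without copying it first."""
    return sum(memoryview(obj).cast("B"))


def zero_fill(obj):
    """Overwrite a writable, C-contiguous buffer in place with zero bytes."""
    view = memoryview(obj).cast("B")
    view[:] = bytes(len(view))


# The same helpers work on different exporters: bytes, bytearray, array.array.
print(describe_buffer(bytearray(b"abcd")))
print(describe_buffer(array.array("d", [1.0, 2.0])))
print(sum_bytes(b"\x01\x02\x03"))
```

A NumPy array passed to these helpers would be handled the same way, since it also exports a buffer, but nothing in the sketch depends on NumPy being installed.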
Thanks @jakirkham for the comprehensive explanation! It makes sense, and thank you for the update! 🙂
/merge
Thanks all! 🙏
Partially addresses issue: https://github.com/rapidsai/build-planning/issues/82
Partially addresses issue: https://github.com/rapidsai/build-planning/issues/41
Even though cuCIM currently #includes <pybind11/numpy.h>, the actual C++ code appears not to use NumPy. So this attempts to drop the header and the NumPy build dependency.