Open shrealist opened 1 year ago
I looked further into the cause and found the following situation:
root [9] gROOT->GetType("int")
(TDataType *) 0x1d30a8e1320
root [10] gROOT->GetType("const int")
(TDataType *) nullptr
This might be helpful in addressing the issue.
I'm not sure I follow? I'm not seeing SizeOf("const int")
happening?
The reason the code doesn't work is that there's no automatic numpy array -> initializer_list conversion. A tuple or list OTOH works fine.
I'd like to address the issue regarding the use of std::span
The main concern with lists or vectors is the additional overhead they create. They typically involve extra memory copying, which can substantially slow down the operation, especially when dealing with large data sets. In contrast, using std::span effectively leverages the pointer within the NumPy array directly. This results in considerably lower overhead as it avoids unnecessary copying of the data.
The essence of using std::span in this scenario is to optimize performance by minimizing the memory footprint and reducing the computational load. This optimization is crucial in scenarios where efficiency and speed are of paramount importance.
I believe resolving the issue with Cppyy::SizeOf("const int") returning 0 will enable us to utilize std::span effectively, thus achieving the desired performance benefits.
Converting a numpy array to an std::initializer_list
to create a temporary std::span
with both the iteration and the temporary construction done through Python calls is always going to involve memory copies, but it's the calls through Python that will kill performance and as such, I doubt you'll be able to see much of a difference (as in, if using tuples is too slow, then using numpy arrays will be as well).
As said, I'm not seeing the Cppyy::SizeOf("const int")
call occur, so I don't understand how that can solve anything. However, if you want performance, your best bet will be a custom converter for std::span
that understands numpy arrays (and in particular, can access the array's memory if it is flat) and calls the constructor from C++. That's the only way I can see memory copies and calls through Python being avoided.
cppyy does support custom converters, although it's not documented well. Here's an example from the test suite: https://github.com/wlav/cppyy/blob/master/test/test_api.py#L65
invoking the initializer_list converter fails. Below is the reproducible code:
After examining the source code, I found that the issue is due to Cppyy::SizeOf("const int") returning 0 in the followling codes.
Converters.cxx line:3064