pybind / pybind11

Seamless operability between C++11 and Python
https://pybind11.readthedocs.io/

Not clear how to expose existing C++ vector as numpy array #1042

Open yesint opened 7 years ago

yesint commented 7 years ago

This is a question of documentation rather than an issue. I can't find any example of the following very common scenario:

std::vector<int> some_func();
...
// We want to expose returned std::vector as a numpy array without copying
m.def("some_func", []() -> py::array {
   auto data = some_func();
   // What to do with data?? Map it with Eigen (then return what?), wrap somehow with py::buffer (how?)
});

I don't know the answer. It would be very nice to have this explained in the docs, since this scenario is rather common.

YannickJadoul commented 7 years ago

py::array will automatically copy your data if you don't give it a base argument in the constructor (though maybe that's indeed not very well documented).

If you don't want to copy, one solution is to move the std::vector into a py::capsule and use that capsule as the base for a new py::array, constructing the array from the moved vector's data() pointer. If I'm not mistaken, the returned py::array will then keep that capsule alive, and the vector gets deleted once the capsule is garbage collected.

YannickJadoul commented 7 years ago

Untested code, but this should be the implementation of the non-copy approach:

// Allocate the vector on the heap so its lifetime is no longer tied to this scope
auto v = new std::vector<int>(some_func());
// The capsule owns the vector and deletes it when the capsule is garbage collected
auto capsule = py::capsule(v, [](void *v) { delete reinterpret_cast<std::vector<int>*>(v); });
// The array borrows v's data; passing the capsule as base keeps it alive
return py::array(v->size(), v->data(), capsule);

Yes, probably more black magic than you might expect. But then again, you're not doing something simple either. You are keeping a C++ object alive to make sure you can access its internal data safely, without leaking the memory.

But if you don't mind the copy, just go:

auto v = some_func();
return py::array(v.size(), v.data());
yesint commented 7 years ago

@YannickJadoul, thank you very much, it really works! I don't mind doing black magic (and the magic is in fact quite logical), but currently the user is not even aware that this kind of magic exists. Are there any plans to document the usage of py::array and py::capsule? The constructors of these types are non-trivial and the usage of the base argument is, well, a bit arcane.

yesint commented 7 years ago

Another suggestion: it probably makes sense to provide an easy non-copying conversion from any contiguous buffer to py::array. Something like:

auto v = new std::vector<int>(some_func());
py::array arr = array_from_buffer<int>(v, ndim, shape, strides);

which would create the corresponding py::buffer_info and capsule internally. It could be a great addition for cases where numerical data has to be returned, especially if one needs to wrap a function like:

void some_func(vector<int>& val1, vector<vector<float>>& val2);

Manually wrapping each argument with py::buffer and py::capsule into a py::array becomes tedious in such cases.
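For illustration, here is a rough sketch of what such a hypothetical helper could look like (the name array_from_vector and its signature are invented here; nothing like it exists in pybind11):

// Hypothetical helper: wrap a heap-allocated vector as an n-dimensional py::array
// without copying, tying the vector's lifetime to a capsule used as the array's base.
template <typename T>
py::array array_from_vector(std::vector<T>* v,
                            std::vector<py::ssize_t> shape,
                            std::vector<py::ssize_t> strides) {
    auto capsule = py::capsule(v, [](void* p) { delete reinterpret_cast<std::vector<T>*>(p); });
    return py::array(py::dtype::of<T>(), std::move(shape), std::move(strides), v->data(), capsule);
}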

YannickJadoul commented 7 years ago

but currently the user is not even aware that this kind of magic exists.

Agreed, I had to look into the actual headers to check the exact constructors, etc. But I don't know about planned documentation updates. If you feel like it, I'm sure a PR with more documentation on this would be gladly accepted ;-) Then again, I'm not always sure what's stable API and what are implementation details.

YannickJadoul commented 7 years ago

Probably it makes sense to provide an easy non-copying conversion from any contiguous buffer to py::array?

Not sure how easy that is to do (and how much more confusing this will make the whole situation). Maybe some kind of a static function as 'named constructor' could make sense, though?

By the way, std::vector<std::vector<int>> is not a contiguous structure. And I don't think this technique works when (un)wrapping the arguments of a function. What I just described was a way of not copying a return std::vector.

yesint commented 7 years ago

Sure, it won't work with "input" function parameters, but it works for "output" parameters when one transforms the C++ signature into a Python function returning a tuple of numpy arrays instead of a bunch of ref parameters (that's exactly my case). In any case, such a thing should not be automatic; the user has to make it explicit in the lambda.

A vector of vectors is indeed not contiguous, sorry. But a vector of fixed-size elements, for example, is contiguous and could be returned efficiently as a 2D array. Such funny structures are common when dealing with a variable number of space points, when Eigen::MatrixXf is not usable due to the unknown dimension.
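For illustration, a rough sketch of that "output parameters to tuple of arrays" pattern (assuming, unlike the nested vector above, a second output that is a flat, contiguous std::vector<float>):

// Hypothetical binding: turn two "output" reference parameters into a Python tuple
// of numpy arrays, moving each result into a capsule-owned heap vector (zero-copy).
void some_func(std::vector<int>& val1, std::vector<float>& val2);  // assumed signature

m.def("some_func", []() {
    std::vector<int> val1;
    std::vector<float> val2;
    some_func(val1, val2);
    auto* p1 = new std::vector<int>(std::move(val1));
    auto* p2 = new std::vector<float>(std::move(val2));
    py::capsule c1(p1, [](void* p) { delete reinterpret_cast<std::vector<int>*>(p); });
    py::capsule c2(p2, [](void* p) { delete reinterpret_cast<std::vector<float>*>(p); });
    return py::make_tuple(py::array(p1->size(), p1->data(), c1),
                          py::array(p2->size(), p2->data(), c2));
});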

xkunglu commented 5 years ago

@YannickJadoul your code works, thanks for the reference, I just wanted to point out that you are missing a parenthesis at the end of the line auto cap = py::capsule(v, [](void *v) { delete reinterpret_cast<std::vector<int>*>(v); });

arquolo commented 5 years ago

By the way, does anybody know how to get a py::array_t from a std::shared_ptr<std::vector<T>> without a copy (and without new/delete)? I tried this:

std::shared_ptr<std::vector<float>> ptr = get_data();
return py::array_t<float>{
    ptr->size(),
    ptr->data(),
    py::capsule(ptr.get(), [](void* p){ reinterpret_cast<decltype(ptr)*>(p)->reset(); }),
};

Obviously, this will never work, because once the function returns, ptr is destroyed along with the stack frame. Using a lambda capture also does not help, because py::capsule can't accept capturing lambdas:

std::shared_ptr<std::vector<float>> ptr = get_data();
return py::array_t<float>{
    ptr->size(),
    ptr->data(),
    py::capsule([ptr](){ }),  // using lambda-capture to increase lifetime of ptr
};

This solution worked (though it seems very dirty):

std::shared_ptr<std::vector<float>> ptr = get_data();
return py::array_t<float>{
    ptr->size(),
    ptr->data(),
    py::capsule(
        new auto(ptr),  // <- can leak
        [](void* p){ delete reinterpret_cast<decltype(ptr)*>(p); }
    )
};
YannickJadoul commented 5 years ago

@arquolo Indeed, the only data that can be stored in a py::capsule is a single void * and a simple function pointer (this is a Python C API thing, by the way; pybind11 just made a C++ wrapper around it). So if you want the capsule to be a (co-)owner of the shared_ptr, I would think that the last solution is the only one that works and stores the actual shared_ptr object.

Is it that dirty, though? In the end, a capsule taking a std::function (or any kind of lambda/functor object) would incur this same allocation (inside of the std::function) because of the variable size of the capture.

The one thing to note, though, is that the base object doesn't need to be a capsule. It can just as well be any other object (though hopefully one that keeps the data alive), so if your shared_ptr were stored as a member in a C++ class that is exposed to Python, you could also use that py::object as the base.
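For illustration, a minimal sketch of that idea (the class name Holder and the method get_view are invented here; the point is that the Python wrapper of the owning object is passed as the array's base):

// Assumed setup: a C++ class owning the shared_ptr, exposed to Python.
struct Holder {
    std::shared_ptr<std::vector<float>> data;
};

py::class_<Holder, std::shared_ptr<Holder>>(m, "Holder")
    .def("get_view", [](py::object self) {
        auto& h = self.cast<Holder&>();
        // The array borrows the buffer; passing `self` as base keeps the Holder (and its data) alive.
        return py::array_t<float>(h.data->size(), h.data->data(), self);
    });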

ferdonline commented 5 years ago

We define the following utility functions, which have proven to be life savers :)

template <typename Sequence>
inline py::array_t<typename Sequence::value_type> as_pyarray(Sequence&& seq) {
    // Move entire object to heap (Ensure is moveable!). Memory handled via Python capsule
    Sequence* seq_ptr = new Sequence(std::move(seq));
    auto capsule = py::capsule(seq_ptr, [](void* p) { delete reinterpret_cast<Sequence*>(p); });
    return py::array(seq_ptr->size(),  // shape of array
                     seq_ptr->data(),  // c-style contiguous strides for Sequence
                     capsule           // numpy array references this parent
    );
}

and the copy version

template <typename Sequence>
inline py::array_t<typename Sequence::value_type> to_pyarray(const Sequence& seq) {
    return py::array(seq.size(), seq.data());
}
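For example, the binding from the original question could then be written as a one-liner (a sketch, assuming the as_pyarray helper above):

m.def("some_func", []() { return as_pyarray(some_func()); });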
arquolo commented 5 years ago

Thanks @ferdonline! However, the move-helper needs to change its signature to:

template <typename Sequence,
          typename = std::enable_if_t<std::is_rvalue_reference_v<Sequence&&>>>
inline py::array_t<typename Sequence::value_type> as_pyarray(Sequence&& seq)

With such a fix, the compiler will complain if you call it without std::move.

ferdonline commented 4 years ago

With such a fix, the compiler will complain if you call it without std::move.

@arquolo If you call it without std::move, it will bind as an lvalue reference and the std::move inside happens anyway. IMHO that's fine behavior.

YannickJadoul commented 4 years ago

@arquolo If you call it without std::move, it will bind as an lvalue reference and the std::move inside happens anyway. IMHO that's fine behavior.

But then you will destroy the original container. That's quite unexpected if you didn't pass the container as an rvalue. Isn't the standard solution to use std::forward<Sequence>(seq)? In that case you'll copy if you pass an lvalue, and you'll move if you pass an rvalue (as in the sketch below).
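A rough sketch of that std::forward variant (the name as_pyarray_forward is invented here, and it is untested in the same spirit as the snippets above):

// Copies lvalues, moves rvalues: the caller's container is only consumed if they std::move it in.
template <typename Sequence>
py::array_t<typename Sequence::value_type> as_pyarray_forward(Sequence&& seq) {
    using S = std::remove_reference_t<Sequence>;
    auto* seq_ptr = new S(std::forward<Sequence>(seq));  // copy or move, depending on the value category
    auto capsule = py::capsule(seq_ptr, [](void* p) { delete reinterpret_cast<S*>(p); });
    return py::array(seq_ptr->size(), seq_ptr->data(), capsule);
}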

ferdonline commented 4 years ago

The function is called as_pyarray and the "docs" say it will move, so I think it's fine, but you choose. It's standard to use std::forward when you want to pass on the same reference type. Here we don't care; we just want to turn whatever reference type we get into an rvalue reference.

LeslieGerman commented 4 years ago

By the way, does anybody know how to get a py::array_t from a std::shared_ptr<std::vector<T>> without a copy (and without new/delete)?

@arquolo , you might be interested in what I have found: https://github.com/pybind/pybind11/issues/323#issuecomment-575717041

YannickJadoul commented 4 years ago

If anyone's interested in a version of @ferdonline's utility function without explicit/manual new and delete:

template <typename Sequence>
inline py::array_t<typename Sequence::value_type> as_pyarray(Sequence &&seq) {
    auto size = seq.size();
    auto data = seq.data();
    std::unique_ptr<Sequence> seq_ptr = std::make_unique<Sequence>(std::move(seq));
    auto capsule = py::capsule(seq_ptr.get(), [](void *p) { std::unique_ptr<Sequence>(reinterpret_cast<Sequence*>(p)); });
    seq_ptr.release();
    return py::array(size, data, capsule);
}

Apart from avoiding new and delete, this also does not leak if for some reason py::capsule would throw.

sharpe5 commented 4 years ago

@YannickJadoul

template <typename Sequence>
inline py::array_t<typename Sequence::value_type> as_pyarray(Sequence &&seq) {
    auto size = seq.size();
    auto data = seq.data();
    std::unique_ptr<Sequence> seq_ptr = std::make_unique<Sequence>(std::move(seq));
    auto capsule = py::capsule(seq_ptr.get(), [](void *p) { std::unique_ptr<Sequence>(reinterpret_cast<Sequence*>(p)); });
    seq_ptr.release();
    return py::array(size, data, capsule);
}

Apart from avoiding new and delete, this also does not leak if for some reason py::capsule would throw.

I'm not sure this would work?

The memory would be freed early as there is nothing left to hold onto the heap allocation after the unique_ptr goes out of scope.

Then another heap allocation could grab the same memory, and new writes could corrupt what is already there (i.e. the numpy buffer we just returned). See https://www.cplusplus.com/reference/memory/unique_ptr/get/.

sharpe5 commented 4 years ago

@YannickJadoul This is what I am using:

/**
 * \brief Returns py::array_t<T> from std::vector<T>. Efficient as zero-copy.
 * - Uses std::move to obtain ownership of said vector and transfer everything to the heap.
 * - Only accepts the parameter via std::move(...), or else the vector metadata on the stack will go out of scope (the heap data will always be fine).
 * \tparam T Type.
 * \param passthrough std::vector<T> whose contents are moved into the returned array.
 * \return py::array_t<T> with a clean and safe reference to the moved vector's contents.
 */
template<typename T>
inline py::array_t<T> toPyArray(std::vector<T>&& passthrough)
{
    // Pass result back to Python.
    // Ref: https://stackoverflow.com/questions/54876346/pybind11-and-stdvector-how-to-free-data-using-capsules
    auto* transferToHeapGetRawPtr = new std::vector<T>(std::move(passthrough));
    // At this point, transferToHeapGetRawPtr is a raw pointer to an object on the heap. No unique_ptr or shared_ptr, it will have to be freed with delete to avoid a memory leak.

    // Alternate implementation: use a shared_ptr or unique_ptr, but this appears to be more difficult to reason about as a raw pointer (void *) is involved - how does C++ know which destructor to call?

    const py::capsule freeWhenDone(transferToHeapGetRawPtr, [](void *toFree) {              
        delete static_cast<std::vector<T> *>(toFree);
        //fmt::print("Free memory."); // Within Python, clear memory to check free: sys.modules[__name__].__dict__.clear()
    });

    auto passthroughNumpy = py::array_t<T>(/*shape=*/{transferToHeapGetRawPtr->size()}, /*strides=*/{sizeof(T)}, /*ptr=*/transferToHeapGetRawPtr->data(), freeWhenDone);
    return passthroughNumpy;    
}
YannickJadoul commented 4 years ago

@sharpe5

The memory would be freed early as there is nothing left to hold onto the heap allocation after the unique_ptr goes out of scope.

That's why you call seq_ptr.release(), to release ownership of the pointer, right? (but only after you're certain the creation of the py::capsule worked) See https://en.cppreference.com/w/cpp/memory/unique_ptr/release

@YannickJadoul This is what I am using:

This seems quite similar (or the same?) to @ferdonline's utility function. As far as I can see, it will still leak memory when py::capsule throws, because there's nothing holding on to that raw pointer? But yes, it probably won't, and if it throws, something else is probably wrong, so it's fine enough to use. Also, it uses raw new/delete, which is what I tried and managed to avoid with my fragment.

sharpe5 commented 4 years ago

@YannickJadoul You are right, your code is absolutely correct.

I can't help but think that the content of the capsule function is just a very complicated way of calling delete. I greatly prefer modern C++ and smart pointers, but if there is (void *) in the middle it becomes more difficult to reason about the data flow (for me at least!). Either smart pointers up and down the entire stack, or not at all? It is tricky to choose the right level of abstraction, and sometimes if one abstracts too much the intent gets obscured.

I did not see @ferdonline's utility function initially (see above), the one I quoted was written from first principles. It's somewhat interesting that they are virtually identical :)

YannickJadoul commented 4 years ago

I can't help but think that the content of the capsule function is just a very complicated way of calling delete.

Yes, it definitely is, but it does have the advantage of covering the corner case of exceptions in py::capsule's constructor and applying the good practice of avoiding new and delete. I don't think it's that much more complicated, so I just threw that addition out there, in case people want to use it. But do of course use whatever is most comfortable to you.

bstaletic commented 4 years ago

This issue has been resolved. @YannickJadoul has done a great job answering questions here. Further questions are better suited for Gitter.

YannickJadoul commented 4 years ago

I'm thinking maybe we can/should add a convenience function for this to pybind11, since it seems to be such a popular issue. I'll reopen to remind ourselves.

virtuald commented 4 years ago

This seems to be a good place to use a memoryview for holding onto the buffer instead of a capsule? #2307 is useful for invalidating the buffer once it has been released.

virtuald commented 4 years ago

Actually, I think I misunderstood the problem, never mind. A memoryview might be useful in some of these cases however.

sharpe5 commented 4 years ago

For the record, I have a large Python module that has zero-copy communication between Python and C++ when working with columns in a DataFrame. It is zero-copy both ways, i.e. Python >> C++ and C++ >> Python.

It is blazingly fast.

I usually combine it with OpenMP or TBB to do multi-threaded calculations on the column data.

It is all in pybind11 and Modern C++ (except for one raw pointer reference which is wrapped in a function; see above). It's easily testable: when the function is called from C++ it accepts a templated vector, and when it is called from Python it accepts a templated span.

The zero-copy C++ >> Python adapter is in my post above.

This is the zero-copy Python >> C++ adapter:

/**
 * \brief Returns std::span<T> from py::array_t<T>. Efficient as zero-copy.
 * \tparam T Type.
 * \param passthrough Numpy array.
 * \return std::span<T> with a clean and safe reference to the contents of the Numpy array.
 */
template<class T=float32_t>
inline std::span<T> toSpan(const py::array_t<T>& passthrough)
{
    py::buffer_info passthroughBuf = passthrough.request();
    if (passthroughBuf.ndim != 1) {
        throw std::runtime_error("Error. Number of dimensions must be one");
    }
    size_t length = passthroughBuf.shape[0];
    T* passthroughPtr = static_cast<T*>(passthroughBuf.ptr);
    std::span<T> passthroughSpan(passthroughPtr, length);
    return passthroughSpan;
}
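For illustration, a minimal sketch of the dual vector/span pattern described above (the names scale and example are invented here, and the sketch assumes the toSpan helper just shown):

// The core routine is templated on the container, so C++ tests can pass a std::vector
// while the Python binding passes the zero-copy span over the NumPy buffer.
template <typename Container>
void scale(Container& values, float factor) {
    for (auto& v : values) v *= factor;
}

PYBIND11_MODULE(example, m) {
    m.def("scale", [](py::array_t<float> arr, float factor) {
        auto view = toSpan<float>(arr);
        scale(view, factor);  // modifies the NumPy array in place, no copy
    });
}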
ghost commented 4 years ago

Hi, I would like to check whether the cleanup function is really called, so I wrote the following code.

auto v = new std::vector<int>(some_func());
auto capsule = py::capsule(v, [](void *v) { 
    py::scoped_ostream_redirect output;
    std::cout << "deleting int vector\n";
    delete reinterpret_cast<std::vector<int>*>(v); 
});
return py::array(v->size(), v->data(), capsule);

However, "deleting int vector" is not printed out when I run a python script. I even add the following python code at the end of the python script, but there was no use.

import gc
gc.collect(2)
gc.collect(1)
gc.collect(0)

Could you help me figure out how to get the cleanup function called?

Thank you

sharpe5 commented 4 years ago

@tlsdmstn56-2 You need to delete the variable returned by the pybind11 module on the Python side, or else the memory will not be freed. py::array returns a zero-copy reference to the data, so the memory will be held on the C++ side until it is no longer needed on the Python side.

del my_variable
cchriste commented 3 years ago

@sharpe5

[quotes @sharpe5's earlier comment with the zero-copy Python >> C++ toSpan adapter]

This is great for sharing the raw data, but how does it handle ownership? It looks like the short answer is that it doesn't, but maybe I'm missing something. Thanks!

sharpe5 commented 3 years ago

@cchriste mentioned:

This is great for sharing the raw data, but how does it handle ownership? It looks like the short answer is that it doesn't, but maybe I'm missing something. Thanks!

Short answer: it doesn't, but that's fine as the parent Python function caller holds ownership for the duration of the call.

Remember, this is the "zero-copy Python >> C++ adapter", so Python creates the Numpy array, C++ modifies the array contents, then returns.

Here is an example scenario:

1. Python creates a Numpy array; it is the owner.
2. Python calls a method written in C++/pybind11.
3. The C++ uses the toSpan method above to obtain a reference to this array.
4. The C++ can then safely edit the contents of this array.
5. The C++ returns. The Numpy array is now modified, without the overhead of copying the array's contents back and forth from Python to C++ to Python.

This is really useful when modifying columns in a DataFrame.

It would be possible to break this if we really wanted to. The C++ side could create another thread, and that thread could start modifying the array behind Python's back, even after the original function call had returned and the Python side had deallocated it. But we assume that once the C++ function returns it does not touch that array again.


cchriste commented 3 years ago

@cchriste mentioned: This is great for sharing the raw data, but how does it handle ownership? [...] Short answer: it doesn't, but that's fine as the parent Python function caller holds ownership for the duration of the call. [...]

I appreciate the quick reply, and agree this is very useful. For our use case, we do in fact want to take ownership of the data.

Going from C++ to Python seems safe: memory buffers are tagged with an ownership flag and, after the last reference to that memory is removed, won't be freed unless owned. Thanks for your other example demonstrating a trick to claim ownership when creating arrays; pybind11 should simply provide a more straightforward argument for this.

The other way around does not seem as straightforward. Even if some clever combination of PyObject_GetBuffer/PyBuffer_Release could be used to ensure Python doesn't delete memory out from under C++, if it's deleted by C++ then any existing Python objects would suddenly be pointing to deallocated space. Maybe the desired goal can be achieved if ownership transfer is done using a move (a py::array& can be passed to C++, so it's possible to modify the object directly), and only if the reference count is exactly one.

sharpe5 commented 3 years ago

@cchriste For Python to C++, I imagine that if the C++ wanted to take ownership of the data, the easiest and safest way would be to make a copy. I imagine that's the only way to prevent Python garbage collecting that data once del variable is executed on the Python side. Get it working first, then optimise it later.

You also mentioned:

if it's deleted by C++ then any existing Python objects will suddenly be pointing to deallocated space

... but the method above exposes the Numpy array as a span which is read-only as far as memory allocation/deallocation goes, and can be range checked, which goes a long way towards making any subsequent C++ code more robust. The span container is actually quite nice like that, see comments on StackOverflow. I'd also recommend putting some comments in the code as insurance against other developers making changes without a clear understanding of the limitations.

roastduck commented 2 years ago

This seems to be a good place to use a memoryview for holding onto the buffer instead of a capsule? #2307 is useful for invalidating the buffer once it has been released.

Actually, I think I misunderstood the problem, never mind. A memoryview might be useful in some of these cases however.

@virtuald I am also encountering this problem. As far as I understand, returning a memoryview means "lending" my memory to a memoryview, while returning an array with a capsule as described in this thread means "moving" my memory into an array. I would prefer lending (or borrowing), because there is less black magic. I can keep my owner object alive using keep_alive, which is equivalent to "moving", if the owner object is also exposed to pybind11.

However, a memoryview is not a NumPy object. It does not support NumPy's arithmetic operations. Can I lend my memory to an array instead of a memoryview? ~I found that some of the array's constructors support a borrowed or stolen parameter, but I did not find any documentation.~

I have figured it out. I can "lend" my data to an array by passing it a capsule with an empty destructor.
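For illustration, a minimal sketch of that lending approach (the function name lend_to_numpy is invented here; the C++ owner simply has to outlive every array borrowed from it):

// "Lend" an existing buffer to NumPy: the capsule's destructor is a no-op, so Python
// never frees the memory and ownership stays on the C++ side.
py::array_t<int> lend_to_numpy(std::vector<int>& owner) {
    auto do_nothing = py::capsule(owner.data(), [](void*) { /* C++ retains ownership */ });
    return py::array_t<int>(owner.size(), owner.data(), do_nothing);
}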

ghost commented 1 year ago

Not necessarily a pybind11 solution, but you could allocate the std::vector on the heap with new; that way it won't get freed until you call delete. Given that, it should be safe to use the .data() pointer as the pointer for the NumPy array.

wuxian08 commented 2 months ago

[quotes @sharpe5's earlier comment with the zero-copy Python >> C++ toSpan adapter]

Thanks for sharing the code. One thing to notice is that if T is a struct and not packed (i.e., std::is_class_v<T> && alignof(T) > 1), this might lead to a core dump on some machines. The reason is that when T is registered as a numpy dtype, the alignment requirement is lost from the dtype. One can check this with the assertion assert(py::dtype::of<T>().attr("alignment").cast<int>() == 1);.

In this case, the alignment of the input buffer passthroughBuf.ptr would be 1, which violates the alignment requirement of T and triggers errors on some platforms.
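For illustration, a hedged sketch of such a check written as a small helper (assuming T has already been registered as a numpy dtype, e.g. via PYBIND11_NUMPY_DTYPE):

// Fail early if the registered dtype reports a weaker alignment than C++ expects,
// instead of reinterpreting a possibly misaligned buffer later.
template <typename T>
void check_dtype_alignment() {
    py::dtype dt = py::dtype::of<T>();
    auto numpy_alignment = dt.attr("alignment").cast<std::size_t>();
    if (numpy_alignment < alignof(T)) {
        throw std::runtime_error("numpy dtype for T is packed; alignof(T) may be violated");
    }
}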

PierreMarchand20 commented 1 month ago

If anyone's interested in a version of @ferdonline's utility function without explicit/manual new and delete:

template <typename Sequence>
inline py::array_t<typename Sequence::value_type> as_pyarray(Sequence &&seq) {
    auto size = seq.size();
    auto data = seq.data();
    std::unique_ptr<Sequence> seq_ptr = std::make_unique<Sequence>(std::move(seq));
    auto capsule = py::capsule(seq_ptr.get(), [](void *p) { std::unique_ptr<Sequence>(reinterpret_cast<Sequence*>(p)); });
    seq_ptr.release();
    return py::array(size, data, capsule);
}

Apart from avoiding new and delete, this also does not leak if for some reason py::capsule would throw.

I was using this version for a while in a library, but recently I noticed it did not work anymore. It must be something related to the compiler, because I did not change the pybind11 version I was using (its commit is pinned as a git submodule in my library). But the version from @sharpe5 works. The main difference seems to come from the constructor of py::array, so a fix for @YannickJadoul's version seems to be:

template <typename Sequence>
inline pybind11::array_t<typename Sequence::value_type> as_pyarray(Sequence &&seq) {
    auto size                         = seq.size();
    auto data                         = seq.data();
    std::unique_ptr<Sequence> seq_ptr = std::make_unique<Sequence>(std::move(seq));
    auto capsule = pybind11::capsule(seq_ptr.get(), [](void *p) { std::unique_ptr<Sequence>(reinterpret_cast<Sequence *>(p)); });
    seq_ptr.release();
    return pybind11::array({size}, {sizeof(typename Sequence::value_type)}, data, capsule);
}