ariovistus / pyd

Interoperability between Python and D
MIT License

Q: when the result of d_to_python_numpy_ndarray is passed to Python, is the array also copied by value? i.e. does Python receive a copy? #165

Open mw66 opened 1 year ago

mw66 commented 1 year ago

According to the doc:

Numpy arrays implement the [buffer protocol](https://docs.python.org/3/c-api/buffer.html), which PyD can efficiently convert to D arrays.

To convert a D array to a numpy array, use pyd.extra.d_to_python_numpy_ndarray.

My question is: when the result of d_to_python_numpy_ndarray is passed to Python, is the array copied by value, i.e. does Python receive a copy?

If yes, is there any way to avoid the copy?

Thanks.

ariovistus commented 1 year ago

It has been quite a while since I have touched that code, but my knee-jerk answer is that it copies by reference, and a quick close read supports that: it copies pointers in a rather unreadable way to support arbitrary-dimension numpy arrays.

mw66 commented 1 year ago

My intention is this: since D code and Python code run in the same memory space, is it possible for them to directly share the same raw pointer to the beginning of an array (let's consider only a 1D array of float type) without needing to copy the array contents?

ariovistus commented 1 year ago

yes

mw66 commented 1 year ago

So, how? Can you show a small example 😀?

I tried passing a D array using d_to_python_numpy_ndarray and changed its contents on the Python side, but when I print the array on the D side, it is unchanged.

ariovistus commented 1 year ago

uh oh, maybe I'm a liar.. let me try

ariovistus commented 1 year ago

ya, it's doing a value copy. dang. So I'm guessing the reason it is that way is that numpy doesn't have a way to be fed a pointer (or I couldn't find one). If you're OK with numpy doing the memory allocations, it is possible for D code to get numpy's pointers:

    import std.stdio;
    import pyd.embedded; // InterpContext
    import pyd.pyd;      // PydObject

    // Let numpy own the allocation, then view its memory from D.
    auto context = new InterpContext();
    context.py_stmts("
        import numpy
        np_array = numpy.ones((10,), dtype='int32')
        print(np_array)
    ");
    PydObject pyd_array = context.np_array;
    // Address of element 0, obtained through the buffer protocol.
    void* raw_ptr = pyd_array.buffer_view().item_ptr([0]);
    int* d_ptr = cast(int*) raw_ptr;
    int[] d_array = d_ptr[0 .. 10]; // D slice over numpy's memory; no copy
    writeln("d array: ", d_array);
    d_array[5] = 400;               // mutate through the D slice
    writeln("d array: ", d_array);

    // The change is visible from Python:
    context.py_stmts("
        print(np_array)
    ");
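The zero-copy sharing the D snippet relies on can also be seen purely on the Python side: a `memoryview` is a buffer-protocol view into the numpy allocation, playing the same role as the D slice over `item_ptr` (a minimal sketch, numpy only):

```python
import numpy as np

# numpy owns the allocation; a memoryview is a zero-copy window into it,
# analogous to the D slice d_ptr[0 .. 10] above.
np_array = np.ones((10,), dtype="int32")
view = memoryview(np_array)  # buffer protocol: no data is copied

view[5] = 400                # write through the view
print(np_array[5])           # the original array sees the change: 400
```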
mw66 commented 1 year ago

Thanks for the example. Yes, this way the raw pointer can be shared by both sides.

mw66 commented 1 year ago

I'm not sure whether this is related, but I'm experiencing a problem: a Python function call hangs somewhere in a multi-threaded program:

    (gdb) where
    #0  0x00007ffff3e08fb9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7fffba7ec910, expected=0, futex_word=0x7ffff57d3da8 <_PyRuntime+424>) at ../sysdeps/unix/sysv/linux/futex-internal.h:142
    #1  __pthread_cond_wait_common (abstime=0x7fffba7ec9f0, mutex=0x7ffff57d3db0 <_PyRuntime+432>, cond=0x7ffff57d3d80 <_PyRuntime+384>) at pthread_cond_wait.c:533
    #2  __pthread_cond_timedwait (cond=0x7ffff57d3d80 <_PyRuntime+384>, mutex=0x7ffff57d3db0 <_PyRuntime+432>, abstime=0x7fffba7ec9f0) at pthread_cond_wait.c:667
    #3  0x00007ffff556deed in PyCOND_TIMEDWAIT (us=<optimized out>, mut=<optimized out>, cond=0x7ffff57d3d80 <_PyRuntime+384>) at /tmp/build/80754af9/python-split_1607696593712/work/Python/condvar.h:73
    #4  take_gil () at /tmp/build/80754af9/python-split_1607696593712/work/Python/ceval_gil.h:247
    #5  0x00007ffff556e042 in PyEval_RestoreThread () at /tmp/build/80754af9/python-split_1607696593712/work/Python/ceval.c:467
    #6  0x00007ffff5644421 in PyGILState_Ensure () at /tmp/build/80754af9/python-split_1607696593712/work/Python/pystate.c:1378
    #7  0x00007fff63cb139d in tensorflow::(anonymous namespace)::StackTraceWrapper::~StackTraceWrapper() ()

This looks like a Python GIL locking issue.

So before I check other things, I want to ask the following question:

does this approach work in a multi-threaded situation: https://github.com/ariovistus/pyd/issues/165#issuecomment-1309807272

Specifically: suppose that in D's main thread, at the init() stage, all these d_arrays are allocated and some data is written into them; later on, one of the worker threads calls a Python func (via pyd) which accesses the previously allocated np_array memory on the Python side. Will this cause a Python GIL locking issue?

If yes, is there any pyd function call to release the lock (on d_array, pyd_array, ... all the way up to context.np_array) in the main thread, and let the worker thread go through?

I found these two functions, but did not see how they are used:

    $ grep -Iir release .dub/packages/pyd-0.14.3/ | grep -i lock
    .dub/packages/pyd-0.14.3/pyd/infrastructure/deimos/python/ceval.d:void PyEval_ReleaseLock();
    .dub/packages/pyd-0.14.3/pyd/infrastructure/deimos/python/pythread.d:void PyThread_release_lock(PyThread_type_lock);
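For comparison, threads that Python itself creates acquire the GIL automatically before running any Python bytecode, so sharing np_array among Python-created threads needs no extra locking; the backtrace above instead shows a thread blocked in take_gil inside PyGILState_Ensure, which is the call a foreign (non-Python-created) thread must make before touching any Python object. A minimal Python-only sketch of the safe case (illustrative only; it does not involve pyd):

```python
import threading
import numpy as np

# A numpy array shared by several Python-created threads. Each such thread
# holds the GIL whenever it executes Python bytecode, so these writes to
# distinct slots are safe without additional locking.
shared = np.zeros((10,), dtype="int32")

def worker(i):
    shared[i] = i * 10  # each thread writes only its own slot

threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared.tolist())  # every slot written exactly once
```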

Thanks.

mw66 commented 1 year ago

Just an update: playing with the Python GIL is no fun. I think the best practice for using PyD / Python is to keep it running in a single D thread, which makes things easier. Even with this setup, I sometimes run into problems with libraries that Python loads dynamically.