bh107 / bohrium

Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
http://www.bh107.org
Apache License 2.0

interop_numpy.get_array wrongly re-uses memory #611

Closed: dionhaefner closed this issue 5 years ago

dionhaefner commented 5 years ago

To communicate Bohrium arrays via MPI, I get a pointer to their data through bh.interop_numpy.get_array. While debugging something else, I found this strange behavior:

Script:

from mpi4py import MPI
import bohrium as np

comm = MPI.COMM_WORLD

print(comm.Get_rank())

a = np.random.rand(100, 100)

global_sum = np.empty(1)
np.flush()

comm.Allreduce(
    [np.interop_numpy.get_array(a.sum()), 1, MPI.DOUBLE],
    [np.interop_numpy.get_array(global_sum), 1, MPI.DOUBLE],
    op=MPI.SUM
)

print(global_sum[0])

Output:

$ mpirun -n 2 python mpi_test.py
0
1
Traceback (most recent call last):
  File "mpi_test.py", line 18, in <module>
    op=MPI.SUM
  File "mpi4py/MPI/Comm.pyx", line 714, in mpi4py.MPI.Comm.Allreduce
mpi4py.MPI.Exception: Invalid buffer pointer, error stack:
MPI_Allreduce(953): MPI_Allreduce(sbuf=0x2b12f3834000, rbuf=0x2b12f3834000, count=1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD) failed
MPI_Allreduce(883): Buffers must not be aliased
Traceback (most recent call last):
  File "mpi_test.py", line 18, in <module>
    op=MPI.SUM
  File "mpi4py/MPI/Comm.pyx", line 714, in mpi4py.MPI.Comm.Allreduce
mpi4py.MPI.Exception: Invalid buffer pointer, error stack:
MPI_Allreduce(953): MPI_Allreduce(sbuf=0x2b2f17834000, rbuf=0x2b2f17834000, count=1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD) failed
MPI_Allreduce(883): Buffers must not be aliased

Notice how the send and receive buffers point to the same memory location?

MPI_Allreduce(sbuf=0x2b12f3834000, rbuf=0x2b12f3834000, count=1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD)

If I change

- global_sum = np.empty(1)
+ global_sum = np.ones(1)

I get the expected result:

$ mpirun -n 2 python mpi_test.py
1
0
10002.20226045458
10002.20226045458
madsbk commented 5 years ago

This is because you need to keep a reference to the Bohrium array returned by a.sum().

Currently, at the line np.interop_numpy.get_array(a.sum()), the Bohrium array goes out of scope and its data is freed by Bohrium. Since global_sum is empty (uninitialized), it is only allocated when you ask for its data at np.interop_numpy.get_array(global_sum). The memory allocation cache then hands it the newly freed data segment, so the send and receive buffers end up aliased.
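
For example, something along these lines (an untested sketch; local_sum is just a name for illustration) keeps the reduced Bohrium array alive for the duration of the Allreduce, so the allocation cache cannot hand its buffer to global_sum:

local_sum = a.sum()   # keep a reference to the reduced Bohrium array
global_sum = np.empty(1)
np.flush()

comm.Allreduce(
    [np.interop_numpy.get_array(local_sum), 1, MPI.DOUBLE],
    [np.interop_numpy.get_array(global_sum), 1, MPI.DOUBLE],
    op=MPI.SUM
)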

I'm thinking of a solution where we attach a reference to the Bohrium array onto the NumPy array returned by np.interop_numpy.get_array().
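
A rough sketch of what that could look like, assuming a thin ndarray subclass on the Bohrium side (the names _BohriumBackedArray and get_array_with_owner are illustrative, not part of the current API):

import numpy
import bohrium as bh

class _BohriumBackedArray(numpy.ndarray):
    """A NumPy view that keeps the owning Bohrium array alive."""
    pass

def get_array_with_owner(bh_ary):
    # Wrap the view returned by the existing interop_numpy.get_array()
    # and attach the Bohrium array to it, so Bohrium cannot free (and
    # the allocation cache cannot recycle) the data segment while the
    # NumPy view is still reachable.
    np_view = bh.interop_numpy.get_array(bh_ary).view(_BohriumBackedArray)
    np_view._bh_owner = bh_ary
    return np_view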

dionhaefner commented 5 years ago

I see. So this is only an issue if the bh array goes out of scope during the call to get_array, which is probably fine (but should be documented).