devitocodes / devito

DSL and compiler framework for automated finite-differences and stencil computation
http://www.devitoproject.org
MIT License

mpi: MPI cross-rank assignments fail #2217

Open georgebisbas opened 1 year ago

georgebisbas commented 1 year ago

Cross-rank assignments fail, although they are expected to work. Minimal example (mpi_mfe.py):

from devito import Grid, TimeFunction, Eq, Operator

grid = Grid(shape=(4, 4))
u = TimeFunction(name="u", grid=grid, space_order=2)

u.data[0, 1:-1, 1:-1] = 1.

u.data[0, 0, 0] = u.data[0, 0, 2]

To reproduce:

(python-3.10) gb4018@titaros:~/workspace/devito$ DEVITO_LOGGING=DEBUG DEVITO_MPI=1 mpirun -n 4 python mpi_mfe.py
Allocating host memory for u(2, 6, 6) [288 B]
Allocating host memory for u(2, 6, 6) [288 B]
Allocating host memory for u(2, 6, 6) [288 B]
Allocating host memory for u(2, 6, 6) [288 B]
Traceback (most recent call last):
  File "/home/gb4018/workspace/devito/mpi_mfe.py", line 8, in <module>
    u.data[0, 0, 0] = u.data[0, 0, 2]
  File "/home/gb4018/workspace/devito/devito/data/data.py", line 187, in wrapper
    return func(data, *args, **kwargs)
  File "/home/gb4018/workspace/devito/devito/data/data.py", line 400, in __setitem__
    raise ValueError("Cannot insert obj of type `%s` into a Data" % type(val))
ValueError: Cannot insert obj of type `<class 'NoneType'>` into a Data
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[44560,1],0]
  Exit code:    1
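
A note on reading the log, as a rough sketch rather than a statement about Devito internals: every rank allocates u(2, 6, 6), which matches 2 time buffers of a 2x2 local block padded by a halo of 2 on each side, in 4-byte floats. That points to a 2x2 decomposition of the (4, 4) grid, so global points (0, 0) and (0, 2) would sit on different ranks, consistent with the right-hand side evaluating to None on a rank that does not own it. The helpers below are hypothetical (not Devito API), and the row-major rank numbering is an assumption.

```python
def local_alloc_bytes(local_shape=(2, 2), halo=2, time_buffers=2, itemsize=4):
    """Bytes allocated per rank: time buffers times the halo-padded local block."""
    nbytes = time_buffers * itemsize
    for n in local_shape:
        nbytes *= n + 2 * halo
    return nbytes


def owner_rank(global_index, shape=(4, 4), topology=(2, 2)):
    """Rank owning `global_index` under an even block decomposition.

    Assumes row-major rank numbering over the process grid.
    """
    # Block coordinate along each dimension.
    coords = [i // (n // p) for i, n, p in zip(global_index, shape, topology)]
    rank = 0
    for c, p in zip(coords, topology):
        rank = rank * p + c
    return rank


print(local_alloc_bytes())                      # 288, as in the log
print(owner_rank((0, 0)), owner_rank((0, 2)))   # 0 1
```

Under these assumptions the source and destination of the failing assignment are owned by different ranks, which is what makes it a cross-rank assignment.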

Hint: the assignment works when the last scalar index on each side is replaced with a "fake" size-one slice:

u.data[0, 0, 0:1] = u.data[0, 0, 2:3]
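
For anyone needing the workaround with computed indices, here is a minimal sketch of turning an integer index into the equivalent size-one slice. `unit_slice` is a hypothetical helper, not part of Devito, and the example only exercises it on a plain NumPy array; whether it behaves the same on `u.data` under MPI is untested here.

```python
import numpy as np


def unit_slice(i):
    """Return a size-one slice selecting the same element as integer index `i`."""
    # -1 needs special care: slice(-1, 0) would select nothing.
    return slice(i, None) if i == -1 else slice(i, i + 1)


# Plain NumPy stand-in for the data, with distinct values so the copy is visible.
a = np.arange(16).reshape(4, 4)

# Same shape of statement as the workaround: scalar row, size-one slice in the last dimension.
a[0, unit_slice(0)] = a[0, unit_slice(2)]
```

With Devito the corresponding call would presumably read `u.data[0, 0, unit_slice(0)] = u.data[0, 0, unit_slice(2)]`, mirroring the slices used in the workaround.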

georgebisbas commented 1 year ago

Any hints on where this fix should be placed?