devitocodes / devito

DSL and compiler framework for automated finite-differences and stencil computation
http://www.devitoproject.org
MIT License

TimeFunctions with time_order=0 should raise an exception (or warning) if used in parallel loops #981

Open mikeando opened 5 years ago

mikeando commented 5 years ago

I'm running the following example code:

from mpi4py import MPI
from devito import Grid, TimeFunction, Eq, Operator, configuration

configuration['compiler'] = 'intel'
configuration['platform'] = 'intel64'

configuration['mpi'] = 'full'  # 'full' MPI mode: computation/communication overlap
# configuration['mpi'] = True

configuration['openmp'] = False

grid = Grid(shape=(4, 4))
u = TimeFunction(name="u", grid=grid, space_order=2, time_order=0)  # time_order=0: a single time buffer

u.data[0, 1:-1, 1:-1] = 1

op = Operator(Eq(u.forward, u.dx + 1))
summary = op.apply(time_M=0)

print(repr(u.data_with_halo))

I run this with 4 processes on my local machine:

OMPI_MPICC=${ICCBINDIR}/icc OMPI_MPICXX=${ICCBINDIR}/icpc mpiexec -n 4 python mpi_test.py

When run with configuration['mpi'] = True, I get the correct result:

Data([[[ 0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  1.  ,  2.5 ],
       [ 0.  ,  0.  , -0.5 , -1.25]]], dtype=float32)
Data([[[ 0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ],
       [ 2.5 ,  1.  ,  0.  ,  0.  ],
       [-1.25, -0.5 ,  0.  ,  0.  ]]], dtype=float32)
Data([[[ 0.  ,  0.  ,  1.  , -0.5 ],
       [ 0.  ,  0.  , -0.5 ,  1.75],
       [ 0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ]]], dtype=float32)
Data([[[-0.5 ,  1.  ,  0.  ,  0.  ],
       [ 1.75, -0.5 ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ]]], dtype=float32)

But if I run with configuration['mpi'] = 'full', I get rubbish:

Data([[[  0.       ,   0.       ,   0.       ,   0.       ],
       [  0.       ,   0.       ,   0.       ,   0.       ],
       [  0.       ,   0.       ,  -1.859375 , -10.3671875],
       [  0.       ,   0.       ,   3.7890625,  18.050781 ]]],
     dtype=float32)
Data([[[  0.       ,   0.       ,   0.       ,   0.       ],
       [  0.       ,   0.       ,   0.       ,   0.       ],
       [-10.3671875,  -1.859375 ,   0.       ,   0.       ],
       [ 18.050781 ,   3.7890625,   0.       ,   0.       ]]],
     dtype=float32)
Data([[[  0.       ,   0.       ,  -1.859375 ,   9.5078125],
       [  0.       ,   0.       ,   3.7890625, -13.261719 ],
       [  0.       ,   0.       ,   0.       ,   0.       ],
       [  0.       ,   0.       ,   0.       ,   0.       ]]],
     dtype=float32)
Data([[[  9.5078125,  -1.859375 ,   0.       ,   0.       ],
       [-13.261719 ,   3.7890625,   0.       ,   0.       ],
       [  0.       ,   0.       ,   0.       ,   0.       ],
       [  0.       ,   0.       ,   0.       ,   0.       ]]],
     dtype=float32)

I'm running on Linux; my icc version is icc (ICC) 19.0.3.199 20190206, and my MPI version is:

> ompi_info
                 Package: Open MPI conda@8860fd2e8f15 Distribution
                Open MPI: 4.0.1

My Devito version is current master:

58563348 - (devito/master) Merge pull request #974 from opesci/fix-bench-hierachical-blocking

Let me know if any more information would be helpful.

FabioLuporini commented 5 years ago

It turns out this is not a compilation bug, but rather Devito failing to realise that time_order=0 effectively sequentialises the space loops. Indeed, this would fail even with OpenMP and without MPI.

With time_order=0 the time buffer has size one, so u.forward aliases u and the update boils down to executing something like

u[0, x] = u[0, x-1] + u[0, x+1]

which clearly isn't parallel along x
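
To see the dependence concretely, here is a minimal NumPy sketch (independent of Devito, illustration only) contrasting the in-place update above with an out-of-place one:

import numpy as np

# In-place sweep: writes are visible to later reads in the same sweep,
# so the result depends on iteration order (a loop-carried dependence).
a = np.array([0., 1., 2., 3., 4.])
for x in range(1, 4):
    a[x] = a[x - 1] + a[x + 1]

# Out-of-place sweep: every read sees the old values, so the
# iterations are independent and safe to run in parallel.
b = np.array([0., 1., 2., 3., 4.])
new = b.copy()
for x in range(1, 4):
    new[x] = b[x - 1] + b[x + 1]

print(a[1:4])    # [2. 5. 9.] -- order-dependent
print(new[1:4])  # [2. 4. 6.] -- order-independent

Any schedule that parallelises or interleaves the first kind of loop (OpenMP threads, or MPI 'full' mode overlapping halo exchange with computation) can therefore produce different numbers, which is exactly the symptom reported above.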

This should be intercepted and a proper error/warning raised.
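
Until such a check exists, one workaround (assuming an out-of-place update in time is what was intended) is to give the TimeFunction two time slots via time_order=1, so that u.forward writes to a different buffer than the one u.dx reads. A minimal sketch adapted from the script above:

from devito import Grid, TimeFunction, Eq, Operator

grid = Grid(shape=(4, 4))

# Two time slots: u.forward (slot t+1) no longer aliases u (slot t),
# so the space loops carry no dependence and may run in parallel.
u = TimeFunction(name="u", grid=grid, space_order=2, time_order=1)
u.data[0, 1:-1, 1:-1] = 1

op = Operator(Eq(u.forward, u.dx + 1))
op.apply(time_M=0)

print(u.data[1])  # the updated time level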

FabioLuporini commented 5 years ago

#602 isn't really the same issue, but it's still about the time buffer size.