zer011b / fdtd3d

fdtd3d is an open source 1D, 2D, 3D FDTD electromagnetics solver with MPI, OpenMP and CUDA support for x64, ARM, ARM64, RISC-V, PowerPC architectures
GNU General Public License v2.0
119 stars 33 forks source link

Intel MPI execution fail #95

Closed zer011b closed 4 years ago

zer011b commented 6 years ago

build:

cmake .. -DCMAKE_BUILD_TYPE=Release -DVALUE_TYPE=d -DCOMPLEX_FIELD_VALUES=OFF -DTIME_STEPS=2 -DPARALLEL_GRID_DIMENSION=3 -DPRINT_MESSAGE=ON -DPARALLEL_GRID=ON -DPARALLEL_BUFFER_DIMENSION=xyz -DCXX11_ENABLED=ON -DCUDA_ENABLED=OFF -DCUDA_ARCH_SM_TYPE=sm_50

run:

mpiexec -n 8 ./fdtd3d --3d --parallel-grid --time-steps 100 --sizex 100 --sizey 100 --sizez 110

output:

Fatal error in MPI_Recv: Other MPI error, error stack:
MPI_Recv(224)...................: MPI_Recv(buf=0x1be3280, count=8250, MPI_DOUBLE, src=1, tag=1, comm=0x84000004, status=0x7ffe7aab74b0) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error! cannot read from remote process
Fatal error in MPI_Recv: Other MPI error, error stack:
MPI_Recv(224)...................: MPI_Recv(buf=0x7f0240, count=8250, MPI_DOUBLE, src=3, tag=3, comm=0x84000002, status=0x7ffd79339860) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error! cannot read from remote process
Fatal error in MPI_Recv: Other MPI error, error stack:
MPI_Recv(224)...................: MPI_Recv(buf=0x1d2a2d0, count=8250, MPI_DOUBLE, src=5, tag=5, comm=0x84000002, status=0x7fff4f7eb820) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error! cannot read from remote process
Fatal error in MPI_Recv: Other MPI error, error stack:
MPI_Recv(224)...................: MPI_Recv(buf=0x19ff1f0, count=8250, MPI_DOUBLE, src=7, tag=7, comm=0x84000002, status=0x7ffeb08d5d50) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error! cannot read from remote process

see #94 for details

zer011b commented 4 years ago

Does not reproduce on the latest master