Nek5000 / nekRS

our next generation fast and scalable CFD code
https://nek5000.mcs.anl.gov/

MPI_FINALIZE ERROR #402

Closed AdwardAllan closed 2 years ago

AdwardAllan commented 2 years ago

Dear nekRS developers: I am currently running the example/channel case. `nrspre channel 1` completes correctly, and `nrsmpi channel 1` runs and produces the output in the attached test.txt, but at the end it fails with the "MPI_Finalize failed" error listed below. Environment: MPICH2 and CUDA 11.0; compilers: gcc/gfortran. Thanks, and best wishes to you.

test.txt

➜ chrs nrsmpi channel 1 > test.txt
Abort(808022543): Fatal error in internal_Finalize: Other MPI error, error stack:
internal_Finalize(50)...........: MPI_Finalize failed
MPII_Finalize(345)..............:
MPID_Finalize(495)..............:
MPIDI_OFI_mpi_finalize_hook(895):
destroy_vni_context(1137).......: OFI domain close failed (ofi_init.c:1137:destroy_vni_context:Device or resource busy)

Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.


mpirun detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was:

Process name: [[45606,1],0]
Exit code: 15

stgeke commented 2 years ago

Fixed in https://github.com/Nek5000/nekRS/commit/9ae887200454a3409e8a0c324bde9989815716c3