bayesiancook / pbmpi

phylobayes mpi
GNU General Public License v2.0

issues running pb_mpi #11

Closed zhoulhca closed 4 years ago

zhoulhca commented 5 years ago

Hi there,

I'm trying to run pb_mpi and I am getting this error message:

OPAL ERROR: Not initialized in file pmix3x_client.c at line 113
An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[bailey-data:11215] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[bailey-data:11216] OPAL ERROR: Not initialized in file pmix3x_client.c at line 113
An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[bailey-data:11216] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[bailey-data:11217] OPAL ERROR: Not initialized in file pmix3x_client.c at line 113
An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[bailey-data:11217] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[bailey-data:11218] OPAL ERROR: Not initialized in file pmix3x_client.c at line 113
An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[bailey-data:11218] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[bailey-data:11219] OPAL ERROR: Not initialized in file pmix3x_client.c at line 113
An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[bailey-data:11219] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[bailey-data:11220] OPAL ERROR: Not initialized in file pmix3x_client.c at line 113
An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[bailey-data:11220] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[bailey-data:11221] OPAL ERROR: Not initialized in file pmix3x_client.c at line 113
An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[bailey-data:11221] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[bailey-data:11222] OPAL ERROR: Not initialized in file pmix3x_client.c at line 113
An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[bailey-data:11222] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

mpirun noticed that the job aborted, but has no info as to the process that caused that situation.

I'm not quite sure what's happening. Thank you for your help!

bayesiancook commented 5 years ago

Hi there, does that work now with the latest commit?

nicolas