shruticd opened this issue 10 months ago (status: Open)
@shruticd hi,
Please give me more information: 1) Which MPI did you use? 2) How did you run the benchmarks? 3) Could you attach the full output log?
@JuliaRS hi, I used MVAPICH2 2.3.7 with PSM2 and IMB-v2021.7. The command I used: mpirun -n 2 ./IMB-IO C_Read_Shared
NBC - Ireduce_Scatter
Intel(R) MPI Benchmarks 2018, MPI-NBC part
Date : Mon Jan 15 12:10:29 2024
Machine : x86_64
System : Linux
Release : 3.10.0-957.1.3.el7.x86_64
Version : #1 SMP Thu Nov 29 14:49:43 UTC 2018
MPI Version : 3.1
MPI Thread Environment:
Calling sequence was:
./IMB-NBC Ireduce_scatter
Minimum message length in bytes: 0 Maximum message length in bytes: 4194304
MPI_Datatype : MPI_BYTE
MPI_Datatype for reductions : MPI_FLOAT
MPI_Op : MPI_SUM
List of Benchmarks to run:
Ireduce_scatter
bytes repetitions t_ovrl[usec] t_pure[usec] t_CPU[usec] overlap[%] defects
0 1000 0.63 0.30 0.30 0.00 0.00
1: Error Ireduce_scatter_pure,size = 4,sample #0
Process 1: Got invalid buffer: Buffer entry: 0.000000 pos: 0
Process 1: Expected buffer: Buffer entry: 0.300000
4 1000 1.87 1.00 0.85 0.00 0.00
Application error code 1 occurred
application called MPI_Abort(MPI_COMM_WORLD, 16) - process 1
[cli_1]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 16) - process 1
Intel(R) MPI Benchmarks 2018, MPI-IO part
----------------------------------------------------------------
Date : Mon Jan 15 12:14:51 2024
Machine : x86_64
System : Linux
Release : 3.10.0-957.1.3.el7.x86_64
Version : #1 SMP Thu Nov 29 14:49:43 UTC 2018
MPI Version : 3.1
MPI Thread Environment:
Calling sequence was:
./IMB-IO C_Read_Shared
Minimum io portion in bytes: 0 Maximum io portion in bytes: 4194304
List of Benchmarks to run:
C_Read_Shared
MODE: AGGREGATE
bytes repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec defects
0 1000 5.48 5.48 5.48 0.00 0.00
Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
PMPI_Gather(929): MPI_Gather(sbuf=0x7ffe51b7bc20, scount=1, MPI_INT, rbuf=(nil), rcount=1, MPI_INT, root=0, comm=0xc4000003) failed
PMPI_Gather(851): Null buffer pointer
[cli_0]: aborting job:
Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
PMPI_Gather(929): MPI_Gather(sbuf=0x7ffe51b7bc20, scount=1, MPI_INT, rbuf=(nil), rcount=1, MPI_INT, root=0, comm=0xc4000003) failed
PMPI_Gather(851): Null buffer pointer
Intel(R) MPI Benchmarks 2018, MPI-IO part
----------------------------------------------------------------
Date : Mon Jan 15 12:13:58 2024
Machine : x86_64
System : Linux
Release : 3.10.0-957.1.3.el7.x86_64
Version : #1 SMP Thu Nov 29 14:49:43 UTC 2018
MPI Version : 3.1
MPI Thread Environment:
Calling sequence was:
./IMB-IO P_IREAD_Shared
Minimum io portion in bytes: 0 Maximum io portion in bytes: 4194304
List of Benchmarks to run:
P_IRead_Shared
For nonblocking benchmarks:
Function CPU_Exploit obtains an undisturbed performance of 745.98 MFlops
MODE: AGGREGATE
bytes repetitions t_ovrl[usec] t_pure[usec] t_CPU[usec] overlap[%] defects
0 1000 3401.78 0.46 1030.45 0.00 0.00
Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
PMPI_Gather(929): MPI_Gather(sbuf=0x7fff82c51f60, scount=1, MPI_INT, rbuf=(nil), rcount=1, MPI_INT, root=0, comm=0xc4000003) failed
PMPI_Gather(851): Null buffer pointer
[cli_0]: aborting job:
Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
PMPI_Gather(929): MPI_Gather(sbuf=0x7fff82c51f60, scount=1, MPI_INT, rbuf=(nil), rcount=1, MPI_INT, root=0, comm=0xc4000003) failed
PMPI_Gather(851): Null buffer pointer
Intel(R) MPI Benchmarks 2018, MPI-IO part
----------------------------------------------------------------
Date : Mon Jan 15 12:12:38 2024
Machine : x86_64
System : Linux
Release : 3.10.0-957.1.3.el7.x86_64
Version : #1 SMP Thu Nov 29 14:49:43 UTC 2018
MPI Version : 3.1
MPI Thread Environment:
Calling sequence was:
./IMB-IO P_IWrite_shared
Minimum io portion in bytes: 0 Maximum io portion in bytes: 4194304
List of Benchmarks to run:
P_IWrite_Shared
For nonblocking benchmarks:
Function CPU_Exploit obtains an undisturbed performance of 753.20 MFlops
MODE: AGGREGATE
bytes repetitions t_ovrl[usec] t_pure[usec] t_CPU[usec] overlap[%] defects
0 1000 1081.32 1.53 1004.70 0.00 0.00
[mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
Intel(R) MPI Benchmarks 2018, MPI-IO part
----------------------------------------------------------------
Date : Mon Jan 15 12:13:33 2024
Machine : x86_64
System : Linux
Release : 3.10.0-957.1.3.el7.x86_64
Version : #1 SMP Thu Nov 29 14:49:43 UTC 2018
MPI Version : 3.1
MPI Thread Environment:
Calling sequence was:
./IMB-IO P_READ_Shared
Minimum io portion in bytes: 0 Maximum io portion in bytes: 4194304
List of Benchmarks to run:
P_Read_Shared
MODE: AGGREGATE
bytes repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec defects
0 1000 0.46 0.46 0.46 0.00 0.00
Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
PMPI_Gather(929): MPI_Gather(sbuf=0x7ffe753f1000, scount=1, MPI_INT, rbuf=(nil), rcount=1, MPI_INT, root=0, comm=0xc4000003) failed
PMPI_Gather(851): Null buffer pointer
[cli_0]: aborting job:
Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
PMPI_Gather(929): MPI_Gather(sbuf=0x7ffe753f1000, scount=1, MPI_INT, rbuf=(nil), rcount=1, MPI_INT, root=0, comm=0xc4000003) failed
PMPI_Gather(851): Null buffer pointer
Intel(R) MPI Benchmarks 2018, MPI-IO part
----------------------------------------------------------------
Date : Mon Jan 15 12:12:11 2024
Machine : x86_64
System : Linux
Release : 3.10.0-957.1.3.el7.x86_64
Version : #1 SMP Thu Nov 29 14:49:43 UTC 2018
MPI Version : 3.1
MPI Thread Environment:
Calling sequence was:
./IMB-IO P_Write_shared
Minimum io portion in bytes: 0 Maximum io portion in bytes: 4194304
List of Benchmarks to run:
P_Write_Shared
MODE: AGGREGATE
bytes repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec defects
0 1000 1.14 1.14 1.14 0.00 0.00
[mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
Intel(R) MPI Benchmarks 2018, MPI-IO
----------------------------------------------------------------
Date : Mon Jan 15 12:30:04 2024
Machine : x86_64
System : Linux
Release : 3.10.0-957.1.3.el7.x86_64
Version : #1 SMP Thu Nov 29 14:49:43 UTC 2018
MPI Version : 3.1
MPI Thread Environment:
Calling sequence was:
./IMB-IO C_IRead_Shared
Minimum io portion in bytes: 0 Maximum io portion in bytes: 4194304
List of Benchmarks to run:
C_IRead_Shared
For nonblocking benchmarks:
Function CPU_Exploit obtains an undisturbed performance of 740.11 MFlops
MODE: AGGREGATE
bytes repetitions t_ovrl[usec] t_pure[usec] t_CPU[usec] overlap[%] defects
0 1000 1016.93 5.48 987.29 0.00 0.00
Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
PMPI_Gather(929): MPI_Gather(sbuf=0x7ffd7ac58ae0, scount=1, MPI_INT, rbuf=(nil), rcount=1, MPI_INT, root=0, comm=0xc4000003) failed
PMPI_Gather(851): Null buffer pointer
[cli_0]: aborting job:
Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
PMPI_Gather(929): MPI_Gather(sbuf=0x7ffd7ac58ae0, scount=1, MPI_INT, rbuf=(nil), rcount=1, MPI_INT, root=0, comm=0xc4000003) failed
PMPI_Gather(851): Null buffer pointer
@shruticd did you try running with the environment variable FI_PROVIDER=tcp? It might be a provider problem. I checked the same benchmarks with Intel MPI and they work. Also, you can try IMB 2021.8.
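A rough sketch of how that suggestion might be tried, reusing the command quoted earlier in this thread. FI_PROVIDER is a libfabric environment variable, so this assumes the MPI library in use is built on libfabric; an MVAPICH2 build that talks to PSM2 directly may simply ignore it. The `rerun_with_tcp` helper and the `DRY_RUN` switch are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Command taken from this thread; adjust the path to IMB-IO as needed.
IMB_CMD="mpirun -n 2 ./IMB-IO C_Read_Shared"

# Hypothetical helper: rerun the benchmark with the libfabric provider forced to tcp.
# DRY_RUN defaults to 1, which only prints the command instead of executing it.
rerun_with_tcp() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "FI_PROVIDER=tcp $IMB_CMD"
    else
        # The variable is set only for this one command, not for the whole shell.
        FI_PROVIDER=tcp $IMB_CMD
    fi
}

rerun_with_tcp
```

Setting `DRY_RUN=0` before the call would actually launch the job. If the failure persists with the tcp provider, the provider is less likely to be the cause.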
In IMB-NBC, I get an integrity failure in Ireduce_scatter with standard Intel Omni-Path. In IMB-IO, P_Write_shared, P_IWrite_Shared, P_READ_Shared, P_IRead_Shared, C_Read_Shared and C_IRead_Shared each fail with either a segmentation fault or an integrity failure. Can you please tell me why this is happening?
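To collect a full log per benchmark and see at a glance which failure each one hits, something along these lines could be used. It is only a sketch: `classify_log`, `IO_BENCHMARKS` and the `RUN_IMB` switch are hypothetical names, the search strings are the messages that appear in the logs in this thread, and the benchmark names use the spelling IMB prints in its "List of Benchmarks to run".

```shell
#!/bin/sh
# The six IMB-IO benchmarks named in the report.
IO_BENCHMARKS="P_Write_Shared P_IWrite_Shared P_Read_Shared P_IRead_Shared C_Read_Shared C_IRead_Shared"

# Hypothetical helper: label a log file by the failure signature it contains.
classify_log() {
    if grep -q "Segmentation fault" "$1"; then
        echo "segfault"
    elif grep -q "Null buffer pointer" "$1"; then
        echo "null-buffer"
    elif grep -q "Got invalid buffer" "$1"; then
        echo "integrity"
    else
        echo "no-known-error"
    fi
}

# Launch the benchmarks only when explicitly requested with RUN_IMB=1.
if [ "${RUN_IMB:-0}" = "1" ]; then
    for b in $IO_BENCHMARKS; do
        # Capture stdout and stderr together so the log is complete.
        mpirun -n 2 ./IMB-IO "$b" > "$b.log" 2>&1
        echo "$b: $(classify_log "$b.log")"
    done
fi
```

The resulting `.log` files can then be attached to the issue as they are.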