Open boegel opened 10 years ago
We noticed that PerfExpert was reporting 0% floating-point instructions for a test program that makes heavy use of AVX FP instructions.
After looking into this with @leonardofialho, it turns out that the ratio.floating_point metric defined in lcpi.conf is missing the SIMD_FP_256 events.
The following patch (post-installation) seems to fix the issue:
--- PerfExpert/4.1.1/etc/lcpi.conf.orig 2014-05-07 15:42:20.010888000 +0200
+++ PerfExpert/4.1.1/etc/lcpi.conf 2014-05-07 15:45:25.940577000 +0200
@@ -1,6 +1,6 @@
 # LCPI config generated using sniffer
 # version = 1.0
-ratio.floating_point = FP_COMP_OPS_EXE:SSE_PACKED_SINGLE + FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE + FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE / PAPI_TOT_INS
+ratio.floating_point = SIMD_FP_256:PACKED_SINGLE + SIMD_FP_256:PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_PACKED_SINGLE + FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE + FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE / PAPI_TOT_INS
 ratio.data_accesses = PAPI_LD_INS / PAPI_TOT_INS
 GFLOPS_(%_max).overall = ((SIMD_FP_256:PACKED_SINGLE*8 + (SIMD_FP_256:PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_PACKED_SINGLE)*4 + FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE*2 + FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE + FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE) / PAPI_TOT_CYC) / 8
 GFLOPS_(%_max).packed = ((SIMD_FP_256:PACKED_SINGLE*8 + (SIMD_FP_256:PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_PACKED_SINGLE)*4 + FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE*2) / PAPI_TOT_CYC) / 8
@@ -20,6 +20,6 @@
 branch_instructions.overall = (PAPI_BR_INS * BR_lat + PAPI_BR_MSP * BR_miss_lat) / PAPI_TOT_INS
 branch_instructions.correctly_predicted = PAPI_BR_INS * BR_lat / PAPI_TOT_INS
 branch_instructions.mispredicted = PAPI_BR_MSP * BR_miss_lat / PAPI_TOT_INS
-floating-point_instr.overall = (((FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE + FP_COMP_OPS_EXE:SSE_PACKED_SINGLE + FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE) * FP_lat) + (PAPI_FDV_INS * FP_slow_lat)) / PAPI_TOT_INS
+floating-point_instr.overall = (((SIMD_FP_256:PACKED_SINGLE + SIMD_FP_256:PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE + FP_COMP_OPS_EXE:SSE_PACKED_SINGLE + FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE) * FP_lat) + (PAPI_FDV_INS * FP_slow_lat)) / PAPI_TOT_INS
 floating-point_instr.slow_FP_instr = (PAPI_FDV_INS * FP_slow_lat) / PAPI_TOT_INS
 floating-point_instr.fast_FP_instr = ((FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE + FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE + FP_COMP_OPS_EXE:SSE_PACKED_SINGLE + FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE) * FP_lat) / PAPI_TOT_INS
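To illustrate why the original definition can report 0% for AVX-heavy code, here is a minimal Python sketch of the ratio.floating_point computation before and after the patch. The counter values are made up for illustration (not measured data), the helper `fp_ratio` is hypothetical, and the sketch assumes the metric is meant as the sum of the FP event counts divided by PAPI_TOT_INS; how PerfExpert actually parses the expression is not shown here.

```python
# Hypothetical counter values for a run dominated by 256-bit AVX operations.
# These numbers are illustrative only, not real measurements.
counters = {
    "SIMD_FP_256:PACKED_SINGLE": 0,
    "SIMD_FP_256:PACKED_DOUBLE": 400,
    "FP_COMP_OPS_EXE:SSE_PACKED_SINGLE": 0,
    "FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE": 0,
    "FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE": 0,
    "FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE": 0,
    "PAPI_TOT_INS": 1000,
}

# Events used by the original ratio.floating_point definition.
SSE_EVENTS = [
    "FP_COMP_OPS_EXE:SSE_PACKED_SINGLE",
    "FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE",
    "FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE",
    "FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE",
]

# Events added by the patch.
AVX_EVENTS = [
    "SIMD_FP_256:PACKED_SINGLE",
    "SIMD_FP_256:PACKED_DOUBLE",
]


def fp_ratio(counters, events):
    """Sum the given FP event counts and divide by total instructions.

    Assumption: this is the intended meaning of ratio.floating_point.
    """
    return sum(counters[e] for e in events) / counters["PAPI_TOT_INS"]


before = fp_ratio(counters, SSE_EVENTS)               # original definition
after = fp_ratio(counters, AVX_EVENTS + SSE_EVENTS)   # patched definition

print(f"before patch: {before:.1%}")
print(f"after patch:  {after:.1%}")
```

With these made-up counts, the SSE-only definition sees no FP events at all, while the patched definition picks up the SIMD_FP_256 counts.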