Closed andcarminati closed 2 hours ago
QoR results:
Core Compute Cycle Count:
|--------------------------------------------------------------|------------|---------|---------------|
| Core_Compute_Cycle_Count | aie-public | This PR | Total diff |
|--------------------------------------------------------------|------------|---------|---------------|
| Floor_aie2_0 | 315 | 371 | REGR(+17.78%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_FC_0 | 2650 | 2929 | REGR(+10.53%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Pad3D_AIE2_bfloat16 | 9208 | 9348 | REGR(+1.52%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_bf16_0 | 23275 | 23472 | REGR(+0.85%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOpsBroadcasting_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2 | 960 | 967 | REGR(+0.73%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOps_K_EQ_GE_GT_LE_LT_CMP_EQ_int8_aie2 | 966 | 973 | REGR(+0.72%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2 | 978 | 985 | REGR(+0.72%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_int8_aie2_ptr_interface | 978 | 985 | REGR(+0.72%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOpsAttributeBroadcasting_aie2_int8 | 1185 | 1192 | REGR(+0.59%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOpsAttributeBroadcasting_aie2_bf16 | 1499 | 1507 | REGR(+0.53%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOpsBroadcasting_K_EQ_GE_GT_LE_LT_CMP_GE_bfloat16_aie2 | 1455 | 1462 | REGR(+0.48%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOps_K_EQ_GE_GT_LE_LT_CMP_EQ_bfloat16_aie2 | 1461 | 1468 | REGR(+0.48%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOps_K_EQ_GE_GT_LE_LT_CMP_GE_bfloat16_aie2 | 1474 | 1481 | REGR(+0.47%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_bf16_1 | 38198 | 38367 | REGR(+0.44%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Clip_aie2_int8 | 246 | 247 | REGR(+0.41%) |
|--------------------------------------------------------------|------------|---------|---------------|
| CompareOps_K_EQ_GE_GT_LE_LT_CMP_GT_int32_aie2 | 1098 | 1101 | REGR(+0.27%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceProdAxis_5_aie2_bf16 | 8707 | 8730 | REGR(+0.26%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceProdAxis_6_aie2_bf16 | 8688 | 8709 | REGR(+0.24%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceProdAxis_4_aie2_bf16 | 35414 | 35498 | REGR(+0.24%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceProdAxis_1_aie2_bf16 | 35383 | 35466 | REGR(+0.23%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceProdAxis_2_aie2_bf16 | 17783 | 17821 | REGR(+0.21%) |
|--------------------------------------------------------------|------------|---------|---------------|
| HardSigmoidTemplated_bf16_0 | 557 | 558 | REGR(+0.18%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_1_aie2_bf16 | 11868 | 11884 | REGR(+0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_2_aie2_bf16 | 11884 | 11900 | REGR(+0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AddAttributeBroadcasting_aie2_bf16 | 762 | 763 | REGR(+0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SubAttributeBroadcasting_aie2_bf16_0 | 762 | 763 | REGR(+0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_1_aie2_bf16 | 13024 | 13041 | REGR(+0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_4_aie2_bf16 | 13030 | 13047 | REGR(+0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_2_aie2_bf16 | 13060 | 13077 | REGR(+0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_5_aie2_bf16 | 7204 | 7213 | REGR(+0.12%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_6_aie2_bf16 | 7211 | 7220 | REGR(+0.12%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_3_aie2_bf16 | 7225 | 7234 | REGR(+0.12%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_4_aie2_bf16 | 11906 | 11920 | REGR(+0.12%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_DW_1 | 853 | 854 | REGR(+0.12%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_6_aie2_bf16 | 7030 | 7038 | REGR(+0.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_3_aie2_bf16 | 7044 | 7052 | REGR(+0.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_5_aie2_bf16 | 7047 | 7055 | REGR(+0.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MulAttributeBroadcasting_aie2_bf16_0 | 893 | 894 | REGR(+0.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| LayerNorm_1 | 16195 | 16213 | REGR(+0.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceProdAxis_3_aie2_bf16 | 8713 | 8722 | REGR(+0.10%) |
|--------------------------------------------------------------|------------|---------|---------------|
| LayerNorm_0 | 19133 | 19151 | SAME(+0.09%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_DW_bf16_0 | 1177 | 1178 | SAME(+0.08%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MaxPool2D_1 | 1260 | 1261 | SAME(+0.08%) |
|--------------------------------------------------------------|------------|---------|---------------|
| LayerNormC8Part2_aie2_bf16_0 | 11254 | 11262 | SAME(+0.07%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MaxPool2D_0 | 1468 | 1469 | SAME(+0.07%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BitShift_AIE2_int8 | 2008 | 2009 | SAME(+0.05%) |
|--------------------------------------------------------------|------------|---------|---------------|
| InstanceNormPart2_aie2_bf16_0 | 9528 | 9532 | SAME(+0.04%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_1 | 2452 | 2453 | SAME(+0.04%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_DW_0 | 2941 | 2942 | SAME(+0.03%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_6_aie2_int8 | 2954 | 2955 | SAME(+0.03%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_3_aie2_int8 | 2958 | 2959 | SAME(+0.03%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_5_aie2_int8 | 2975 | 2976 | SAME(+0.03%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Pad3D_AIE2_int8 | 9595 | 9598 | SAME(+0.03%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_DW_bf16_1 | 3894 | 3895 | SAME(+0.03%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DivAttributeBroadcasting_aie2_bf16_0 | 5372 | 5373 | SAME(+0.02%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_7_aie2_bf16 | 6263 | 6264 | SAME(+0.02%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_1_aie2_int8 | 7064 | 7065 | SAME(+0.01%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_4_aie2_int8 | 7091 | 7092 | SAME(+0.01%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_2_aie2_int8 | 7124 | 7125 | SAME(+0.01%) |
|--------------------------------------------------------------|------------|---------|---------------|
| InterpolateLinear1D_AIE2_bfloat16 | 14464 | 14466 | SAME(+0.01%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Abs_bf16_0 | 376 | 376 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Abs_int8_0 | 510 | 510 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Add2D_0 | 217 | 217 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Add2D_1 | 435 | 435 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AddAttributeBroadcasting_aie2_int8 | 807 | 807 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AddBf16_aie2_0 | 673 | 673 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AddBroadcastingBf16_aie2_0 | 728 | 728 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AddBroadcasting_aie2_0 | 776 | 776 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Add_aie2_0 | 726 | 726 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AvgPool2D_0 | 1068 | 1068 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AvgPool2D_1 | 780 | 780 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AvgPool2D_aie2_bfloat16_0 | 3247 | 3247 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AvgPool2D_aie2_bfloat16_1 | 2247 | 2247 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AvgPool2D_aie2_int8_0 | 1068 | 1068 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| AvgPool2D_aie2_int8_1 | 780 | 780 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BilinearInterpolation_0 | 667 | 667 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BilinearInterpolation_1 | 361 | 361 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BitwiseAnd_int8_0 | 467 | 467 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BitwiseNot_aie2_0 | 135 | 135 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BitwiseOr_int8_0 | 467 | 467 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BitwiseXor_aie2_int8 | 710 | 710 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Cast_aie2_bfloat16 | 974 | 974 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Cast_aie2_bfloat16_1 | 974 | 974 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Cast_aie2_int8 | 725 | 725 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Cast_aie2_int8_1 | 725 | 725 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Ceil_AIE2_bfloat16 | 1412 | 1412 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Ceil_AIE2_int8 | 446 | 446 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ChannelsFirstFlatten_bf16_0 | 13604 | 13604 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ChannelsFirstFlatten_int8_0 | 11932 | 11932 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Clip_aie2_bf16 | 227 | 227 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv1D_DW_AIE2_bf16_0 | 3358 | 3358 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv1D_DW_AIE2_bf16_1 | 3902 | 3902 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv1D_DW_AIE2_int8_0 | 1539 | 1539 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv1D_DW_AIE2_int8_1 | 1773 | 1773 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_2x8_0 | 1817 | 1817 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_2x8_1 | 3822 | 3822 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_ReLU_int8_0 | 10139 | 10139 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_ReLU_int8_1 | 927 | 927 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_edge_mode_0 | 30301 | 30301 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_edge_mode_1 | 18719 | 18719 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DegroupG4_aie2_bf16_0 | 603 | 603 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DegroupG4_aie2_bf16_1 | 990 | 990 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DegroupG4_aie2_int8_0 | 364 | 364 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DegroupG4_aie2_int8_1 | 558 | 558 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DegroupG8_aie2_bf16_0 | 747 | 747 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DegroupG8_aie2_bf16_1 | 1149 | 1149 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DegroupG8_aie2_int8_0 | 436 | 436 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DegroupG8_aie2_int8_1 | 637 | 637 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DivAttributeBroadcasting_aie2_int8_0 | 7802 | 7802 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DivBroadcasting_aie2_0 | 2059 | 2059 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DivBroadcasting_aie2_1 | 1450 | 1450 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| EleMax_aie2_bfloat16 | 227 | 227 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| EleMax_aie2_int8 | 164 | 164 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| EleMin_aie2_bfloat16 | 227 | 227 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| EleMin_aie2_int8 | 164 | 164 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ElemDiv_aie2_0 | 2001 | 2001 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ElemDiv_aie2_1 | 1388 | 1388 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Erf_aie2_bf16_0 | 2770 | 2770 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Erf_aie2_int8_0 | 2554 | 2554 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Erf_aie2_int8_0_ptr_interface | 2533 | 2533 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Floor_aie2_1 | 881 | 881 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GELU_0 | 2144 | 2144 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GELU_1 | 2811 | 2811 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GeluTemplated_aie2_bf16 | 1388 | 1388 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GeluTemplated_aie2_int8 | 1214 | 1214 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GroupG4_aie2_bf16_0 | 495 | 495 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GroupG4_aie2_int8_0 | 312 | 312 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GroupG8_aie2_bf16_0 | 1026 | 1026 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GroupG8_aie2_int8_0 | 555 | 555 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| HardSigmoidTemplated_int8_0 | 284 | 284 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| HardSigmoid_bf16_0 | 937 | 937 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| HardSigmoid_bf16_1 | 649 | 649 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| HardSigmoid_int8_0 | 417 | 417 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| HardSigmoid_int8_1 | 427 | 427 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| HardswishAsHardsigmoid_aie2_0 | 1368 | 1368 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| HardswishAsHardsigmoid_aie2_1 | 1527 | 1527 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Hardswish_aie2_0 | 1368 | 1368 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Hardswish_aie2_1 | 1522 | 1522 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| InstanceNormPart1_aie2_bf16_0 | 2916 | 2916 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| InstanceNormPart1_aie2_int8_0 | 11387 | 11387 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| InterpolateLinear1D_AIE2_int8 | 11967 | 11967 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| LayerNormC8Part1_aie2_bf16_0 | 8962 | 8962 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| LayerNormC8Part1_aie2_int8_0 | 7830 | 7830 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| LayerNormC8Part2_aie2_int8_0 | 11222 | 11222 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Log_bf16_0 | 4149 | 4149 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Log_int8_0 | 1329 | 1329 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| LogicalNot_aie2_0 | 225 | 225 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| LogicalXor_aie2_int8 | 528 | 528 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MaxPool2D_bf16_0 | 1797 | 1797 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MaxPool2D_bf16_1 | 1269 | 1269 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Mod_aie2_bf16 | 5246 | 5246 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Mul2d_bf16_0 | 519 | 519 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Mul2d_bf16_1 | 327 | 327 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MulAttributeBroadcasting_aie2_int8_0 | 517 | 517 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MulBf16_aie2_0 | 697 | 697 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MulBroadcastingBf16_aie2_0 | 752 | 752 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| MulBroadcasting_aie2_0 | 294 | 294 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Mul_aie2_0 | 231 | 231 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Neg_aie2_0 | 779 | 779 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Neg_aie2_1 | 455 | 455 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Pad2D_0 | 568 | 568 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Pad2D_1 | 1684 | 1684 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Pad2D_bf16_0 | 2394 | 2394 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| PixelShuffle_aie2_bf16 | 8566 | 8566 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| PixelShuffle_aie2_int8 | 7280 | 7280 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| PixelUnshuffle_bf16_0 | 17143 | 17143 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| PixelUnshuffle_int8_0 | 14571 | 14571 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Range_int8_aie2_0 | 1224 | 1224 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Range_int8_aie2_1 | 1846 | 1846 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Reciprocal_aie2_0 | 1231 | 1231 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Reciprocal_aie2_1 | 2155 | 2155 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMeanAxis_7_aie2_int8 | 2255 | 2255 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMin1D_aie2_bf16 | 188 | 188 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMin1D_aie2_int8 | 164 | 164 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_1_aie2_int8 | 6903 | 6903 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_2_aie2_int8 | 6943 | 6943 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_3_aie2_int8 | 2921 | 2921 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_4_aie2_int8 | 6966 | 6966 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_5_aie2_int8 | 2924 | 2924 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_6_aie2_int8 | 2897 | 2897 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_7_aie2_bf16 | 6212 | 6212 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Requantize_0 | 1421 | 1421 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Requantize_1 | 781 | 781 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Rescale_aie2_int8_0 | 233 | 233 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Round_aie2_0 | 367 | 367 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Round_aie2_1 | 1092 | 1092 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Rsqrt_aie2_bf16_0 | 3602 | 3602 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Rsqrt_aie2_int8_0 | 2376 | 2376 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Select_aie2_bf16 | 299 | 299 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Select_aie2_int8 | 206 | 206 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Shrink_aie2_0 | 658 | 658 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Shrink_aie2_1 | 759 | 759 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SiLU_aie2_bf16 | 2908 | 2908 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SiLU_aie2_int8 | 2969 | 2969 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SiLU_aie2_int8_1 | 2967 | 2967 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SigmoidTemplated_bf16_0 | 1633 | 1633 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SigmoidTemplated_int8_0 | 1276 | 1276 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SigmoidTemplated_int8_1 | 1276 | 1276 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sigmoid_bf16_0 | 2627 | 2627 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sigmoid_bf16_1 | 1727 | 1727 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sigmoid_int8_0 | 91 | 91 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sigmoid_int8_1 | 110 | 110 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sign_bf16_0 | 1078 | 1078 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sign_bf16_1 | 210 | 210 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sign_int8_0 | 416 | 416 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sign_int8_1 | 122 | 122 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sin_aie2_bf16 | 3014 | 3014 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sin_aie2_int8 | 842 | 842 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sqrt_bf16_0 | 29777 | 29777 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sqrt_bf16_1 | 3793 | 3793 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sqrt_int8_0 | 19162 | 19162 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sqrt_int8_1 | 19162 | 19162 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Squeeze_bfloat16_0 | 207 | 207 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Squeeze_int8_0 | 207 | 207 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SubAttributeBroadcasting_aie2_int8_0 | 807 | 807 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SubBroadcasting_aie2_bf16_0 | 706 | 706 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SubBroadcasting_aie2_int8_0 | 754 | 754 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| SubBroadcasting_aie2_int8_0_ptr_interface | 754 | 754 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sub_aie2_bf16_0 | 651 | 651 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sub_aie2_int8_0 | 704 | 704 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Sub_aie2_int8_0_ptr_interface | 704 | 704 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| TanhTemplated_aie2_bfloat16 | 1049 | 1049 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| TanhTemplated_aie2_int8 | 300 | 300 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Tanh_0 | 1970 | 1970 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Tanh_1 | 2578 | 2578 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Tanh_int8_0 | 339 | 339 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Tanh_int8_1 | 407 | 407 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ThresholdedRelu_aie2_bfloat16 | 514 | 514 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ThresholdedRelu_aie2_int8 | 865 | 865 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Topk1D_bf16_0 | 1217 | 1217 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Topk1D_bf16_1 | 169 | 169 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Topk1D_int8_0 | 766 | 766 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Topk1D_int8_1 | 118 | 118 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Topk2D_bf16_0 | 34469 | 34469 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Topk2D_bf16_1 | 303 | 303 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Topk2D_int8_0 | 28803 | 28803 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Topk2D_int8_1 | 248 | 248 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_021 | 1856 | 1856 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_021_pad | 2338 | 2338 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_102 | 1155 | 1155 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_102_pad | 1140 | 1140 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_120 | 1856 | 1856 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_120_pad | 1752 | 1752 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_201 | 1871 | 1871 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_201_pad | 1767 | 1767 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_210 | 1868 | 1868 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_bf16_210_pad | 1868 | 1868 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_021 | 2685 | 2685 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_021_pad | 3612 | 3612 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_102 | 1149 | 1149 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_102_pad | 1089 | 1089 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_120 | 2686 | 2686 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_120_pad | 2686 | 2686 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_201 | 2700 | 2700 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_201_pad | 2544 | 2544 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_210 | 2694 | 2694 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Transpose_aie2_int8_210_pad | 2538 | 2538 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| bfloat16 | 1217 | 1217 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| int8 | 847 | 847 | SAME(+0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_Transpose_AIE2_0 | 53845 | 53844 | SAME(-0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Pow_bf16_1 | 34196 | 34195 | SAME(-0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Pow_bf16_0 | 34190 | 34189 | SAME(-0.00%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_Transpose_AIE2_1 | 14441 | 14440 | SAME(-0.01%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GEMM_int8_1 | 32931 | 32928 | SAME(-0.01%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_Transpose_bf16_AIE2_1 | 6292 | 6291 | SAME(-0.02%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Range_bfloat16_aie2_0 | 4065 | 4064 | SAME(-0.02%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_7x7s2_Layer1_0 | 5885 | 5883 | SAME(-0.03%) |
|--------------------------------------------------------------|------------|---------|---------------|
| InstanceNormPart2_aie2_int8_0 | 11508 | 11504 | SAME(-0.03%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_mixed_batch_0 | 11094 | 11090 | SAME(-0.04%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Range_bfloat16_aie2_1 | 2669 | 2668 | SAME(-0.04%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSumAxis_7_aie2_int8 | 2215 | 2214 | SAME(-0.05%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_11x11s4_Layer1_0 | 4274 | 4272 | SAME(-0.05%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_11x11s4_1 | 5418 | 5415 | SAME(-0.06%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceProdAxis_7_aie2_bf16 | 1795 | 1794 | SAME(-0.06%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_11x11s4_Layer1_1 | 2979 | 2977 | SAME(-0.07%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_ReLU_Standalone_1 | 2533 | 2531 | SAME(-0.08%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_FC_1 | 1144 | 1143 | SAME(-0.09%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSum_bf16_0 | 12199 | 12187 | SAME(-0.10%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSum_bf16_1 | 12199 | 12187 | SAME(-0.10%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_11x11s4_0 | 5785 | 5779 | IMPR(-0.10%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSum_int8_1 | 11387 | 11375 | IMPR(-0.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GEMM_int8_0 | 2797 | 2794 | IMPR(-0.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceSum_int8_0 | 19670 | 19646 | IMPR(-0.12%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_7x7s2_Layer1_1 | 1613 | 1611 | IMPR(-0.12%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMax_bf16_1 | 9421 | 9409 | IMPR(-0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMin_bf16_1 | 18445 | 18421 | IMPR(-0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_ReLU_1 | 27510 | 27473 | IMPR(-0.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_LReLU_1 | 5263 | 5255 | IMPR(-0.15%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_ReLU_0 | 1275 | 1273 | IMPR(-0.16%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_ReLU_Standalone_0 | 1275 | 1273 | IMPR(-0.16%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMax_bf16_0 | 7193 | 7181 | IMPR(-0.17%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMin_bf16_0 | 7193 | 7181 | IMPR(-0.17%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Mul2D_0 | 533 | 532 | IMPR(-0.19%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Mul2D_1 | 533 | 532 | IMPR(-0.19%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Mish_aie2_int8 | 9516 | 9494 | IMPR(-0.23%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_SV60 | 857 | 855 | IMPR(-0.23%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMax_int8_1 | 19315 | 19267 | IMPR(-0.25%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMin_int8_1 | 19069 | 19021 | IMPR(-0.25%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Slice_int8_0 | 1545 | 1541 | IMPR(-0.26%) |
|--------------------------------------------------------------|------------|---------|---------------|
| PowAttributeBroadcasting_aie2_bf16_0 | 40590 | 40462 | IMPR(-0.32%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMax_int8_0 | 14509 | 14461 | IMPR(-0.33%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_mixed_batch_1 | 21518 | 21444 | IMPR(-0.34%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_LReLU_0 | 2175 | 2167 | IMPR(-0.37%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Slice_bfloat16_0 | 945 | 941 | IMPR(-0.42%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Conv2D_0 | 7694 | 7657 | IMPR(-0.48%) |
|--------------------------------------------------------------|------------|---------|---------------|
| ReduceMin_int8_0 | 8797 | 8749 | IMPR(-0.55%) |
|--------------------------------------------------------------|------------|---------|---------------|
| FullyConnect_aie2_bf16 | 1090 | 1083 | IMPR(-0.64%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BatchNorm2D_1 | 416 | 413 | IMPR(-0.72%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BatchNorm1d_aie2_bfloat16 | 390 | 387 | IMPR(-0.77%) |
|--------------------------------------------------------------|------------|---------|---------------|
| DilatedConv2D_1 | 5390 | 5347 | IMPR(-0.80%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Add2D_Standalone_1 | 482 | 478 | IMPR(-0.83%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BatchNorm1d_aie2_int8 | 408 | 404 | IMPR(-0.98%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Scale_Add_bf16_0 | 1709 | 1690 | IMPR(-1.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Scale_Add_bf16_1 | 1709 | 1690 | IMPR(-1.11%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Add2D_Standalone_0 | 322 | 318 | IMPR(-1.24%) |
|--------------------------------------------------------------|------------|---------|---------------|
| BatchNorm2D_0 | 308 | 304 | IMPR(-1.30%) |
|--------------------------------------------------------------|------------|---------|---------------|
| FullyConnect_aie2_int8 | 829 | 817 | IMPR(-1.45%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GEMV_0 | 469 | 461 | IMPR(-1.71%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GroupG8_aie2_int8_1 | 907 | 891 | IMPR(-1.76%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GroupG8_aie2_bf16_1 | 1691 | 1659 | IMPR(-1.89%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Elu_aie2_int8_0 | 589 | 577 | IMPR(-2.04%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GEMV_1 | 387 | 379 | IMPR(-2.07%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GEMM_bf16_0 | 3622 | 3545 | IMPR(-2.13%) |
|--------------------------------------------------------------|------------|---------|---------------|
| PowAttributeBroadcasting_aie2_int8_0 | 4309 | 4210 | IMPR(-2.30%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Pow_int8_0 | 4309 | 4210 | IMPR(-2.30%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Expand_aie2_bfloat16 | 1944 | 1881 | IMPR(-3.24%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GEMM_bf16_1 | 7669 | 7408 | IMPR(-3.40%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GroupG4_aie2_int8_1 | 860 | 828 | IMPR(-3.72%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Elu_aie2_bf16_0 | 2709 | 2603 | IMPR(-3.91%) |
|--------------------------------------------------------------|------------|---------|---------------|
| GroupG4_aie2_bf16_1 | 1596 | 1532 | IMPR(-4.01%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Mish_aie2_bfloat16 | 5475 | 5224 | IMPR(-4.58%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Softmax_bf16_1 | 1583 | 1510 | IMPR(-4.61%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Exp_bf16_1 | 1227 | 1156 | IMPR(-5.79%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Scale_Add_0 | 374 | 351 | IMPR(-6.15%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Scale_Add_1 | 374 | 351 | IMPR(-6.15%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Add2D_bf16_1 | 298 | 274 | IMPR(-8.05%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Tile_aie2_bf16_0 | 4248 | 3897 | IMPR(-8.26%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Softmax_bf16_0 | 6350 | 5784 | IMPR(-8.91%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Exp_bf16_0 | 6047 | 5480 | IMPR(-9.38%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Add2D_bf16_0 | 254 | 230 | IMPR(-9.45%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Softmax_1 | 425 | 384 | IMPR(-9.65%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Expand_aie2_int8 | 1891 | 1570 | IMPR(-16.98%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Tile_aie2_int8_1 | 2579 | 1906 | IMPR(-26.10%) |
|--------------------------------------------------------------|------------|---------|---------------|
| Average diff | | -0.39% | -0.39% |
|--------------------------------------------------------------|------------|---------|---------------|
| Diff stdev | | 2.48 | 2.48 |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #1 | | -0.79% | -0.79% |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #2 | | -0.07% | -0.07% |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #3 | | +0.00% | +0.00% |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #4 | | +0.00% | +0.00% |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #5 | | +0.00% | +0.00% |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #6 | | +0.00% | +0.00% |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #7 | | +0.00% | +0.00% |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #8 | | +0.00% | +0.00% |
|--------------------------------------------------------------|------------|---------|---------------|
| Quantile #9 | | +0.12% | +0.12% |
|--------------------------------------------------------------|------------|---------|---------------|
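For reference, the "Total diff" column can be reproduced from the two cycle counts. The sketch below is not the actual QoR script: the function name `classify` and the symmetric 0.1% threshold are assumptions inferred from the rows above, where -0.0984% is reported as SAME and -0.1037% as IMPR.

```python
def classify(baseline: int, candidate: int, threshold_pct: float = 0.1) -> str:
    """Format a cycle-count change the way the table's last column does.

    threshold_pct is an assumed cut-off, inferred from the table rows;
    the real tool may use a different rule.
    """
    # Relative change of "This PR" against the aie-public baseline, in percent.
    diff_pct = (candidate - baseline) / baseline * 100.0
    if diff_pct >= threshold_pct:
        label = "REGR"  # more cycles than the baseline
    elif diff_pct <= -threshold_pct:
        label = "IMPR"  # fewer cycles than the baseline
    else:
        label = "SAME"  # change is below the assumed noise threshold
    # The "+" flag forces an explicit sign, matching "+0.00%" and "-0.00%".
    return f"{label}({diff_pct:+.2f}%)"


if __name__ == "__main__":
    print(classify(2579, 1906))    # Tile_aie2_int8_1
    print(classify(12199, 12187))  # ReduceSum_bf16_0
    print(classify(5785, 5779))    # Conv2D_11x11s4_0
```

Note that two rows can display the same rounded percentage (-0.10%) and still carry different labels, because the label is decided on the unrounded value.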
For Conv2D_FC_0, ACQ was moved (in _main) to the delay slot of the function call to conv2d_wrapper. As a result, the wait cycles are accounted to conv2d_wrapper rather than to _main. Apart from this effect, we improve the cycle count for this benchmark as well.
For Floor_aie2_0, we increase the final II (pre-swp), but we should disable unrolling and let post-swp do the job.
GEMM_bf16_0/1 is performing in 17 cycles (pre-swp for now).
PM Size effect: -0.09% (basically unaffected).
Regarding the Conv2D_FC_0 regression, @mludevid and @katerynamuts are looking into refining our modelling of semaphores, which might solve it. I think we do not keep enough distance between semaphores and the end of regions.
Hi @gbossu, all your comments were addressed. Thank you very much!