RRZE-HPC / OSACA

Open Source Architecture Code Analyzer
GNU Affero General Public License v3.0

[BUG] Uppercase register names cause instructions to not match properly #109

Closed stefandesouza closed 3 days ago

stefandesouza commented 2 weeks ago

Describe the bug
Using uppercase register names causes some instructions to not match properly, which ultimately leads to a loss of throughput/latency information in the analysis output.

To Reproduce
OSACA version: 0.5.3
Used where: CLI

Steps to reproduce the behavior:

OSACA output
This is the output with uppercase register names; many instructions have no throughput/latency match.

Open Source Architecture Code Analyzer (OSACA) - 0.5.3
Analyzed file:      lbc.gh.nvfortran_upper.s
Architecture:       V2
Timestamp:          2024-09-17 09:51:27

 P - Throughput of LOAD operation can be hidden behind a past or future STORE instruction
 * - Instruction micro-ops not bound to a port
 X - No throughput/latency information for this instruction in data file

Combined Analysis Report
------------------------
                                                                         Port pressure in cycles                                                                         
     |  0   |  1   |  2   |  3   |  4   |  5   |   6   - 6DV  |  7   - 7DV  |  8   - 8DV  |  9   |  10  - 10DV |  11  |  12  |  13  |  14  |  15  |  16  ||  CP  | LCD  |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   2 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      |   .L41f040
   3 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      | X LDUR W13, [X30, #436]
   4 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||  0.0 |      | X ADD X6, X26, X15
   5 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      | X ADD W15, W15, #1
   6 |      |      | 0.11 | 0.27 | 0.26 | 0.27 | 0.000        | 0.10        |             |      |             |      |      |      |      |      |      ||  1.0 |      |   SBFM X6, X6, #0, #31
   7 |      |      | 0.50 |      |      |      | 0.000        | 0.50        |             |      |             |      |      |      |      |      |      ||      |      |   CMP W15, W19
   8 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||  0.0 |      | X ADD X6, X6, #1
   9 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD X22, X6, X2, XZR
  10 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      | X SUB W13, W9, W13
  11 |      |      | 0.20 | 0.21 | 0.20 | 0.20 | 0.000        | 0.20        |             |      |             |      |      |      |      |      |      ||      |      |   SBFM X13, X13, #0, #31
  12 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD X13, X1, X13, X22
  13 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      | X LDR W22, [X30]
  14 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      | X SUB W22, W25, W22
  15 |      |      | 0.21 | 0.21 | 0.18 | 0.21 | 0.000        | 0.20        |             |      |             |      |      |      |      |      |      ||      |      |   SBFM X22, X22, #0, #31
  16 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD X13, X0, X22, X13
  17 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      | X LDR W22, [X30, #76]
  18 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |  0.0 | X ADD X30, X30, #4
  19 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      | X SUB W22, W23, W22
  20 |      |      | 0.00 | 0.32 | 0.35 | 0.33 | -0.01        | 0.00        |             |      |             |      |      |      |      |      |      ||      |      |   SBFM X22, X22, #0, #31
  21 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD X13, X10, X22, x13
  22 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD X13, X13, X3, XZR
  23 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||  0.0 |      | X LDR D18, [X4, X13]
  24 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      | X STR D18, [X14, X6,LSL #3]
  25 | 0.50 | 0.50 |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      |   B.LT .L41f040

------------------ WARNING: The performance data for 12 instructions is missing.------------------
                     No final analysis is given. If you want to ignore this
                     warning and run the analysis anyway, start osaca with
                                       --ignore-unknown flag.
--------------------------------------------------------------------------------------------------

Loop-Carried Dependencies Analysis Report
-----------------------------------------
  18 |  0.0 | ADD   X30, X30, #4                    | [18]
   5 |  0.0 | ADD   W15, W15, #1                    | [5]

Expected behavior
The analysis should produce the output below regardless of register case. Since only some instructions are affected, the problem probably lies in building the data cache for the hardware model. Alternatively, simply forcing register names to lowercase in the parser might make sense too.

Open Source Architecture Code Analyzer (OSACA) - 0.5.3
Analyzed file:      lbc.gh.nvfortran.s
Architecture:       V2
Timestamp:          2024-09-17 09:56:11

 P - Throughput of LOAD operation can be hidden behind a past or future STORE instruction
 * - Instruction micro-ops not bound to a port
 X - No throughput/latency information for this instruction in data file

Combined Analysis Report
------------------------
                                                                         Port pressure in cycles                                                                         
     |  0   |  1   |  2   |  3   |  4   |  5   |   6   - 6DV  |  7   - 7DV  |  8   - 8DV  |  9   |  10  - 10DV |  11  |  12  |  13  |  14  |  15  |  16  ||  CP  | LCD  |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   2 |      |      |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      |   .L41f040
   3 |      |      |      |      |      |      |              |             |             |      |             |      | 0.33 | 0.33 | 0.33 |      |      ||  4.0 |      |   LDUR w13, [x30, #436]
   4 |      |      | 0.15 | 0.18 | 0.25 | 0.26 | 0.000        | 0.17        |             |      |             |      |      |      |      |      |      ||      |      |   ADD x6, x26, x15
   5 |      |      | 0.35 | 0.32 |      |      | 0.000        | 0.33        |             |      |             |      |      |      |      |      |      ||      |      |   ADD w15, w15, #1
   6 |      |      | 0.06 | 0.24 | 0.31 | 0.31 | 0.000        | 0.09        |             |      |             |      |      |      |      |      |      ||      |      |   SBFM x6, x6, #0, #31
   7 |      |      | 0.51 |      |      |      | 0.000        | 0.49        |             |      |             |      |      |      |      |      |      ||      |      |   CMP w15, w19
   8 |      |      | 0.32 | 0.34 |      |      | 0.000        | 0.34        |             |      |             |      |      |      |      |      |      ||      |      |   ADD x6, x6, #1
   9 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||      |      |   MADD x22, x6, x2, XZR
  10 |      |      | 0.21 | 0.19 | 0.21 | 0.20 | 0.000        | 0.20        |             |      |             |      |      |      |      |      |      ||  1.0 |      |   SUB w13, w9, w13
  11 |      |      | 0.19 | 0.22 | 0.20 | 0.20 | 0.000        | 0.20        |             |      |             |      |      |      |      |      |      ||  1.0 |      |   SBFM x13, x13, #0, #31
  12 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD x13, x1, x13,x22
  13 |      |      |      |      |      |      |              |             |             |      |             |      | 0.33 | 0.33 | 0.33 |      |      ||      |      |   LDR w22, [x30]
  14 |      |      | 0.21 | 0.20 | 0.20 | 0.21 | 0.000        | 0.19        |             |      |             |      |      |      |      |      |      ||      |      |   SUB w22, w25, w22
  15 |      |      | 0.13 | 0.13 | 0.31 | 0.31 | 0.000        | 0.13        |             |      |             |      |      |      |      |      |      ||      |      |   SBFM x22, x22, #0, #31
  16 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD x13, x0, x22, x13
  17 |      |      |      |      |      |      |              |             |             |      |             |      | 0.33 | 0.33 | 0.33 |      |      ||      |      |   LDR w22, [x30, #76]
  18 |      |      | 0.28 | 0.44 |      |      | 0.000        | 0.28        |             |      |             |      |      |      |      |      |      ||      |  1.0 |   ADD x30, x30, #4
  19 |      |      | 0.00 | 0.16 | 0.42 | 0.42 | -0.01        | 0.00        |             |      |             |      |      |      |      |      |      ||      |      |   SUB w22, w23, w22
  20 |      |      | 0.00 | 0.00 | 0.50 | 0.50 | -0.01        | 0.00        |             |      |             |      |      |      |      |      |      ||      |      |   SBFM x22, x22, #0, #31
  21 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD x13, x10, x22, x13
  22 |      |      |      |      |      |      | 1.000        |             |             |      |             |      |      |      |      |      |      ||  2.0 |      |   MADD x13, x13, x3, XZR
  23 |      |      |      |      |      |      |              |             |             |      |             |      | 0.17 | 0.16 | 0.66 |      |      ||  4.0 |      |   LDR d18, [x4, x13]
  24 |      |      |      |      |      |      |              |             |             |      |             |      | 0.50 | 0.50 |      | 0.50 | 0.50 ||  0.0 |      |   STR d18, [x14, x6,LSL #3]
  25 | 0.50 | 0.50 |      |      |      |      |              |             |             |      |             |      |      |      |      |      |      ||      |      |   B.LT .L41f040

       0.50   0.50   2.39   2.40   2.40   2.40   4.980          2.40                                                    1.67   1.66   1.66   0.50   0.50    18.0    1.0  

Loop-Carried Dependencies Analysis Report
-----------------------------------------
  18 |  1.0 | ADD   x30, x30, #4                    | [18]
   5 |  1.0 | ADD   w15, w15, #1                    | [5]
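
The "force to lowercase" idea from the expected-behavior note could be sketched roughly as follows. This is a standalone illustration, not OSACA code: the helper name `normalize_registers` and the regular expression are assumptions made for this example, and the pattern only covers numbered AArch64 general-purpose (`x`/`w`) and scalar FP/SIMD (`b`/`h`/`s`/`d`/`q`) register names. Mnemonics, labels, shift specifiers, and `XZR` are left untouched, mirroring the lowercase listing above.

```python
import re

# Hypothetical pattern (not taken from OSACA): numbered AArch64 registers
# x0-x30 / w0-w30 and scalar FP/SIMD registers b/h/s/d/q 0-31, in any case.
_REGISTER = re.compile(
    r"\b(?:[XW](?:[12]?[0-9]|30)|[BHSDQ](?:[12]?[0-9]|3[01]))\b",
    re.IGNORECASE,
)


def normalize_registers(line: str) -> str:
    """Lowercase numbered register names in one assembly line.

    Everything else (mnemonics, labels, immediates, shift specifiers,
    XZR) is returned unchanged.
    """
    return _REGISTER.sub(lambda m: m.group(0).lower(), line)


if __name__ == "__main__":
    for asm in (
        "LDUR W13, [X30, #436]",
        "STR D18, [X14, X6,LSL #3]",
        "MADD X22, X6, X2, XZR",
        "B.LT .L41f040",
    ):
        print(normalize_registers(asm))
```

Applying such a pass to each line before instruction matching would make the uppercase listing textually identical to the lowercase one. Whether the parser or the data-cache lookup is the better place for the fix is left open here.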

Additional context
This assembly snippet comes from code compiled with the nvfortran compiler on a Grace Hopper machine.