pulp-platform / snitch_cluster

An energy-efficient RISC-V floating-point compute cluster.
https://pulp-platform.github.io/snitch_cluster/
Apache License 2.0

Garbled performance metrics at end of annotated traces #85

Closed: colluca closed this issue 3 months ago

colluca commented 10 months ago

The trace produced by gen_trace.py ends with a performance metrics section, e.g.:

## Performance metrics

Performance metrics for section 0 @ (517, 8635):
tstart                                   3760.0000
snitch_loads                                     7
snitch_stores                                   25
tend                                    11879.0000
fpss_loads                                       0
snitch_avg_load_latency                    79.4286
snitch_occupancy                            0.0299
snitch_fseq_rel_offloads                    0.1164
fseq_yield                                     1.0
fseq_fpu_yield                                 1.0
fpss_section_latency                             0
fpss_avg_fpu_latency                           2.0
fpss_avg_load_latency                            0
fpss_occupancy                              0.0039
fpss_fpu_occupancy                          0.0039
fpss_fpu_rel_occupancy                         1.0
cycles                                        8119
total_ipc                                   0.0339

When passing the trace through the annotate.py script, the performance metrics section is garbled:

                                      ##                                 Performance    tion 0 @ (517, 8635):
            tstart                  3760.0000
      snitch_loads                          7
      snitch_stores                         25
              tend                 11879.0000
        fpss_loads                          0
      snitch_avg_load_latency                    79.4286
      snitch_occupancy                     0.0299
      snitch_fseq_rel_offloads                     0.1164
        fseq_yield                        1.0
      fseq_fpu_yield                        1.0
      fpss_section_latency                          0
      fpss_avg_fpu_latency                        2.0
      fpss_avg_load_latency                          0
      fpss_occupancy                     0.0039
      fpss_fpu_occupancy                     0.0039
      fpss_fpu_rel_occupancy                        1.0
            cycles                       8119
         total_ipc                     0.0339

The annotate.py script should be extended to skip annotation of the final part of the trace and pass it through unchanged, so that the performance metrics are preserved.
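
A minimal sketch of how such a pass-through could look. It assumes the metrics section always begins with a line reading exactly `## Performance metrics`, as in the example above. The names `METRICS_HEADER`, `annotate_preserving_metrics` and `annotate_line` are hypothetical and not taken from annotate.py; `annotate_line` stands in for whatever per-line processing the script currently applies.

```python
from typing import Callable, Iterable, Iterator

# Assumption: the metrics section starts at this header line, as in the
# example trace shown in this issue.
METRICS_HEADER = "## Performance metrics"


def annotate_preserving_metrics(
    lines: Iterable[str],
    annotate_line: Callable[[str], str],
) -> Iterator[str]:
    """Annotate trace lines up to the metrics header, then copy verbatim.

    `annotate_line` is a hypothetical stand-in for the existing per-line
    annotation logic; it is only applied to lines before the header.
    """
    in_metrics = False
    for line in lines:
        # Once the header is seen, every remaining line (including the
        # header itself) is passed through untouched.
        if not in_metrics and line.strip() == METRICS_HEADER:
            in_metrics = True
        yield line if in_metrics else annotate_line(line)
```

Matching on the stripped header line keeps the check independent of trailing newlines or indentation; if gen_trace.py ever changes the header text, the constant would need to follow.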