mn416 / QPULib

Language and compiler for the Raspberry Pi GPU

Add handling of Performance Counter registers #66

Closed wimrijnders closed 4 years ago

wimrijnders commented 6 years ago

This enables viewing and initialization of the performance registers in the RegisterMap.

These can be used for hardware profiling when running kernels. Notably, they will be useful when optimizing the generated kernel code.

NOTE: #52 needs to be merged for this to work on @mn416's system.


Getting this to work took a while because the full functionality is not documented. The Errata (#65) supplied the missing piece: a single bit had to be set.

wimrijnders commented 6 years ago

You can test this by running Rot3DLib. It should give output similar to the following:

> make QPU=1 Rot3DLib
> sudo obj-qpu/bin/Rot3DLib
Running kernel 3
Enabled counters:
  Executing valid instructions      : 276060
  Stalled waiting for TMUs          : 968
  Level 2 cache hits                : 30
  Level 2 cache misses              : 2446
  QPU Instruction cache hits        : 69015
  QPU Instruction cache misses      : 44
  QPU cache hits                    : 108
  QPU cache misses                  : 8
  Idle                              : 67553764

0.001844s
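
When comparing runs during optimization, hit ratios are often easier to read than raw counts. The sketch below is not part of QPULib: it is a small post-processing script that assumes the output format shown above (indented `label : value` lines), and the names `parse_counters` and `hit_ratio` are hypothetical helpers made up for illustration.

```python
import re

# Counter lines copied from the Rot3DLib run shown above.
SAMPLE = """\
Running kernel 3
Enabled counters:
  Executing valid instructions      : 276060
  Stalled waiting for TMUs          : 968
  Level 2 cache hits                : 30
  Level 2 cache misses              : 2446
  QPU Instruction cache hits        : 69015
  QPU Instruction cache misses      : 44
  QPU cache hits                    : 108
  QPU cache misses                  : 8
  Idle                              : 67553764

0.001844s
"""


def parse_counters(text):
    """Return {label: value} for every indented 'label : integer' line."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^\s+(.+?)\s*:\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def hit_ratio(counters, prefix):
    """Hits / (hits + misses) for the counter pair named '<prefix> hits/misses'."""
    hits = counters[prefix + " hits"]
    misses = counters[prefix + " misses"]
    total = hits + misses
    return hits / total if total else 0.0


counters = parse_counters(SAMPLE)
for name in ("Level 2 cache", "QPU Instruction cache", "QPU cache"):
    print(f"{name:<24} hit ratio: {hit_ratio(counters, name):.2%}")
```

For this run, the Level 2 cache shows far more misses than hits (30 against 2446), while the instruction cache almost always hits. What those ratios mean for a given kernel is left to the reader; the script only restates the numbers already printed.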