andikleen / pmu-tools

Intel PMU profiling tools
GNU General Public License v2.0

Make #FP16 a tunable #445

Closed aayasin closed 11 months ago

aayasin commented 1 year ago

FP16 is an ExternalParameter introduced by TMA-4.5 to allow better accuracy and reduced overhead for the default (legacy) HPC user who has not enabled their workload for AMX or BF16. Currently the extra events are collected blindly. This is a feature request to make it a tunable. The default should stay at 1 (as in the current code) for an unknown workload on SPR, but users who seek better accuracy can use the tunable to avoid collecting needless yet expensive events.

andikleen commented 11 months ago

Added an option. It was always tunable at the model level by overriding FP16 with a suitable lambda.
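For readers unfamiliar with the lambda-override idea, here is a minimal sketch. It assumes the model exposes FP16 as a zero-argument callable; the `model` object, `events_to_collect`, and the event-group names are hypothetical stand-ins, not actual pmu-tools code.

```python
import types

# Hypothetical stand-in for a generated model module (not the real SPR model).
model = types.SimpleNamespace()
model.FP16 = lambda: 1  # assumed default: FP16 events are collected


def events_to_collect(m):
    """Return placeholder event-group names depending on the FP16 parameter."""
    groups = ["legacy_fp"]
    if m.FP16():
        groups.append("fp16")  # extra events, only when FP16 is enabled
    return groups


default_groups = events_to_collect(model)

# Override the parameter with a lambda to skip the extra events.
model.FP16 = lambda: 0
overridden_groups = events_to_collect(model)
```

With the override in place, only the legacy group remains in this toy example.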

aayasin commented 11 months ago

Andi, I see you added an argument, not a flag, to toplev in: https://github.com/andikleen/pmu-tools/commit/af906e6bcbef8140eb05f6803b6f8f55d2028428

Note, however, that the SPR model's code returns 1, while the arg is only a way to set it. Hence the user cannot clear it. This needs to be addressed.
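One possible way to make the parameter clearable is sketched below. This is an assumption, not the code in the linked commit: the option name `--fp16` and the helper `resolve_fp16` are hypothetical. The idea is to take an explicit 0/1 value and fall back to the model default only when the user gave nothing.

```python
import argparse


def build_parser():
    p = argparse.ArgumentParser()
    # Hypothetical option name; None means "not given on the command line".
    p.add_argument("--fp16", type=int, choices=(0, 1), default=None)
    return p


def resolve_fp16(cli_value, model_default=1):
    """Prefer the user's explicit value; otherwise keep the model default."""
    return model_default if cli_value is None else cli_value


unset = resolve_fp16(build_parser().parse_args([]).fp16)
cleared = resolve_fp16(build_parser().parse_args(["--fp16", "0"]).fp16)
```

A plain set-only argument cannot express the `cleared` case when the model default is already 1, which is the problem described above.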

aayasin commented 11 months ago

Hmm... If we are adding an arg, I suspect making it an integer would be better.

For example, we can build the vector/matrix support in an incremental manner:
0: no vector
1: Legacy vector (DP, SP)
2: +HP Vector (FP16)
3: +AMX
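The incremental levels could be modeled roughly as follows. This is only a sketch of the proposal: the enum and function names are hypothetical and do not exist in pmu-tools.

```python
from enum import IntEnum


class VectorSupport(IntEnum):
    """Proposed incremental levels; each level includes all lower ones."""
    NONE = 0    # no vector
    LEGACY = 1  # legacy vector (DP, SP)
    FP16 = 2    # + HP vector (FP16)
    AMX = 3     # + AMX


def enabled_groups(level):
    """List every capability included at the given level."""
    return [g.name for g in VectorSupport if 0 < g <= level]


fp16_groups = enabled_groups(VectorSupport.FP16)
```

Because the levels are ordered integers, a single comparison such as `level >= VectorSupport.FP16` is enough to decide whether the FP16 events are needed.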