tud-zih-energy / lo2s

Linux OTF2 Sampling - A Lightweight Node-Level Performance Monitoring Tool
https://tu-dresden.de/zih/forschung/projekte/lo2s?set_language=en
GNU General Public License v3.0

lbr call-stack support #242

Open tilsche opened 1 year ago

tilsche commented 1 year ago

With perf --call-graph lbr, perf has more reliable call stack support. It should be fine on Intel these days, as it has apparently been available since Haswell. However, it is (still) not supported by AMD processors (tested with Zen2/Rome).

cvonelm commented 1 year ago

From what I've read, we would ideally extend our existing --call-graph option so that --call-graph=lbr gives us LBR and --call-graph=fp (mirroring perf's nomenclature) gives us traditional frame-pointer call stack recording.
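To make the proposed option values concrete, here is a minimal sketch of how the two modes could be mapped from the option string. The names CallGraphMode and parse_call_graph_mode are hypothetical and not taken from the lo2s source; only the values "fp" and "lbr" come from the proposal above.

```cpp
#include <optional>
#include <string_view>

// Hypothetical names for illustration; not part of the lo2s code base.
enum class CallGraphMode
{
    FramePointer, // --call-graph=fp
    Lbr           // --call-graph=lbr
};

// Maps the value given to --call-graph=<value> to a mode.
// Returns std::nullopt for any value other than "fp" or "lbr".
std::optional<CallGraphMode> parse_call_graph_mode(std::string_view value)
{
    if (value == "fp")
    {
        return CallGraphMode::FramePointer;
    }
    if (value == "lbr")
    {
        return CallGraphMode::Lbr;
    }
    return std::nullopt;
}
```

Returning an empty optional for unknown values leaves it to the caller to decide whether to print an error or fall back to a default.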

On supported platforms, LBR would be chosen as the default (support is hopefully easy to detect by attempting a perf_event_open call that requests sampling with LBR), as there is no apparent downside to using LBR.
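The probing idea could look roughly like the sketch below. It assumes that requesting PERF_SAMPLE_BRANCH_STACK together with PERF_SAMPLE_BRANCH_CALL_STACK on a cycles event is a suitable test; the helper name lbr_call_stack_available is hypothetical, and a failure can also mean missing permissions (perf_event_paranoid) rather than missing hardware support.

```cpp
#include <cstring>

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical helper, not part of lo2s. Returns true if the kernel accepts
// a sampling event that requests LBR call-stack branch records for the
// calling thread. A false result does not distinguish between unsupported
// hardware and insufficient permissions.
bool lbr_call_stack_available()
{
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.sample_period = 1000003; // arbitrary period, the event is never enabled
    attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
    // Ask for user-space branches recorded in call-stack mode.
    attr.branch_sample_type = PERF_SAMPLE_BRANCH_USER | PERF_SAMPLE_BRANCH_CALL_STACK;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    attr.disabled = 1;

    // pid = 0, cpu = -1: the calling thread on any CPU; no group, no flags.
    long fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
    {
        return false;
    }
    close(static_cast<int>(fd));
    return true;
}
```

The event is opened disabled and closed immediately, so the probe itself records no samples.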

cvonelm commented 1 year ago

Branch issue-242-lbr-support contains a draft of support for last branch records.

Initial results look promising, with the samples providing perfect backtraces almost all of the time.

However, the distribution of samples looks weird. During a run of lo2s with FIRESTARTER as the payload, almost all samples came from inside lo2s itself.

Given this weirdness and the undocumented perf black magic involved in creating traces with LBRs, we should keep frame pointers as the default sampling option.