Open florianl opened 1 month ago
This is a very intriguing type of profile I’ve never heard of before.
Do you intend that this profile would work the same way as the linked perf profile type? That is, using a precise “memory access” (or memory load) hardware PMU metric.
Aside: I found the patch message to be the most straightforward and concise summary of how that profile works: https://lwn.net/Articles/954938/
Along those lines, do you know if the existing perf profile works on Go programs? I don’t see fundamental reasons it shouldn’t, but we may be missing some DWARF. So even if we don’t add a profile to runtime/pprof, fixing up problems with perf profiles may be doable.
cc @golang/runtime
> Do you intend that this profile would work the same way as the linked perf profile type? That is, using a precise “memory access” (or memory load) hardware PMU metric.
I think implementing this new profile on top of PMU metrics would benefit accuracy. I lack the knowledge of Go runtime internals to tell whether it could be implemented without PMU metrics.
> Along those lines, do you know if the existing perf profile works on Go programs?
I use perf whenever it is available, and so far I have not run into issues or missed any information when profiling Go executables. The example for struct runtime.mspan in the initial post of this proposal was generated by perf.
To my knowledge, perf is not available on every OS; for example, I'm not aware of perf on Windows. Also, perf is often not deployed to production systems. The Go ecosystem would therefore benefit from the insights of this new profile if it were integrated natively.
That's great to hear that the perf tool seems to work well.
Proposal Details
Static analysis can already help improve the memory layout of Go structs by suggesting field reordering and reducing padding. This can make access to struct fields more efficient, because the fields are better aligned and less space is spent on padding. Combined with dead code analysis, static analysis can also identify unused fields and so help reduce the size of structs.
This proposal aims to bring the ideas from Data-type profiling for perf to Go's pprof ecosystem as a Go-native approach. Today, data-type profiling is already possible with perf on Linux: profile the program, reorder struct fields accordingly, and benefit from the performance improvements.
Introduce a new runtime/pprof Profile that tracks the number of read/write accesses to fields within a Go struct.
The report of this new runtime/pprof Profile should enable users to identify frequently accessed fields within a struct, so that they can reorder struct fields to improve the memory efficiency of their application.
Example report for a Go struct, generated by the approach described in Data-type profiling for perf:
The example above reports the field accesses of the Go-internal struct mspan while running the benchmarks in net/http with go version devel go1.24-eb6f2c24cd Sat Sep 28 01:07:09 2024 +0000 linux/amd64.
Alternative
Instead of introducing a new runtime/pprof Profile, an approach similar to go build -cover could be used. At build time, accesses to fields in Go structs could be instrumented, and a report would be generated when the resulting Go binary is executed. This report could then be consumed by go tool cover to show the number of times each field in a struct was accessed.
Question
I lack the knowledge of Go runtime internals needed to provide a proof of concept for this proposal.
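As a starting point for discussion, the build-time instrumentation idea from the Alternative section could look roughly like the hand-written sketch below. Nothing here is an existing toolchain feature: the `span` type, the `fieldCounters` type and the `report` function are all hypothetical, and the `Add` calls only mark where an instrumenting compiler might insert counter updates.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// span is a hypothetical application struct (not the runtime's mspan).
type span struct {
	next   *span
	start  uintptr
	npages uintptr
}

// fieldCounters holds one access counter per field of span.
// Assumption: an instrumenting build would generate this; here it is manual.
type fieldCounters struct {
	next, start, npages atomic.Uint64
}

var spanCounters fieldCounters

// totalPages walks the list and sums npages. Each Add call stands in for
// a counter update that instrumentation would emit next to a field read.
func totalPages(s *span) uintptr {
	var total uintptr
	for s != nil {
		spanCounters.npages.Add(1)
		total += s.npages
		spanCounters.next.Add(1)
		s = s.next
	}
	return total
}

// report returns the current per-field access counts.
func report() map[string]uint64 {
	return map[string]uint64{
		"next":   spanCounters.next.Load(),
		"start":  spanCounters.start.Load(),
		"npages": spanCounters.npages.Load(),
	}
}

func main() {
	list := &span{npages: 2, next: &span{npages: 3}}
	fmt.Println("total pages:", totalPages(list))
	fmt.Println("field accesses:", report())
}
```

In this sketch `start` is never read, so its counter stays at zero, which is the kind of signal such a report could give when deciding how to order or drop fields. Counting every access this way adds overhead, which is one trade-off against a sampling approach based on PMU metrics.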