Currently, the CPU pprof profiles used for PGO can only attribute samples down to the source line level. PCs are also included and are more precise, but they aren't very useful without the original binary.
This places artificial limitations on the accuracy of optimizations. For example, if there are two calls to the same function on the same line, we cannot distinguish which one is hot and which one is cold.
This is even more problematic for some potential optimizations. For example, basic block level optimizations need basic block weights, but there may be multiple basic blocks on the same line.
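As a small illustration of several basic blocks sharing a line, consider the sketch below. `check` and `isSmallPrime` are hypothetical names. The single `return` line in `check` would typically compile to more than one basic block (the bounds check, its panic path, the comparison, and the short-circuited call), though the exact block layout depends on the compiler.

```go
package main

import "fmt"

// isSmallPrime is a hypothetical, relatively costly predicate.
func isSmallPrime(n int) bool {
	if n < 2 {
		return false
	}
	for d := 2; d*d <= n; d++ {
		if n%d == 0 {
			return false
		}
	}
	return true
}

// check has a one-line body, but that line contains an index
// expression (with a bounds check) and a short-circuit &&, so
// control flow can take several different paths through it.
func check(s []int, i int) bool {
	return s[i] > 10 && isSmallPrime(s[i])
}

func main() {
	s := []int{4, 11, 15}
	fmt.Println(check(s, 0), check(s, 1), check(s, 2))
}
```

A line-level profile gives one weight for the whole `return` line, so it cannot say how often the `isSmallPrime` call was actually reached.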
There are two obvious choices for adding finer-grained position information to profiles: column numbers or "discriminators".
Column numbers have the advantage of being intuitive and having a stable meaning, but have the downside that some constructs that we'd like to differentiate may still share a column number. For example, I believe that the bounds check comparison, success, and failure (panic) cases all share the same column number.
Discriminators have the advantage of flexibility. The compiler can assign them arbitrarily to every construct we care about. The main downside is potential instability. If two compiler versions generate different discriminator values, then profiles aren't fully compatible across the upgrade.
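The instability concern can be shown with a toy scheme. This is purely an assumption for illustration, not how any real compiler assigns discriminators: `assign` numbers the constructs on each line in visit order, and the `construct` type and its names are hypothetical. If a newer compiler version visits one extra construct, later discriminators on that line shift.

```go
package main

import "fmt"

// construct is a hypothetical compiler construct on a source line.
// Names are assumed to be unique within one compilation.
type construct struct {
	Line int
	Name string
}

// assign gives each construct a per-line discriminator in visit
// order, starting at 0. This is an illustrative scheme only.
func assign(cs []construct) map[string]int {
	next := make(map[int]int) // next free discriminator per line
	out := make(map[string]int)
	for _, c := range cs {
		out[c.Name] = next[c.Line]
		next[c.Line]++
	}
	return out
}

func main() {
	// "Version 1" of the hypothetical compiler sees two constructs.
	v1 := assign([]construct{{10, "call A"}, {10, "call B"}})
	// "Version 2" also tracks a bounds check between them.
	v2 := assign([]construct{{10, "call A"}, {10, "bounds check"}, {10, "call B"}})
	fmt.Println(v1["call B"], v2["call B"])
}
```

In this sketch, a profile collected with version 1 records "call B" under discriminator 1, which version 2 would read as the bounds check.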
The pprof format itself does not support either column numbers or discriminators, so they will need to be added somehow.
cc @cherrymui @aclements @rajbarik @jinlin-bayarea