Recently I evaluated Profile-Guided Optimization (PGO) on multiple projects. The results are here. E.g. PGO results for LLVM-related tooling are here. According to the tests, PGO usually helps with compilers and compiler-like workloads (as shown for Clang in its documentation). That's why I think optimizing the DPC compiler with PGO could be a good idea.
I can suggest the following action points:
- Perform PGO benchmarks on the DPC compiler. If they show improvements, add a note about the possible gains in DPC's compiler performance with PGO.
- Provide an easier way (e.g. a build option) to build with PGO in the build scripts. This can be helpful for end-users and maintainers, since they will be able to optimize the DPC tooling for their own workloads.
- Optimize the published pre-built binaries with PGO.
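For reference, the usual instrumentation-based PGO cycle with a Clang-style driver has three stages: build an instrumented binary, run it on a representative workload, then rebuild with the merged profile. Below is a minimal sketch that only assembles the command lines for those stages. The source file, binary name, and profile directory are hypothetical placeholders, and I have not checked how this maps onto the DPC build scripts.

```python
from typing import Dict, List


def pgo_commands(source: str, binary: str,
                 profile_dir: str = "pgo-profiles") -> Dict[str, List[str]]:
    """Return command lines for a three-stage Clang PGO cycle.

    All file and directory names are illustrative assumptions.
    """
    merged = f"{profile_dir}/merged.profdata"
    return {
        # 1. Build an instrumented binary; running it writes raw
        #    profiles (*.profraw) into profile_dir.
        "instrument": ["clang++", "-O2", f"-fprofile-generate={profile_dir}",
                       source, "-o", f"{binary}-instrumented"],
        # 2. After running the instrumented binary on a training
        #    workload, merge the raw profiles into one indexed profile.
        "merge": ["llvm-profdata", "merge", f"-output={merged}", profile_dir],
        # 3. Rebuild the final binary using the merged profile.
        "optimize": ["clang++", "-O2", f"-fprofile-use={merged}",
                     source, "-o", binary],
    }


if __name__ == "__main__":
    for stage, cmd in pgo_commands("main.cpp", "app").items():
        print(f"{stage}: {' '.join(cmd)}")
```

The training run between stages 1 and 2 is the important part: the profile is only as good as the workload used to collect it, which is why a build option that lets users plug in their own workload would be useful.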
Testing Post-Link Optimization techniques (like LLVM BOLT) could be interesting too, since compilers like Clang and Rustc already use BOLT in addition to PGO (Clang even already supports the required build configurations).
If you already optimize the DPC binaries with PGO and/or BOLT, could you please share your performance optimization numbers (before and after applying PGO/BOLT)?
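If no numbers exist yet, a simple before/after comparison could look like the sketch below. It assumes two builds of the compiler are available and that wall-clock time of a fixed compile command is an acceptable metric. The binary names and input file in the usage note are hypothetical.

```python
import statistics
import subprocess
import time
from typing import List, Sequence


def time_command(cmd: Sequence[str], runs: int = 5) -> List[float]:
    """Run cmd `runs` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def speedup_percent(baseline: Sequence[float],
                    optimized: Sequence[float]) -> float:
    """Percentage reduction of the median time relative to the baseline."""
    base = statistics.median(baseline)
    opt = statistics.median(optimized)
    return (base - opt) / base * 100.0
```

Usage would be something like `speedup_percent(time_command(["./compiler-release", "input.cpp"]), time_command(["./compiler-pgo", "input.cpp"]))`, where both binary names are placeholders. Using the median keeps a single noisy run from skewing the result.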
Thanks in advance!