Open asl opened 1 day ago
@vlstill @fruffy @ChrisDodd
So, I tried to create a PCH for the IR headers (excluding ir-inline.h). The timings do look promising.
For midend/removeComplexExpressions.cpp:
W/o PCH:
Executed in 4.12 secs fish external
usr time 3.91 secs 0.16 millis 3.91 secs
sys time 0.17 secs 8.69 millis 0.16 secs
With PCH:
Executed in 2.45 secs fish external
usr time 3.06 secs 0.11 millis 3.06 secs
sys time 0.85 secs 2.16 millis 0.85 secs
And the differences for ninja frontend are (compilation is done in 10 threads):
Executed in 91.60 secs fish external
usr time 550.26 secs 0.06 millis 550.26 secs
sys time 20.98 secs 2.26 millis 20.97 secs
vs
Executed in 79.62 secs fish external
usr time 371.67 secs 0.07 millis 371.67 secs
sys time 17.33 secs 1.44 millis 17.33 secs
Here is the full compilation after a .def change (w/o Tofino, of course):
w/o PCH:
Executed in 317.24 secs fish external
usr time 39.13 mins 0.10 millis 39.13 mins
sys time 1.66 mins 3.04 millis 1.66 mins
with PCH:
Executed in 247.53 secs fish external
usr time 28.21 mins 0.08 millis 28.21 mins
sys time 1.50 mins 1.76 millis 1.50 mins
So, the wall-time difference is "just" about 70 seconds (317.24 secs vs 247.53 secs), but this is due to the 10 cores used for the build. Note the roughly 28% reduction in usr time (39.13 mins vs 28.21 mins), so on less beefy systems / VMs, the decrease in build time should be more pronounced.
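The reductions can be recomputed directly from the timings quoted above. The following is only a small sketch: the numbers are copied from the fish `time` output in this issue, while the names `runs`, `reduction`, and `summarize` are made up for illustration and are not part of P4C or its build scripts.

```python
# Timings copied from the `time` output quoted in this issue.
# Each entry: (wall seconds, usr CPU seconds) without PCH, then with PCH.
# The dictionary and function names are hypothetical, for illustration only.
runs = {
    "removeComplexExpressions.cpp": ((4.12, 3.91), (2.45, 3.06)),
    "ninja frontend": ((91.60, 550.26), (79.62, 371.67)),
    # usr time for the full build was reported in minutes; convert to seconds.
    "full build": ((317.24, 39.13 * 60), (247.53, 28.21 * 60)),
}


def reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


def summarize(runs):
    """Compute wall-time savings and relative reductions for each run."""
    rows = {}
    for name, ((wall0, usr0), (wall1, usr1)) in runs.items():
        rows[name] = {
            "wall_saved_s": wall0 - wall1,
            "wall_reduction_pct": reduction(wall0, wall1),
            "usr_reduction_pct": reduction(usr0, usr1),
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize(runs).items():
        print(
            f"{name}: wall time down {row['wall_saved_s']:.2f} s "
            f"({row['wall_reduction_pct']:.1f}%), "
            f"usr time down {row['usr_reduction_pct']:.1f}%"
        )
```

For the full build this works out to roughly 70 seconds of wall time saved and a drop of about 28% in usr time. The usr figure is the one that matters most on machines with fewer cores.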
This is with clang on macOS, YMMV, need to check gcc on Linux :)
One important thing: if one switches between branches frequently and relies on ccache, then the overall compile time might increase, as PCH effectively forces recompilation in such a case.
In case the compile-time impact is high, how does this change compare to enabling LTO in the build? What I remember from Tofino days, we had rather significant speedup by enabling LTO, presumably exactly because it allows inlining more functions. The advantage is that LTO can be enabled only for release builds so normal development build speeds are not affected.
On a slight tangent, there was an idea of speeding up compilation by using PCH (precompiled headers in GCC/clang) for ir-generated.h. As far as I know, it was never tried and it was not completely clear whether it was doable, but maybe it is worth investigating for the compilation speed (I don't think C++ modules will "save" us in any reasonably close future, given their current state and, more importantly, the state of tool support and the requirements for the P4C build).
Originally posted by @vlstill in https://github.com/p4lang/p4c/issues/5030#issuecomment-2496961071