russellallen / self

Making the world safe for objects
http://selflanguage.org

precompiled header rule misses build type flags #141

Open nbuwe opened 1 year ago

nbuwe commented 1 year ago

add_pch_rule in vm/cmake/functions.cmake doesn't add the build-type-specific flags to the rule that produces the precompiled header. As a result, when the precompiled header is used later on, the compiler complains that the flags used to create the header differ from the flags it is being used with, and that the precompiled header will be ignored.

The missing flags are the ones from COMPILE_DEFINITIONS_RELEASE and friends, set in vm/cmake/configurations.cmake (like GENERATE_DEBUGGING_AIDS, etc.).

I don't have the cmake-fu to fix this. I figure it needs gather_flags, but my naive attempts fail with "get_property could not find TARGET .../self/vm. Perhaps it has not yet been created."

nbuwe commented 1 year ago

As a side note, a precompiled header that is force-included with -include on the command line makes #pragma interface in the source files invalid (hence the ifdef around them). This causes all inline functions from the headers to be emitted in every object file. For RelWithDebInfo (i.e. optimized) this gives a Self binary of about 4MB, but the object files total 83MB of code, i.e. about 95% is duplicates.

If I don't fix up the precompiled header rule manually, so that the precompiled header is not used, the slowdown is not too bad: about 4% user time (1% wall time in my -j4 build).

It would be interesting to test with proper include hygiene, but that requires a lot of menial changes, so it's not a quick test to try.

nbuwe commented 1 year ago

I have tried compiling with #pragma implementation in effect, and it reduces object file sizes and compilation time quite a bit. On Linux with -j4, a RelWithDebInfo build with the manually fixed pch rule goes:

$ tail -3 build-reldbg/.log
real    20m35.995s
user    72m48.945s
sys     2m45.777s

$ size build-reldbg/vm/Self
   text    data     bss     dec     hex filename
4287395   38344  151148 4476887  444fd7 build-reldbg/vm/Self

$ ll build-reldbg/vm/Self
-rwxrwxr-x 1 uwe uwe 373488596 Aug 12 01:54 build-reldbg/vm/Self*

$ find build-reldbg/vm/CMakeFiles/Self.dir -name '*.o' | xargs size -t | tail -1
87429670      36053  150638 87616361    538eb69 (TOTALS)

vs.

$ tail -3 build-reldbg-pragma/.log
real    6m18.492s
user    21m9.525s
sys     1m5.130s

$ size build-reldbg-pragma/vm/Self
   text    data     bss     dec     hex filename
4397634   38344  151148 4587126  45fe76 build-reldbg-pragma/vm/Self

$ ll build-reldbg-pragma/vm/Self
-rwxrwxr-x 1 uwe uwe 91423604 Aug 12 23:55 build-reldbg-pragma/vm/Self*

$ find build-reldbg-pragma/vm/CMakeFiles/Self.dir -name '*.o' | xargs size -t | tail -1
21322576      36053  150638 21509267    1483493 (TOTALS)

To do this test I added #include "_precompiled.hh" to the cpp files and dropped the -include for it from the command line. This might be easier to do by including _precompiled.hh in the generated _<file>.cpp.incl files, but it would have taken me longer to figure out how to do that.

I haven't yet checked where the 110K increase in text size comes from. But otherwise this looks like something worth pursuing.
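
For reference, the figures quoted above can be cross-checked with a little arithmetic. The following is only a sketch in Python: the variable names and the `seconds` helper are made up for illustration, and the numbers are copied from the `time` and `size` output above. "Duplicate share" here simply means the fraction of object-file text that does not end up in the linked binary, which is an approximation of the duplication discussed earlier.

```python
# Figures copied from the two build logs above; all names are illustrative.
text_binary_base = 4_287_395      # text size of Self, pch via -include
text_binary_pragma = 4_397_634    # text size of Self, #pragma implementation
text_objects_base = 87_429_670    # total text of all .o files, pch via -include
text_objects_pragma = 21_322_576  # total text of all .o files, #pragma implementation


def seconds(minutes, secs):
    """Convert a `time`-style XmY.YYYs reading to seconds."""
    return minutes * 60 + secs


real_base, real_pragma = seconds(20, 35.995), seconds(6, 18.492)
user_base, user_pragma = seconds(72, 48.945), seconds(21, 9.525)

# Fraction of object-file text that does not end up in the linked binary.
dup_base = 1 - text_binary_base / text_objects_base
dup_pragma = 1 - text_binary_pragma / text_objects_pragma

print(f"duplicate share, -include build: {dup_base:.1%}")
print(f"duplicate share, pragma build:   {dup_pragma:.1%}")
print(f"object text shrinks by:          {text_objects_base / text_objects_pragma:.1f}x")
print(f"wall time speedup:               {real_base / real_pragma:.1f}x")
print(f"user time speedup:               {user_base / user_pragma:.1f}x")
print(f"binary text growth:              {text_binary_pragma - text_binary_base} bytes")
```

The last line gives the exact size of the "110K" increase mentioned above; the first line corresponds to the "about 95%" estimate from the earlier comment.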

nbuwe commented 1 year ago

Ah, the build sets -fkeep-inline-functions for gcc. And I used to wonder why clang was so lightning fast... Dropping that flag brings the build time down to under two minutes of wall time:

real    1m49.889s
user    5m48.046s
sys     0m34.977s

and object files are now tiny.
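
To put the three timings in this thread side by side, here is a small sketch in Python. The labels and the `seconds` helper are illustrative, and the times are copied from the `time` output quoted in the comments; it only shows how much each build costs relative to the fastest one.

```python
def seconds(minutes, secs):
    """Convert a `time`-style XmY.YYYs reading to seconds."""
    return minutes * 60 + secs


# (real, user) times of the three RelWithDebInfo builds reported in this thread.
builds = {
    "pch via -include": (seconds(20, 35.995), seconds(72, 48.945)),
    "#pragma implementation": (seconds(6, 18.492), seconds(21, 9.525)),
    "-fkeep-inline-functions dropped": (seconds(1, 49.889), seconds(5, 48.046)),
}

fastest_real, fastest_user = builds["-fkeep-inline-functions dropped"]

# Relative cost of each build compared with the fastest one.
ratios = {
    label: (real / fastest_real, user / fastest_user)
    for label, (real, user) in builds.items()
}

for label, (real_ratio, user_ratio) in ratios.items():
    print(f"{label:32s} real {real_ratio:5.1f}x   user {user_ratio:5.1f}x")
```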