premake / premake-core

Premake
https://premake.github.io/
BSD 3-Clause "New" or "Revised" License

Link Time Code Generation #1183

Closed redorav closed 5 years ago

redorav commented 6 years ago

There is a flags option called "linktimeoptimization" that I use for release builds in my project. Looking at premake's code, it seems to map well to all platforms. However, in Visual Studio it doesn't turn on the Link Time Code Generation (/LTCG) option but instead leaves it blank, which defaults to /LTCG:incremental. Unfortunately this is not the highest level of LTCG possible, and it leaves out even the most trivial virtual function inlining. Would it make sense for this option to set that as well?

My other question is whether "linktimeoptimization" should be pulled out of flags and into its own api option.
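For context, the setup described above can be sketched as a minimal premake5.lua. The workspace name, project name, and file patterns are hypothetical; only the `LinkTimeOptimization` flag comes from the discussion.

```lua
-- premake5.lua: minimal sketch; "Example", "App" and the file patterns are
-- placeholder names, not taken from the issue.
workspace "Example"
    configurations { "Debug", "Release" }

project "App"
    kind "ConsoleApp"
    language "C++"
    files { "src/**.cpp", "src/**.h" }

    filter "configurations:Debug"
        symbols "On"

    filter "configurations:Release"
        optimize "Speed"
        -- The flag discussed in this issue, enabled for release builds only.
        flags { "LinkTimeOptimization" }

    -- Reset the filter so later settings apply to every configuration.
    filter {}
```

Running `premake5 vs2015` or `premake5 gmake` on such a script generates the project files for the respective toolchain.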

catb0t commented 5 years ago

LTO and LTCG are very different things. In GCC / Clang / most compilers, a single compiler option is enough to get effective LTO.

LTCG (since VS 2015) requires the generation of optimisation profiles and call graphs as extra files, and requires multiple switches to enable.

I don't know about the VS dropdowns, but INCREMENTAL is the highest available argument to /LTCG in VS 2015 and later -- you need to use /USEPROFILE and /GENPROFILE to get the old effects.

Even before VS 2015, choosing the right argument to /LTCG becomes very complicated, and it doesn't seem like something a flags argument should handle automatically. I think you should use buildoptions with a filter and the right /GENPROFILE / /USEPROFILE flags (which will need two compilation passes, I believe).

redorav commented 5 years ago

I see, thank you so much for taking the time to explain the difference between LTO and LTCG, I was not aware they were so different.

The options I have in VS2015 and VS2017, however, do display LTCG:incremental and LTCG as different things, and checking the disassembly shows a significant difference: LTCG on its own is able to inline code better than incremental, and I don't seem to have to do anything other than remove "incremental" from the option. These are my options in VS2015, but 2017 is no different. For more information, I'm using 2015 Update 3.

image

If, like you say, GCC and Clang actually enable LTO and LTCG as part of the same option, would it not make sense to enable them in Visual Studio as well?
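One way to try the plain /LTCG behaviour from a Premake script, without changing Premake itself, is to pass the switch through `linkoptions` for Visual Studio actions only. This is a sketch under assumptions: the names are hypothetical, and how the extra switch interacts with the linker setting Visual Studio picks by default has not been verified here.

```lua
-- premake5.lua: sketch only; "Example" and "App" are placeholder names.
workspace "Example"
    configurations { "Debug", "Release" }

project "App"
    kind "ConsoleApp"
    language "C++"
    files { "src/**.cpp" }

    filter "configurations:Release"
        optimize "Speed"
        flags { "LinkTimeOptimization" }

    -- Only for Visual Studio actions: pass /LTCG to the linker explicitly.
    -- Assumption: this ends up in the linker's additional options.
    filter { "action:vs*", "configurations:Release" }
        linkoptions { "/LTCG" }

    filter {}
```

The `action:vs*` filter keeps the MSVC-specific switch away from GCC and Clang builds, which rely on the flag alone.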

catb0t commented 5 years ago

I left that comment partially unclear, sorry -- GCC and Clang accept -flto to enable LTO which is always an optimisation or has no effect; VS accepts a similar single argument for it.

GCC / Clang support LTCG but require generation / use of call graphs and optimisation profiles, usually in two separate steps, as VS needs.

I'm referring to the Microsoft docs; the PG* arguments are deprecated and seem to have no effect.

STATUS and NOSTATUS are just about displaying a progress bar and of course OFF isn't useful so (profiles aside) we're left with /LTCG:INCREMENTAL, which might be worth a flags argument.

To use the replacement for the PGO arguments, you first need to compile with /GENPROFILE to create a .pgd database (but no object files), and then recompile with /USEPROFILE, providing the .pgd file.

In Premake you can't do this inside one project; you need one project for generating the profile and one to actually compile using the .pgd. Thus, trying to put the higher-than-incremental PGO steps into flags isn't quite possible.
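The two-project arrangement described above might look roughly like the following. Everything here is an illustrative assumption: the project names, the `common` helper, and the `app.pgd` file name are hypothetical, and the sketch passes /GENPROFILE and /USEPROFILE through `linkoptions` because Microsoft documents them as linker switches that take an optional `PGD=` argument. The instrumented program would also need to be run between the two builds so that profile data exists.

```lua
-- premake5.lua: sketch of a two-step PGO layout; all names are hypothetical.
workspace "PgoExample"
    configurations { "Release" }

-- Hypothetical helper: settings shared by both projects.
local function common()
    kind "ConsoleApp"
    language "C++"
    files { "src/**.cpp" }
    optimize "Speed"
    flags { "LinkTimeOptimization" }
end

-- Step 1: instrumented build that creates the profile database.
project "AppInstrumented"
    common()
    filter "action:vs*"
        linkoptions { "/LTCG", "/GENPROFILE:PGD=app.pgd" }
    filter {}

-- Step 2: optimized build that consumes the profile database.
project "AppOptimized"
    common()
    filter "action:vs*"
        linkoptions { "/LTCG", "/USEPROFILE:PGD=app.pgd" }
    filter {}
```

Keeping the two steps in separate projects mirrors the constraint mentioned above; sharing settings through one Lua function avoids duplicating the common configuration.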

redorav commented 5 years ago

If I understand what you're saying and based on the documentation this is the way Visual Studio 2015+ works:

/LTCG:PGx -> No longer useful in VS2015+, and also related to PGO, which is not what I'm looking for

/LTCG:incremental -> Effectively does LTCG on objects incrementally as they change by edits

/LTCG -> Does whole program optimization and seems to be incompatible with incremental linking

I'm not sure how exactly LTCG and LTCG:incremental differ, other than that I can see the disassembly being different, or what their equivalents are in Clang/GCC. This came about while I was looking at ways to make the compiler inline virtual functions that live in a different translation unit when it knows the type of object it's dealing with. LTCG effectively does what I want and is able to inline, whereas /LTCG:incremental cannot. With this bit of context, do you know whether -flto has this behavior as well? I'm expecting the same kind of optimizations, but perhaps I don't understand enough to say. To be clear, I'm not looking for PGO code generation, which seems to be related to all this and confused me at first.

Edit: It seems there is also an option in VS at the project level (not in the linker tab) to set Whole Program Optimization to use LTCG, and then it doesn't matter what I set in the Linker tab: whether incremental or not, it does the optimizations. I think what premake is doing via the linktimeoptimization option in flags is therefore correct.

catb0t commented 5 years ago

Sorry - I assumed because you wanted the higher levels of LTCG you wanted PGO!

Given your edit, I don't think I can help you any more, but I was thinking that if you want PGO, or want Premake to support LTCG (in VS or in all compilers), it would work well as a Premake module. Maybe I'll work on one...

redorav commented 5 years ago

Yes, thank you very much for taking the time. I think that premake is indeed doing what's expected and I was merely confused so I'll close this down. I've never done PGO so I'll leave the module to the experts ;)