Open Quuxplusone opened 12 years ago
Attached source1.cpp
(458 bytes, application/octet-stream): testcase
Attached test.html
(18330 bytes, text/html): codegen
Distinct functions are required to have distinct addresses. I'm not surprised that MSVC breaks that rule, though.
(In reply to comment #2)
> Distinct functions are required to have distinct addresses.
Why is that relevant? The given testcase doesn't take the address of either
function. In this example, it is legal to change the linkage of the given
functions from "linkonce_odr" to "internal" during LTO.
Sure. It's a legal optimization in this instance. It is probably best left as a link-time optimization, though.
We have a -mergefunc pass, but it is not on by default.
In this testcase it produces:
define linkonce_odr i32 @_Z13test_functionI12IntegralTypeEiT_f(i32, float)
nounwind uwtable readnone optsize noinline {
%3 = tail call i32 @_Z13test_functionIiEiT_f(i32 %0, float %1)
ret i32 %3
}
Which is probably optimal since we don't know if another TU depends on the
function addresses or not.
(In reply to comment #4)
> Sure. It's a legal optimization in this instance. It is probably best left
as
> a link-time optimization, though.
Of course. That's what MSVC's linker does when you specify /OPT:ICF. In my
limited knowledge I imagine "Identical COMDAT Folding" to be a rather simple
task, simply sort all objects by size and then do a memcmp on those that are
equal in size...
(In reply to comment #5)
> We have a -mergefunc pass, but it is not on by default.
Can it be turned on from the command line (with -mllvm or something)?
> In this testcase it produces:
>
> define linkonce_odr i32 @_Z13test_functionI12IntegralTypeEiT_f(i32, float)
> nounwind uwtable readnone optsize noinline {
> %3 = tail call i32 @_Z13test_functionIiEiT_f(i32 %0, float %1)
> ret i32 %3
> }
>
> Which is probably optimal since we don't know if another TU depends on the
> function addresses or not.
Considering that -flto was specified, shouldn't it all be "just one big happy
TU"?
Also, would -mergefunc help with other similar cases? For example, Clang also doesn't merge virtual function implementations that are exactly the same (e.g. why you have a bunch of completely empty or { return E_NOTIMPL; } like functions)...
There are probably other cases where it helps; the reason we don't use it by default is that nobody has really had the time to spend to make it production-quality.
bump/any news/developments on this?
...IMO it is rather sad to have such a deficiency (creating bloated templated
code) "in this day and age" "with all this talk of modern C++ support"...
source1.cpp
(458 bytes, application/octet-stream)test.html
(18330 bytes, text/html)