Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Binary identical template instantiations are not merged #11795

Open Quuxplusone opened 12 years ago

Quuxplusone commented 12 years ago
Bugzilla Link PR11633
Status NEW
Importance P enhancement
Reported by Domagoj Šarić (dsaritz@gmail.com)
Reported on 2011-12-21 02:35:38 -0800
Last modified on 2013-09-25 06:57:42 -0700
Version trunk
Hardware Macintosh MacOS X
CC anton@korobeynikov.info, bob.wilson@apple.com, dsaritz@gmail.com, efriedma@quicinc.com, llvm-bugs@lists.llvm.org, nicholas@mxc.ca, rafael@espindo.la, rjmccall@apple.com
Fixed by commit(s)
Attachments source1.cpp (458 bytes, application/octet-stream)
test.html (18330 bytes, text/html)
Blocks
Blocked by
See also
Created attachment 7782
testcase

Function templates instantiated for different integral types that produce
identical code are not merged into a single function.
In the attached example this is demonstrated for int and an enum type.
Codegen produced with clang -Os -std=gnu++0x -flto -m32 -lstdc++ -ffast-math.

Sometimes clang even produces very slightly different code for example for
unsigned int and enum instantiations of certain functions for which MSVC
generates identical code and merges the functions.
Quuxplusone commented 12 years ago

Attached source1.cpp (458 bytes, application/octet-stream): testcase

Quuxplusone commented 12 years ago

Attached test.html (18330 bytes, text/html): codegen

Quuxplusone commented 12 years ago

Distinct functions are required to have distinct addresses. I'm not surprised that MSVC breaks that rule, though.

Quuxplusone commented 12 years ago
(In reply to comment #2)
> Distinct functions are required to have distinct addresses.

Why is that relevant?  The given testcase doesn't take the address of either
function.  In this example, it is legal to change the linkage of the given
functions from "linkonce_odr" to "internal" during LTO.
Quuxplusone commented 12 years ago

Sure. It's a legal optimization in this instance. It is probably best left as a link-time optimization, though.

Quuxplusone commented 12 years ago
We have a -mergefunc pass, but it is not on by default.

In this testcase it produces:

define linkonce_odr i32 @_Z13test_functionI12IntegralTypeEiT_f(i32, float)
nounwind uwtable readnone optsize noinline {
  %3 = tail call i32 @_Z13test_functionIiEiT_f(i32 %0, float %1)
  ret i32 %3
}

Which is probably optimal since we don't know if another TU depends on the
function addresses or not.
Quuxplusone commented 12 years ago
(In reply to comment #4)
> Sure.  It's a legal optimization in this instance.  It is probably best left
as
> a link-time optimization, though.

Of course. That's what MSVC's linker does when you specify /OPT:ICF. In my
limited knowledge I imagine "Identical COMDAT Folding" to be a rather simple
task, simply sort all objects by size and then do a memcmp on those that are
equal in size...
Quuxplusone commented 12 years ago
(In reply to comment #5)
> We have a -mergefunc pass, but it is not on by default.

Can it be turned on from the command line (with -mllvm or something)?

> In this testcase it produces:
>
> define linkonce_odr i32 @_Z13test_functionI12IntegralTypeEiT_f(i32, float)
> nounwind uwtable readnone optsize noinline {
>   %3 = tail call i32 @_Z13test_functionIiEiT_f(i32 %0, float %1)
>   ret i32 %3
> }
>
> Which is probably optimal since we don't know if another TU depends on the
> function addresses or not.

Considering that -flto was specified, shouldn't it all be "just one big happy
TU"?
Quuxplusone commented 12 years ago

Also, would -mergefunc help with other similar cases? For example, Clang also doesn't merge virtual function implementations that are exactly the same (e.g. why you have a bunch of completely empty or { return E_NOTIMPL; } like functions)...

Quuxplusone commented 12 years ago

There are probably other cases where it helps; the reason we don't use it by default is that nobody has really had the time to spend to make it production-quality.

Quuxplusone commented 11 years ago
bump/any news/developments on this?
...IMO it is rather sad to have such a deficiency (creating bloated templated
code) "in this day and age" "with all this talk of modern C++ support"...