Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

ModuleLinker "chooses" the wrong DISubprogram when linkonce_odr is replaced by weak_odr #21907

Open Quuxplusone opened 9 years ago

Quuxplusone commented 9 years ago
Bugzilla Link PR21910
Status NEW
Importance P normal
Reported by Duncan (dexonsmith@apple.com)
Reported on 2014-12-13 13:18:14 -0800
Last modified on 2015-04-28 12:33:58 -0700
Version trunk
Hardware PC All
CC aprantl@apple.com, dblaikie@gmail.com, dexonsmith@apple.com, echristo@gmail.com, friss@apple.com, jsweval@arxan.com, llvm-bugs@lists.llvm.org, rafael@espindo.la
Fixed by commit(s)
Attachments repro.tar.gz (595 bytes, application/x-gzip)
Blocks
Blocked by
See also

Created attachment 13537 repro.tar.gz: demonstrate mismatched "first subprogram" vs. "chosen subprogram"

When a linkonce_odr function gets replaced by a weak_odr function, the new function body's !dbg references point at a potentially new DW_TAG_subprogram in a new DW_TAG_compile_unit.

Some algorithms assume the first DW_TAG_subprogram that points at an llvm::Function* is the canonical one, where the first is obtained by walking compile units in order, and within those, walking their subprograms. (Maybe it is?)

Although DW_TAG_subprograms usually de-dup, you can trivially create a discrepancy by having the weak_odr come from a compile_unit that was compiled in a different directory than the linkonce_odr, and linking in the weak_odr version second.

I think (?) the right result would be for the destination module to only contain one copy of the subprogram, and for it to be the one from the compile unit that had weak_odr linkage. It's not clear whether that's practical, though.

Attaching repro.tar.gz, which demonstrates how to create the discrepancy:

$ tar xzf repro.tar.gz $ cd repro/ $ for f in .h /*.cpp; do echo "// $f"; cat $f; done // t.h template struct Class { int foo() { return 0; } }; // d1/t1.cpp

include "t.h"

int foo() { return Class().foo(); } // d2/t2.cpp

include "t.h"

template struct Class; $ make cd d1 && clang -I.. -c -emit-llvm -gline-tables-only -o t1.bc t1.cpp cd d2 && clang -I.. -c -emit-llvm -gline-tables-only -o t2.bc t2.cpp llvm-link -S -o linked.ll d1/t1.bc d2/t2.bc llvm-link -o linked.bc d1/t1.bc d2/t2.bc $ grep -e '!dbg !20' linked.ll -B4 define weak_odr i32 @_ZN5ClassIiE3fooEv(%struct.Class %this) #1 align 2 { %1 = alloca %struct.Class, align 8 store %struct.Class %this, %struct.Class %1, align 8 %2 = load %struct.Class %1 ret i32 0, !dbg !20 $ grep -e '^!20 ' linked.ll !20 = metadata !{i32 2, i32 0, metadata !13, null} $ grep -e '@_ZN5ClassIiE3fooEv, ' linked.ll !7 = metadata !{i32 786478, metadata !8, metadata !9, metadata !"foo", metadata !"foo", metadata !"", i32 2, metadata !6, i1 false, i1 true, i32 0, i32 0, null, i32 256, i1 false, i32 (%struct.Class) @_ZN5ClassIiE3fooEv, null, null, metadata !2, i32 2} !13 = metadata !{i32 786478, metadata !14, metadata !15, metadata !"foo", metadata !"foo", metadata !"", i32 2, metadata !6, i1 false, i1 true, i32 0, i32 0, null, i32 256, i1 false, i32 (%struct.Class)* @_ZN5ClassIiE3fooEv, null, null, metadata !2, i32 2}

Quuxplusone commented 9 years ago

Attached repro.tar.gz (595 bytes, application/x-gzip): repro.tar.gz: demonstrate mismatched "first subprogram" vs. "chosen subprogram"

Quuxplusone commented 9 years ago

I just realized I never described why this discrepancy matters.

If -argpromotion (or another pass) decides to replace this function, it will update the "first" subprogram, and delete the old functions, so the real subprogram points at nullptr. The !dbg attachements will now point at a subprogram that does not point back at the llvm::Function.

This causes the subprograms to fail to DISubprogram::Verify(), and eventually hits assertions in LexicalScopes::initialize().

Quuxplusone commented 9 years ago

So there are a few options that leap out at me.

"Although DW_TAG_subprograms usually de-dup, you can trivially create a discrepancy by having the weak_odr come from a compile_unit that was compiled in a different directory than the linkonce_odr, and linking in the weak_odr version second.

I think (?) the right result would be for the destination module to only contain one copy of the subprogram, and for it to be the one from the compile unit that had weak_odr linkage. It's not clear whether that's practical, though."

I think that's probably worth trying - as that change would (substantially, I think) reduce the number of unique DebugLocs in a program anyway. Canonicalizing paths in the frontend. If we eventually reach a point where we can

Quuxplusone commented 9 years ago

(In reply to comment #2)

So there are a few options that leap out at me.

  • The first, you alluded to/described here:

"Although DW_TAG_subprograms usually de-dup, you can trivially create a discrepancy by having the weak_odr come from a compile_unit that was compiled in a different directory than the linkonce_odr, and linking in the weak_odr version second.

I think (?) the right result would be for the destination module to only contain one copy of the subprogram, and for it to be the one from the compile unit that had weak_odr linkage. It's not clear whether that's practical, though."

I think that's probably worth trying - as that change would (substantially, I think) reduce the number of unique DebugLocs in a program anyway. Canonicalizing paths in the frontend. If we eventually reach a point where we can

Bah - fat fingered tab... moving on.

If we eventually reach a point where we can't canonicalize some things (well, we need to support the possibility the two definitions are lexically identical, but were written in separate files - that's valid code, so we can't actually /fail/ on it), look at other options as well.

Other options/ideas?

Quuxplusone commented 9 years ago

(In reply to comment #3)

I think that's probably worth trying - as that change would (substantially, I think) reduce the number of unique DebugLocs in a program anyway. Canonicalizing paths in the frontend. If we eventually reach a point where we can

Bah - fat fingered tab... moving on.

If we eventually reach a point where we can't canonicalize some things (well, we need to support the possibility the two definitions are lexically identical, but were written in separate files - that's valid code, so we can't actually /fail/ on it), look at other options as well.

  • Forcibly canonicalize even when they're not metadata-identical - when replacing the linkonce_odr by weak_odr, go through and drop all the DISubprograms pointing to the linkonce_odr function being replaced.

  • Treat no definition as canonical - change the SubprogramMaps to map to a list of DISubprograms and update them all. Simple, works, not necessarily ideal because it's more work/redundancy in the representation.

Other options/ideas?

The obvious fix is OldSP.replaceAllUsesWith(NewSP). Too bad I just removed it that capability :/. (Really what we want is to remap all the old metadata to use the new subprogram. This isn't feasible though.)

I've just posted a patch that should unstick the Darwin Release+Asserts bootstrap. It seems like the wrong long-term fix to me, but I think it's not too slow, and will get the assertions passing. Although it doesn't kill the old subprogram entirely, it does update the "first subprogram" to match the "chosen subprogram".

There's another solution that does the right thing for us (well, short-term right thing) very efficiently, but it has side effects. The -argpromotion and -deadargelim users of makeSubprogramMap() need it in order to call DISubprogram::replaceFunction(), but instead of replacing the function in one DISubprogram, they could call:

ValueAsMetadata::handleRAUW(OldF, NewF);

Metadata doesn't care about types, so we can actually RAUW here -- this will update all the subprograms to point at the new function, instead of just the first one (returned by makeSubprogramMap()). However, this also updates all other metadata that might be pointing at the old function to point at the new. Maybe that's fine? (If so, this is a better stop-gap than the patch I just posted.)

Quuxplusone commented 9 years ago

I have a two thoughts for longer term:

  1. makeSubprogramMap() is a gross hack. Why don't we just attach metadata to the function?

    define void @foo() !subprogram !7 {
        ret void
    }

    This is a simple canonical place to find the "true" subprogram for a function. Of course, it requires new IR, but metadata attachments to functions should be trivial to implement (and won't have nearly the overhead of attachments to instructions). A few things to figure out: can they be attached to declarations? Do we handle arbitrary attachments, or just start with !subprogram? Are they in the same namespace as instruction attachments, or a new one?

  2. Once subprograms have a custom class -- as opposed to some generic MDNode -- we can update ModuleLinker/MapValue to have special handling for it. The ModuleLinker should change the old canonical subprogram to "match" the new one (update directory, variable nodes, etc.), and then when we're pulling in the new metadata, the ValueMapper should redirect all references to the updated old one.

Quuxplusone commented 9 years ago

I've committed r224389 for now.