llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.75k stars 11.89k forks source link

Global is external, but doesn't have external or weak linkage when linking with ThinLTO #37394

Closed llvmbot closed 6 years ago

llvmbot commented 6 years ago
Bugzilla Link 38046
Resolution FIXED
Resolved on Jul 27, 2018 15:44
Version unspecified
OS Linux
Reporter LLVM Bugzilla Contributor

Extended Description

Error message: Global is external, but doesn't have external or weak linkage! [23 x i8]* @"??_C@_0BH@KGIBAJIK@Off?9heap?5backing?5store?$AA@" LLVM ERROR: Broken module found, compilation aborted!

How to Reproduce:

Configure Chromium with args.gn: clang_use_chrome_plugins = false is_clang = true is_component_build = false is_debug = false is_official_build = true use_lld = true use_thin_lto = true symbol_level = 2

Then try to build d8: ninja -C out/thinlto d8

This will result in the error message shown earlier.

With symbol_level = 0, the error does not occur.

llvmbot commented 6 years ago

Fixed in r338100.

llvmbot commented 6 years ago

Some more analysis here:

What is happening is that we have a global whose body we don't need anymore as far as the code is concerned, but that is still referenced by the debug information. When LTO/LTO.cpp invokes the IRMover, we tell it not to link that global. Metadata materialization will find a reference to the global, and we materialize a declaration for it. Later, we internalize that declaration and get into an invalid state, because declarations cannot be internal.

I tried a couple of different approaches, and eventually settled on letting IRMover create the declarations as it would in other cases (e.g. when moving a declaration from the source module), and then handling the case in the LTO code, which amounts to just skipping them when internalizing. This prevents internalize from emitting a broken module, whereas the code generated in the opt stage of LTO is the same with or without the change (so we're not generating worse code).

Patch at https://reviews.llvm.org/D49777

llvmbot commented 6 years ago

I think the way we get to this error message is:

  1. We determine that the GV is dead, which means we don't keep the definition.

  2. Because there is still a reference to the GV, we materialize a declaration.

  3. Internalize then changes the linkage of the declaration from external to internal.

  4. The verifier checks that declarations have external or weak external linkage - but this declaration doesn't, hence the error message.

Although the error message points out that the declaration has an invalid linkage type, I think the real problem here is that the code that determined that the GV is dead and the code that decided we need to copy the GV do not agree with each other (if it were really dead, we wouldn't need to copy it). We have two options:

  1. Keep the GV alive if there is a metadata reference to it.

  2. Remove metadata references to the GV.

llvmbot commented 6 years ago

I used creduce to reduce the relevant C++ code. I'm sure this can be reduced further, but posting the results before I head home for the day:

serializer.ii: class a { public: void b(int); }; namespace v8 { namespace internal { class c { public: c() {} }; class C { public: virtual ~C(); }; class DefaultSerializerAllocator { public: void d(); }; class e { public: char f; void g(char ) { h.b(f); } a h; }; template class Serializer { class ObjectSerializer; i allocator(); }; template class Serializer::ObjectSerializer : C { public: void Serialize(); Serializer *j; e k; }; class l { public: c d() {} }; template void Serializer::ObjectSerializer::Serialize() { c m; k.g("Off-heap backing store"); j->allocator()->d(); } template class Serializer; template class Serializer<>; } }

d8.ii: namespace v8 { template class Local {}; class UnboundScript; class ScriptCompiler { public: struct CachedData; static CachedData *CreateCodeCache(Local unbound_script); }; } int main() { v8::Local script; v8::ScriptCompiler::CreateCodeCache(script); }

llvmbot commented 6 years ago

My understanding is that the failure occurs when:

  1. There is a string literal that is passed to a call that is originally not inlined.

  2. (Probably necessary) There is a @​llvm.dbg.value referencing that same string literal elsewhere.

  3. Later, we decide to inline the remaining call that used the string literal, and, since the function does not use the string, we discard it.

  4. Although the definition of the string literal has now been discarded, the debug info still references it, and so we end up with a broken module.

I tried to write a small reproducer for this case, but haven't managed to get the error to occur yet.

llvmbot commented 6 years ago

I noticed that none of the string constants being passed to SnapshotByteSink::Put() in serializer.cc seem to have made it into serializer.obj.

It also turns out that the call that passes "Off-heap backing store" is the only call listed in the disassembly of serializer.obj. In other functions, e.g. VisitPointers(), the calls to Put() are inlined in serializer.obj.

Specifically, the call is in @"?SerializeBackingStore@ObjectSerializer@?$Serializer@VBuiltinSerializerAllocator@internal@v8@@@internal@v8@@AEAAHPEAXH@Z". There is a @​llvm.dbg.declare for the string constant in @"?SerializeBackingStore@ObjectSerializer@?$Serializer@VDefaultSerializerAllocator@internal@v8@@@internal@v8@@AEAAHPEAXH@Z". These are two different template specializations of the same function, one with a template parameter of BuiltinSerializerAllocator, the other with DefaultSerializerAllocator.

The specialization with DefaultSerializerAllocator (and the @​llvm.dbg.declare) survives into d8.exe.0.0.preopt.ll. The other specialization does not. None of the functions specialized on BuiltinSerializerAllocator do.

Following the metadata, there are other sites where calls to Put() are inlined. However, their metadata declarations look different. For example:

call void @​llvm.dbg.value(metadata !​32, metadata !​87010, metadata !DIExpression()) #​7, !dbg !​112315

Here, !​32 is !{}, !​87010 is describes the "description" parameter to Put(), and !​112315 is "Put() inlined at ".

By contrast, the declaration in @"?SerializeBackingStore@ObjectSerializer@?$Serializer@VDefaultSerializerAllocator@internal@v8@@@internal@v8@@AEAAHPEAXH@Z" looks like:

call void @​llvm.dbg.value(metadata i8 getelementptr inbounds ([23 x i8], [23 x i8] @"??_C@_0BH@KGIBAJIK@Off?9heap?5backing?5store?$AA@", i64 0, i64 0), metadata !​87010, metadata !DIExpression()) #​7, !dbg !​110850

Metadata !​87010 is as above ("description" parameter to Put()) and !​110850 is "Put() inlined at ".

The difference is that this metadata declaration actually references the string constant, whereas the others don't.

I'm not sure why the compiler emits the string literal in some cases and elides it in others. But it seems like something that the linker should be able to work with either way.

llvmbot commented 6 years ago

The function call in question is SnapshotByteSink::Put(byte b, const char *description), is defined in-line, and ignores the description parameter.

I suspect this is why we optimize out the value we use for that parameter. But when generating debug info, we still generate an @​llvm.dbg.value for it (in d8.exe.0.0.preopt.ll):

call void @​llvm.dbg.value(metadata i8 getelementptr inbounds ([23 x i8], [23 x i8] @"??_C@_0BH@KGIBAJIK@Off?9heap?5backing?5store.?$AA@", i64 0, i64 0), metadata !​118244, metadata !DIExpression()) #​10, !dbg !​123481

llvmbot commented 6 years ago

This is a string constant with value "Off-heap backing store\0".

It is defined in chromium/src/v8/src/snapshot/serializer.cc as a string literal which is passed into a function call.

d8.exe.0.0.preopt.ll shows it as @"??_C@_0BH@KGIBAJIK@Off?9heap?5backing?5store?$AA@" = external dso_local unnamed_addr constant [23 x i8], align 1