Closed 5a8c1bbf-6a68-4414-a97a-a8d6571c030e closed 5 years ago
All is well: GUID collision handled: https://github.com/llvm/llvm-project/commit/cc0b9647b76178bc3869bbfff80535ad86366472
-main-file-name sets both debug info name and module name: https://github.com/llvm/llvm-project/commit/57dbfe194cb53b25246a40e8a73e5c77cc804619
Ok, so this issue has kind of split into two;
1) GUID hash collision causing crash in linker when using thinlto. I agree that making sure each module file has a informative source filename will greatly decrease a hash collision from occurring but I don't agree that the linker should error our if one occurs. Hashes can always collide, it's unlikely but with the low 64bit of a MD5 as the hash it doesn't take that many symbols for it to possibly happen. As a developer I'd be rather surprised if the linker errored out telling me I had to rename a symbol because the hash of that symbol (and its source file) happens to collide with some other symbol.
2) clang outputs source_filename = "-"
I found out why I haven't seen any debug info problems, icecc gives -Xclang -main-file-name -Xclang foo/bar.c as argument. So the debug info contains
So, to make things better right now I had the idea to use -main-file-name for more than just setting debug info source name. https://reviews.llvm.org/D67592
Ah, it seems clang does differently depending on -x value.
If source is:
clang -x c - -S -emit-llvm outputs source_filename = "-"
while clang -x cpp-output - -S -emit-llvm outputs source_filename = "foo.c"
Err, I mean the command that generates the sources to send ends with: -E ../../third_party/ffmpeg/libavutil/camellia.c -frewrite-includes
So the sources include:
but clang still writes source_filename = "-"
You're right, when disabling icecc I get source_filename = "../../third_party/ffmpeg/libavutil/camellia.c" instead of source_filename = "-"
The icecc command executed on remote machine ends with: -E ../../third_party/ffmpeg/libavutil/camellia.c -frewrite-includes
The fix you posted fixes a crash when GUIDs collide. However, if you want ThinLTO to work, you need GUIDs not to collide, so we should debug that, too. It seems reasonable to defend against crashing in the presence of GUID collision, but we may want to do that up front (reject inputs with multiple internal symbols with the same GUID).
In the crbug I see this:
The llvm::GlobalValue object that fails to cast to llvm::GlobalVariable is a llvm::Function named "F" (GV.getGlobalIdentifier() == "-:F") from third_party/ffmpeg/libavutil/camellia.c (GV.getParent().getModuleIdentifier() == "obj/third_party/ffmpeg/libffmpeg_internal.a(ffmpeg_internal/camellia.o at 48114)")
The "-:F" part is interesting. The global identifier is computed like this:
std::string GlobalValue::getGlobalIdentifier() const { return getGlobalIdentifier(getName(), getLinkage(), getParent()->getSourceFileName()); }
So, "-" is coming from Module::getSourceFileName(). That means that multiple modules with the internal symbol 'F' use "-" as the source file name. I suspect that this may because you are using some kind of compilation action wrapper (distcc, icecreamcc, etc) that pipes the source input to stdin. ThinLTO is assuming that the source file name will be unique, but such a compilation wrapper breaks that assumption. So, to fix this properly, we need to figure out how to move some kind of global module id through clang into the bitcode.
If the wrapper uses -E -frewrite-includes, I think clang actually already has all the information it needs. I'm pretty sure the line info directives tell the compiler the path to the main source file, and we could simply set the module source file with that info instead of using "-". If the wrapper uses just -E, then we're out of luck, and we'd need to invent some new pathway to move the source file path (or some other global id) from the build system through to the bitcode.
In the general case, it's possible to have collisions like this if the build system doesn't run the compiler in the same CWD. A build system could compile "foo.cpp" twice in two different directories and get the same GUID, so inventing a new way for the build system to explicitly distinguish modules would future proof ThinLTO for that use case.
I'd also mention that this isn't the only thing that will go awry with non-unique source file paths. In PDB files, types are identified by unique names. If you have two types with the same name in an anonymous namespace, they are distinguished by mangling in a hash of the source input path into the anon namespace component. If you use a wrapper like this and try to debug a linked program, only one type will be included in the PDB, resulting in a degraded debugging experience.
Potential fix: https://reviews.llvm.org/D67322
Extended Description
Happened upon a crash in lld when linking Chromium (see https://crbug.com/1000691).
Two global symbols (one function and one variable) from different modules end up with the same GUID.
When linking using thinlto this leads to a crash because code in FunctionImportUtils.cpp treats any GlobalValue with a GlobalVarSummary attached as a GlobalVariable but as both the symbols have the same GUID the global function also has the GlobalVarSummary attached.
Assert: ld.lld: chromium/src/third_party/llvm/llvm/include/llvm/Support/Casting.h:264: typename cast_retty<X, Y >::ret_type llvm::cast(Y ) [X = llvm::GlobalVariable, Y = llvm::GlobalValue]: Assertion `isa(Val) && "cast() argument of incompatible type!"' failed.
Top of stacktrace (full stacktrace attached):
4 0x000056267cb57788 in llvm::cast<llvm::GlobalVariable, llvm::GlobalValue> (Val=)
5 llvm::FunctionImportGlobalProcessing::processGlobalForThinLTO (this=0x7f7ec9d06720, GV=...)
6 0x000056267cb5792c in llvm::FunctionImportGlobalProcessing::processGlobalsForThinLTO (this=0x7f7ec9d06720)
7 0x000056267cb57bfd in llvm::FunctionImportGlobalProcessing::run (this=0x7f7ec9d06720)
8 llvm::renameModuleForThinLTO (M=..., Index=..., GlobalsToImport=)