llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.64k stars 11.84k forks source link

Likely regression r289925: Various chromium tools have trouble symbolizing #30755

Open nico opened 7 years ago

nico commented 7 years ago
Bugzilla Link 31407
Version trunk
OS All
CC @adrian-prantl,@cmtice,@dwblaikie,@echristo,@lalozano,@markmentovai,@nico,@pogo59

Extended Description

Some of our tools have trouble symbolizing now (https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/357983); likely r289925. I'll try reverting, and I'll try to get you a repro.

Normally I'd get a repro first, but the llvm trunk state has been extremely unstable this week and I'd like to make sure it's stabilized now. It's possible we can just reland your patch without changes.

llvmbot commented 7 years ago

Great! Thank you. I think we are on the same page now. :)))

I have posted an updated phabricator code review a few days ago and I will commit this patch when it gets approved. https://reviews.llvm.org/D38002

cmtice commented 7 years ago

Yes, chromium & chromiumOS have fixed our symbolizer to handler dwarf v4 line tables. We don't need any further fixes or workarounds. I wasn't actually requesting the flag, I was only asking if one existed, and if not, why not. I've got my answers now.

llvmbot commented 7 years ago

I got an impression from reading comments 30 and 32 that Chromium already has a workaround for symbolizer. Using -mllvm option will be just another workaround, but it will take effort to write/test/review/commit and since we will never find out when this workaround is not needed anymore, this code will live forever in LLVM.

It's not too much effort to implement the backend option, but the question is whether it is necessary (and wise) to do so.

echristo commented 7 years ago

So Chromium would need to use an -mllvm option until they can fix their tools? Seems like the most reasonable way forward.

Either that or just go back to -gdwarf-2 until they've got something that works.

pogo59 commented 7 years ago

So Chromium would need to use an -mllvm option until they can fix their tools? Seems like the most reasonable way forward.

echristo commented 7 years ago

I was going to add this option as a temporarily workaround for Chromium only (just to keep things going) and I wanted to remove it later. If the bugs in symbolizer are fixed, I don't see a reason to have it.

Cons:

  • It will allow the user to generate the code not according to the DWARF spec. We don't want to encourage this behavior by implementing this option.
  • Code pollution. Both BE and FE patches are going to be quite large (even if it's implemented as -CC1 option only). If we want to implement it as a driver option, it will be even larger and more invasive patch. Add the cost associated with it (e.g. code reviews, etc).
  • Not all platforms want this option (see comment from Adrian). That means more work is needed to enable/disable it for different platforms by different code owners. Consequentially, more commits, more code reviews, more code bloat.

Pros: ? (I'm not aware of any).

I'm not aware of any either and don't want the option in clang.

cmtice commented 7 years ago

Ok, thanks for answering my questions!

llvmbot commented 7 years ago

I was going to add this option as a temporarily workaround for Chromium only (just to keep things going) and I wanted to remove it later. If the bugs in symbolizer are fixed, I don't see a reason to have it.

Cons:

Pros: ? (I'm not aware of any).

cmtice commented 7 years ago

Ah yes, that is exactly what I was wondering about (somehow I overlooked that part of those comments). What was the final status of that flag?

llvmbot commented 7 years ago

Hi Caroline, Let me sure I understand your question. Are you talking about providing a separate option to be able to control DWARF version and line table version independently (i.e. basically what I proposed in this bugzilla in comments 9 and 10)?

cmtice commented 7 years ago

Just a question (this came up in a side discussion with some of our team the other day): Is there an LLVM flag to control what dwarf version level is used for this? If not, is there any reason why one couldn't/shouldn't be created?

llvmbot commented 7 years ago

Thank you, everyone, for your analysis and testing!

markmentovai commented 7 years ago

On Chrome’s side, there may be a couple of merges and rebuilds of tools until DWARF 4 line tables are supported everywhere, and we may still find other weirdo things that need to be updated. But in general, I don’t think that anyone should treat us as blocking DWARF 4 line tables in LLVM. I don’t think that it was even Nico’s intent for Chrome to block this when he filed the bug, but I guess that’s how it wound up playing out…

cmtice commented 7 years ago

I was able to use my previous setup to test the workaround; I think I'm done with testing.

llvmbot commented 7 years ago

Great! Thank you! Do you need more time to rerun the testing after the workaround was done? It will take some time to post the review upstream and get is accepted anyway. The patch had to change slightly to compensate for the changes that occurred upstream since it was accepted last time.

cmtice commented 7 years ago

We have found the problem in our symbolizer and have found a workaround for our problem. I believe you should be ok with committing your patch. Thanks!

cmtice commented 7 years ago

I agree that the problem is probably with our symbolizer; I am currently looking into this further to see if that is in fact the case and what our options are.

llvmbot commented 7 years ago

Hi Caroline,

At first, I wanted to say thank you for doing all the testing.

Your results are exactly what I expected (I just hoped that it would be fixed by now). As Paul said in comment 27, it's very likely that symbolizer simply doesn't support line table V4 (which, after the fix, are generating now). Line table V4 format is different from V2 and V3 and special handing in symbolizer will be necessary. Could you please talk to the symbolizer code owner and confirm our assumption.

Maybe all you need to do is to upgrade to the latest version of the symbolizer, or if line table V4 are still not supported, file a bug against symbolizer. Please let me know the results of your conversation.

Thank you!

pogo59 commented 7 years ago

Caroline, do you believe the symbolizer supports v4 line tables? The header format is slightly different so the code that reads the line table does have to understand the new version. If the problem is that your symbolizer is out of date, then I think the correct solution is to update your symbolizer.

cmtice commented 7 years ago

Ok I tested your patch (leaving it with ">= 5"), and I have bad news for you. When I build Chrome with LLVM-without-the-patch, crash Chrome, and symbolize the stackdump, I get this:

Thread 0 (crashed) 0 chrome!content::CrashIntentionally() [render_frame_impl.cc : 743 + 0x0] rax = 0x447f28a70d273700 rdx = 0x0000000000000159 rcx = 0x0000000000000027 rbx = 0x00007ffe864bfc18 rsi = 0x0000000000000000 rdi = 0x00007fb8e2035d90 rbp = 0x00007ffe864bfc00 rsp = 0x00007ffe864bfc00 r8 = 0x0000000000000000 r9 = 0x00007fb8d8008780 r10 = 0x76616e2072657375 r11 = 0x0000000000000000 r12 = 0x00007ffe864c0248 r13 = 0x000015aa0a6cd000 r14 = 0x00007ffe864bfc10 r15 = 0x00007ffe864c0248 rip = 0x00007fb8df361010 Found by: given as instruction pointer in context 1 chrome!content::MaybeHandleDebugURL(GURL const&) [render_frame_impl.cc : 815

etc.

When I build LLVM with your patch, build Chrome with that version of LLVM, crash Chrome and symbolize the stack dump, I get the following:

Thread 0 (crashed) 0 chrome + 0x57f3010 rax = 0xdbbbe17388cc8f00 rdx = 0x00000000000000bf rcx = 0x0000000000000041 rbx = 0x00007ffe19e69b78 rsi = 0x0000000000000000 rdi = 0x00007f9e29966d90 rbp = 0x00007ffe19e69b60 rsp = 0x00007ffe19e69b60 r8 = 0x0000000000000000 r9 = 0x00007f9e1f939780 r10 = 0x76616e2072657375 r11 = 0x0000000000000000 r12 = 0x00007ffe19e6a1a8 r13 = 0x00002c2a91ad1000 r14 = 0x00007ffe19e69b70 r15 = 0x00007ffe19e6a1a8 rip = 0x00007f9e26c92010 Found by: given as instruction pointer in context 1 chrome + 0x57f31cd rbp = 0x00007ffe19e69d10 rsp = 0x00007ffe19e69b70 rip = 0x00007f9e26c921cd Found by: previous frame's frame pointer 2 chrome + 0x581cd6d rbp = 0x00007ffe19e69d90 rsp = 0x00007ffe19e69d20 rip = 0x00007f9e26cbbd6d Found by: previous frame's frame pointer 3 chrome + 0x58078a8 rbp = 0x00007ffe19e69f60 rsp = 0x00007ffe19e69da0 rip = 0x00007f9e26ca68a8 Found by: previous frame's frame pointer 4 chrome + 0x57fb461 rbp = 0x00007ffe19e6a010 rsp = 0x00007ffe19e69f70 rip = 0x00007f9e26c9a461 Found by: previous frame's frame pointer 5 chrome + 0x57fb2c3 rbp = 0x00007ffe19e6a3e0 rsp = 0x00007ffe19e6a020 rip = 0x00007f9e26c9a2c3 Found by: previous frame's frame pointer 6 chrome + 0x57f8d40 rbp = 0x00007ffe19e6a520 rsp = 0x00007ffe19e6a3f0 rip = 0x00007f9e26c97d40 Found by: previous frame's frame pointer etc.

Note the absence of function names, file names and line numbers in the latter case. So I'm afraid there is something else going on with the Dwarf V4 Line tables that your patch is not fixing for us. :-(

llvmbot commented 7 years ago

Hi Caroline,

The intention was to have LineTable version to match DwarfInfo version. I.e. for DwarfInfo V3, we should have LineTable V3, for DwarfInfo V4, we should have LineTable V4, etc.

Recently, support for DwarfInfo version 5 was (partially) added to LLVM, however support for LineTableVersion 5 has not been added yet.

While eventually we want to set LineTable V5 when DwarfVersion V5 is requested, for now (just to make new DebugInfo V5 Clang/LLVM tests happy), I had to temporarily downgrade LineTable Version to 2.

However, for DwarfInfo V4 we still want to generate LineTable V4. If my intuition is right, I think that's exactly where Chromium tools have problem with (i.e. DwarfInfo V4 + LineTable V4 are generated, but some tools don't support LineTable V4, since its format is different from LineTable V3 and V2).

I'm not sure if I explained things clear enough. But basically, the bottom line is that we don't want to change the code as you suggested, because it will defeat the purpose of the fix.

cmtice commented 7 years ago

So far I've found one small problem in the patch (easily fixed):

In the change to lib/MC/MCDwarf.cpp around line 277, you have

I think it needs to be

( check less-than-or-equal to 4, not 5).

I'm testing with that change and will let you know if I find any other issues.

cmtice commented 7 years ago

I'm in the middle of evaluating your patch; I will let you know the results as soon as I have them (should be in a day or so).

llvmbot commented 7 years ago

Hi Caroline, Sorry to bother you. Have you had a chance to evaulate if this patch creates problem for chromiom tools? We'd like to commit this patch upstream. Thank you! Katya.

llvmbot commented 7 years ago

Hi Caroline, Sorry for the late reply. I was away for the whole month, just came back. I attached the patch. Please test it with your code when you have a chance. Currently, if Dwarf V5 is requested, we had to fall back to line table V2 (since line table V5 are not supported yet by some LLVM tools, e.g. llvm-dwarfdump) When line table V5 is supported, this "FIXME" code will be removed.

// FIXME: Right now the compiler doesn't support line table V5. Until it's // supported keep generating line table V2, when Dwarf Info version V5 is used. if (LineTableVersion >= 5) LineTableVersion = 2;

In all other cases, the Dwarf version should be the same as the line table version.

If not fixed yet, most likely you will encounter a problem where Dwarf V4 is requested and the compiler will start generate matching line table V4, but its format is different from versions 2 & 3 and needs to be handled differently by the consumer tools.

llvmbot commented 7 years ago

patch to be tested by Chromium team

llvmbot commented 7 years ago

It seems that all of these new failures are attributed to the crash in llvm-dwarfdump that happens when we produce the DWARF version 5 object file and use llvm-dwarfdump to dump its line table information (version 5). The line table information V5 has not been supported by the compiler yet, hence the assert.

I spoke with Paul Robinson who is currently working on implementing Dwarf V5 in the compiler. He suggested to make a change and not to generate line table version greater than V4 for a time being. This temporary fix will get removed when line table V5 is fully supported by the compiler.

llvmbot commented 7 years ago

The following 7 tests are failing: Failing Tests (7): LLVM :: CodeGen/X86/dwarf-headers.ll LLVM :: DebugInfo/Generic/nodebug.ll LLVM :: DebugInfo/X86/atomic-c11-dwarf-5.ll LLVM :: DebugInfo/X86/default-subrange-array.ll LLVM :: DebugInfo/X86/inline-namespace.ll LLVM :: MC/ARM/dwarf-asm-multiple-sections.s LLVM :: MC/ELF/cfi-version.ll

Command Output (stderr):

llvm-dwarfdump: /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp:234: bool llvm::DWARFDebugLine::Prologue::parse(const llvm::DWARFDataExtractor&, uint32_t*): Assertion `getAddressSize() == DebugLineData.getAddressSize() && "Line table header and data extractor disagree"' failed.

​0 0x00000000005c29e5 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/kromanova/ngh/ToT_commit/llvm/lib/Support/Unix/Signals.inc:398:0

​1 0x00000000005c2a78 PrintStackTraceSignalHandler(void*) /home/kromanova/ngh/ToT_commit/llvm/lib/Support/Unix/Signals.inc:462:0

​2 0x00000000005c0cbf llvm::sys::RunSignalHandlers() /home/kromanova/ngh/ToT_commit/llvm/lib/Support/Signals.cpp:49:0

​3 0x00000000005c225a SignalHandler(int) /home/kromanova/ngh/ToT_commit/llvm/lib/Support/Unix/Signals.inc:252:0

​4 0x00007fd2b4fe3390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)

​5 0x00007fd2b3f59428 gsignal /build/glibc-bfm8X4/glibc-2.23/signal/../sysdeps/unix/sysv/linux/raise.c:54:0

​6 0x00007fd2b3f5b02a abort /build/glibc-bfm8X4/glibc-2.23/stdlib/abort.c:91:0

​7 0x00007fd2b3f51bd7 __assert_fail_base /build/glibc-bfm8X4/glibc-2.23/assert/assert.c:92:0

​8 0x00007fd2b3f51c82 (/lib/x86_64-linux-gnu/libc.so.6+0x2dc82)

​9 0x0000000000451e69 llvm::DWARFDebugLine::Prologue::parse(llvm::DWARFDataExtractor const&, unsigned int*) /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp:235:0

​10 0x000000000045299a llvm::DWARFDebugLine::LineTable::parse(llvm::DWARFDataExtractor const&, unsigned int*) /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp:402:0

​11 0x00000000004528fd llvm::DWARFDebugLine::getOrParseLineTable(llvm::DWARFDataExtractor const&, unsigned int) /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp:390:0

​12 0x0000000000418d49 llvm::DWARFContext::getLineTableForUnit(llvm::DWARFUnit*) /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp:586:0

​13 0x0000000000465274 dumpAttribute(llvm::raw_ostream&, llvm::DWARFDie const&, unsigned int*, llvm::dwarf::Attribute, llvm::dwarf::Form, unsigned int, llvm::DIDumpOptions) /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFDie.cpp:104:0

​14 0x0000000000466c47 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, unsigned int, llvm::DIDumpOptions) const /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFDie.cpp:343:0

​15 0x0000000000466ce2 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, unsigned int, llvm::DIDumpOptions) const /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFDie.cpp:350:0

​16 0x000000000048482c llvm::DWARFCompileUnit::dump(llvm::raw_ostream&, llvm::DIDumpOptions) /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFCompileUnit.cpp:33:0

​17 0x0000000000416395 llvm::DWARFContext::dump(llvm::raw_ostream&, llvm::DIDumpOptions) /home/kromanova/ngh/ToT_commit/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp:212:0

​18 0x00000000004073fe DumpObjectFile(llvm::object::ObjectFile&, llvm::Twine) /home/kromanova/ngh/ToT_commit/llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp:97:0

​19 0x0000000000407624 DumpInput(llvm::StringRef) /home/kromanova/ngh/ToT_commit/llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp:123:0

​20 0x000000000040e871 void (std::for_each<gnu_cxx::__normal_iterator<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::cxx11::basic_string<char, std::char_traits, std::allocator > > > >, void ()(llvm::StringRef)>(gnu_cxx::__normal_iterator<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::cxx11::basic_string<char, std::char_traits, std::allocator > > > >, gnu_cxx::__normal_iterator<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::cxx11::basic_string<char, std::char_traits, std::allocator > > > >, void ()(llvm::StringRef)))(llvm::StringRef) /usr/include/c++/5/bits/stl_algo.h:3766:0

​21 0x000000000040879e main /home/kromanova/ngh/ToT_commit/llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp:233:0

​22 0x00007fd2b3f44830 __libc_start_main /build/glibc-bfm8X4/glibc-2.23/csu/../csu/libc-start.c:325:0

​23 0x0000000000407029 _start (/home/kromanova/ngh/ToT_commit/build/./bin/llvm-dwarfdump+0x407029)

Stack dump:

  1. Program arguments: /home/kromanova/ngh/ToT_commit/build/./bin/llvm-dwarfdump - FileCheck error: '-' is empty. FileCheck command line: /home/kromanova/ngh/ToT_commit/build/./bin/FileCheck /home/kromanova/ngh/ToT_commit/llvm/test/MC/ELF/cfi-version.ll --check-prefix=DWARF4
llvmbot commented 7 years ago

I have just reapplied my old patch that was reverted r285355: Update .debug_line section version information to match DWARF version and ran LLVM and Clang regression suite.

The patch applied cleanly (with some fuzz) to the latest compiler sources. However, some new tests were added since January, when my commit was reverted and 7 of these test are failing now. All of these failing test invoke llvm-dwarfdump to dump DI.

The failure message is almost certain points to the llvm-dwarfdump inability to handle line table V4. I suspect, that was very similar to what Nico had experienced in the past with the Chromium build failure.

While I'm looking at the failures with llvm-dwarfdump, you could go ahead and cherry-pick r285355 and see if it breaks your builds (assuming that you are not using llvm-dwarfdump, since I know it fail for sure). If you prefer me to send you the patch directly, please let me know your email address.

One more complication: I'm away for 3 weeks vacation starting tomorrow. I'm not sure if you could do the testing today/tonight, but it's unlikely that the llvm-dwarfdump failure won't be fixed/reviewed and committed until the end of the day. However, I think I will have time to analyze the failure and file a bug against llvm-dwarfdump.

cmtice commented 7 years ago

Hi Katya,

Please do give me a patch so we can test it with out stuff. I do not know offhand if our DI consumers can handle that new field, but I will also check. I will also try to coordinate some testing with Nico. Thanks!

llvmbot commented 7 years ago

Caroline, Nice to hear from you :) Do you prefer me to reland the patch and see what happens or to give you the patch so you could do some preliminary testing building Chromium?

I suspect that Chromium build failed after my fix because if you are using DWARF4, then DWARF4 line_tables will start getting generated. DWARF4 line table has one additional (new) field, called 'maximum_operations_per_instruction' that needs to be supported by the debug info consumers (debuggers, dwarf dumpers and such).

Do you know if your DI consumers could handle this new field?

cmtice commented 7 years ago

It's hard to be sure if this is the cause/solution to Nico's problem, as his problem description is very vague (nonexistent!) and he has not given any way to reproduce it. However...we did run into a problem in ChromeOS with breakpad, symbolization, and LLVM. The symptom we were seeing was that we would get plain function names for functions, but they did not include the full class name (and C++ mangling info). In our case the problem was related to the fact that we use split debug AND to the fact the LLVM by default uses something called 'gmit' which causes it to generate function entries in both the .debug file and the .dwp file. The symbolizer thing was reading BOTH the .debug file and the .dwp file, but it was not recognizing that the entries were for the same function, so we were ending up with two entries per function, the first one (from the .debug file) without the class name and the second one with the fully qualified name. The crash reporter was finding the first entry for the function and using it, thus it appeared that we did not have full/correct debug info for the functions.

In this case, the solution for us (in ChromeOS) was to turn off gmit, by using the "-fno-split-dwarf-inlining" flag, so that there's only one (correct and fully qualified) entry per function.

But again, I don't know if this is Nico's issue, as he has not given enough details.

Another symbolization issue we ran into, which also might be this problem, was that Breakpad cannot handle Dwarf V4 CIE records. We "fixed" this by always adding a local patch to our LLVM that forces LLVM to emit Dwarf V1 CIE records (which is what GCC was doing): --- a/lib/MC/MCDwarf.cpp +++ b/lib/MC/MCDwarf.cpp @@ -1252,7 +1252,8 @@ static unsigned getCIEVersion(bool IsEH, unsigned DwarfVersio n) { return 3; case 4: case 5:

I know this isn't an acceptable upstream fix, but it might be something Nico could use...

llvmbot commented 7 years ago

Great! Thank you so much, Eric! I definitely don't want to submit the work that is needed only temporarily or not needed at all. Of course, I wanted to land the patch that sets up 'line table version' == 'dwarf info version' but I started to work on the larger patch that allows to independently set up line table version just to give a way for Google to work around issues with the DI consumers that don't understand line table version 4 and later and, of course, to move things forward with my original patch.

I don't know what is the proper protocol for relanding the patch that was reverted a while ago. Should I reopen the "old" phabricator review or open a brand new one? What is more convenient for you? Will it be possible to build Chromium with the latest compiler + line table version patch before approving the review (to avoid error reports a day or two later)? I suspect it's difficult to find out now what exactly had caused the failure with Chromium builds earlier - line table version 4 or something else - but rebuilding Chromium with the latest compiler + patch should be doable.

Thanks again!

echristo commented 7 years ago

That's not going to work as "per platform" would be "all dwarf platforms" as far as chromium is concerned.

I've added Luis here, but I think one of two things:

a) That this problem has been fixed in the chromium tools, b) That this is a bug in the chromium tools that will need to be worked around or fixed in the near term while the tools are fixed.

Ultimately I think we can probably just reland this patch shortly (please add me as the reviewer and I'll ack it), and Luis and I can talk about what he needs for Chromium elsewhere.

adrian-prantl commented 7 years ago

I also wanted to find out if Debug Info experts think that Clang option for setting up the value of the line table version will be valuable in general

Probably sufficient to add this as a cc1 option and then have the driver enable it as applicable per platform.

llvmbot commented 7 years ago

I also wanted to find out if Debug Info experts think that Clang option for setting up the value of the line table version will be valuable in general, not only as a temporary workaround for some Google's tools that allegedly do not support line table version 4. I said "allegedly", since the exact problem that caused the build failure for Chromium was never reported and I could only assume that it is caused by line table version 4.

The patch that I am working on is pretty large and it in addition to my time implementing/testing, it will also take some time from the reviewers. If this patch is only needed as a temporary solution until the problem with Google's tools is fixed, this patch will be polluting compiler's source, probably forever. Nothing is more permanent than the temporary..

llvmbot commented 7 years ago

Hi Nico,

It's been more than 6 months since the fix got reverted and I still haven't gotten a reproducible :( I need to put the fix back. Paul R. needs it for this DWARF-5 related work. If you came up a reproducible that caused the failure - let me know, I will work on the fix.

Otherwise, I assume that your debug info consumers simply couldn't handle line table version 4 (which is not compatible with line table table version 2 & 3). When you are setting debug info version 4 (which after my commit forced matching line table version 4 to get generated) it causes problems for your debug info consumers. Does this assumption looks right to you?

I am working on a more extensive fix that could solve the problem above. It will allow the user to set up any valid values for the line table version independently from dwarf version. When this fix is in, you could set up the debug info version to 4 and line table version to 2. Do you think this will solve your problem?

I am planning to add a new frontend option -dwarf-line-table-version= which will allow to set up line table version independently from the debug info version.

I'd like to get a preliminary opinion from debug info experts to make sure that what I'm planning to implement is going to be satisfactory.

  1. Add new option to -CC1, called -dwarf-line-table-version=. If this option is not specified, by default line table version value will be same as debug info version value. Note: I was not planning to add a driver option for this, just CC1 option.

  2. Add new metadata to the IR called "Line Table Version" Something like: !​4 = !{i32 2, !"Line Table Version", i32 3}

  3. In the backend, modify the reverted patch r289925 (this patch used set up the value of the line table version to be equal to the value of the dwarf version; currently we always have line table version 2) to read the value of the line table version from metadata. If not present, in the backend set up line table version value will be equal to debug info version value.

pogo59 commented 7 years ago

@​Nico, ping. Not having this patch in is now starting to block my DWARF v5 work. We have been waiting 4 months to get this resolved. Thanks, --paulr

llvmbot commented 7 years ago

Yes, I'll work on a repro. Sorry about the delay, I've been traveling.

Hi Nico, Have you had a chance to create a repro? I suspect that the problem is not with the patch itself, but with the DWARF consumers not being able to handle debug_line section version 4, since its format is slightly different from previous versions 2 and 3. The main difference is one additional field maximum_operations_per_instruction that was added.

If it's taking too long to investigate and/or create a reproducible, maybe I should add an option, that will cause debug_line table version 2 generated, independently from which dwarf version was used?

I think that by default, we should still generate the version of debug_line section that matches dwarf version that is being generated.

Please let me know. Our debugger folks really want to have this patch committed.

nico commented 7 years ago

Yes, I'll work on a repro. Sorry about the delay, I've been traveling.

llvmbot commented 7 years ago

Hi Nico, Do you have any update on the reproducible? If you want, I could implement an option to turn off the new behavior, as you suggested in your comment 3.

llvmbot commented 7 years ago

Hi Nico, Sorry to bother you. Could you please send me a repro, when you have a chance? Thank you! Katya.

nico commented 7 years ago

I can confirm that our "can we symbolize stacks?" test started passing again after the revert. I did some reading and it looks like that uses breakpad. I don't have a standalone repro yet, but should be able to get one on Monday. I'm sorry about the delay, I hope that's ok.

In case other tools should have similar problems, maybe it might make sense to put this behavior behind an (on by default) flag, so that people have an incremental transition path (get new clang, temporarily set the flag to turn off this new behavior, then make all their tools work, and then stop passing the flag).

llvmbot commented 7 years ago

Thank you, Nico! Sorry for the delay with the reply, I had to set up an account for the BZ before I could answer.

Thank you for the reverting the change. Let me know if/when you have a reproducible. If you end up resolving the issue on your end, let me know when I can re-commit.

Katya.

nico commented 7 years ago

svn merge -c -r289925Nicos-MBP:llvm-rw thakis$ svn merge -c -289925 . --- Reverse-merging r289925 into '.': U lib/MC/MCDwarf.cpp sU test/MC/ELF/debug-loc.s U test/MC/ELF/discriminator.s U test/MC/ELF/empty-dwarf-lines.s U test/MC/ELF/debug-line2.s U test/MC/ELF/debug-line.s U test/MC/MachO/gen-dwarf.s U test/MC/MachO/file.s U test/MC/MachO/loc.s U test/DebugInfo/Generic/empty.ll U test/DebugInfo/AArch64/line-header.ll U test/DebugInfo/X86/stmt-list-multiple-compile-units.ll U test/DebugInfo/X86/empty.ll D test/DebugInfo/X86/debug-line-version.s --- Recording mergeinfo for reverse merge of r289925 into '.': U . Nicos-MBP:llvm-rw thakis$ svn commit -m 'Speculatively revert r289925, see llvm/llvm-project#30755 ' Sending lib/MC/MCDwarf.cpp Sending test/DebugInfo/AArch64/line-header.ll Sending test/DebugInfo/Generic/empty.ll Deleting test/DebugInfo/X86/debug-line-version.s Sending test/DebugInfo/X86/empty.ll Sending test/DebugInfo/X86/stmt-list-multiple-compile-units.ll Sending test/MC/ELF/debug-line.s Sending test/MC/ELF/debug-line2.s Sending test/MC/ELF/debug-loc.s Sending test/MC/ELF/discriminator.s Sending test/MC/ELF/empty-dwarf-lines.s Sending test/MC/MachO/file.s Sending test/MC/MachO/gen-dwarf.s Sending test/MC/MachO/loc.s Transmitting file data ............. Committed revision 289944.