llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
25.94k stars 10.58k forks source link

Fix merging of debug_str_offsets with multiple contributions #90461

Open molar opened 2 weeks ago

molar commented 2 weeks ago

This pull request will change the merging of debug_str_offset to merge per contribution and correctly copy over each contribution header to the merged section. I have added some test data which is in dwarf5 format as this is where the section contribution header was introduced, as far as i can tell.

github-actions[bot] commented 2 weeks ago

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot commented 2 weeks ago

@llvm/pr-subscribers-debuginfo

Author: Morten larsen (molar)

Changes This pull request will change the merging of ``debug_str_offset`` to merge per contribution and correctly copy over each contribution header to the merged section. I have added some test data which is in dwarf5 format as this is where the section contribution header was introduced, as far as i can tell. --- Full diff: https://github.com/llvm/llvm-project/pull/90461.diff 5 Files Affected: - (modified) llvm/lib/DWP/DWP.cpp (+33-9) - (added) llvm/test/tools/llvm-dwp/Inputs/merge_v5/notypes/ab.dwp () - (added) llvm/test/tools/llvm-dwp/Inputs/merge_v5/notypes/c.dwo () - (modified) llvm/test/tools/llvm-dwp/X86/merge.test (+5) - (added) llvm/test/tools/llvm-dwp/X86/merge_v5.test (+48) ``````````diff diff --git a/llvm/lib/DWP/DWP.cpp b/llvm/lib/DWP/DWP.cpp index 77bd22d1f071c6..fecd184ca68a86 100644 --- a/llvm/lib/DWP/DWP.cpp +++ b/llvm/lib/DWP/DWP.cpp @@ -415,6 +415,17 @@ Expected parseInfoSectionUnitHeader(StringRef Info) { return Header; } +static void writeNewOffsetsTo(MCStreamer &Out, DataExtractor &Data, + DenseMap &OffsetRemapping, + uint64_t &Offset, uint64_t &Size) { + + while (Offset < Size) { + auto OldOffset = Data.getU32(&Offset); + auto NewOffset = OffsetRemapping[OldOffset]; + Out.emitIntValue(NewOffset, 4); + } +} + void writeStringsAndOffsets(MCStreamer &Out, DWPStringPool &Strings, MCSection *StrOffsetSection, StringRef CurStrSection, @@ -439,17 +450,30 @@ void writeStringsAndOffsets(MCStreamer &Out, DWPStringPool &Strings, Out.switchSection(StrOffsetSection); - uint64_t HeaderSize = debugStrOffsetsHeaderSize(Data, Version); uint64_t Offset = 0; uint64_t Size = CurStrOffsetSection.size(); - // FIXME: This can be caused by bad input and should be handled as such. - assert(HeaderSize <= Size && "StrOffsetSection size is less than its header"); - // Copy the header to the output. - Out.emitBytes(Data.getBytes(&Offset, HeaderSize)); - while (Offset < Size) { - auto OldOffset = Data.getU32(&Offset); - auto NewOffset = OffsetRemapping[OldOffset]; - Out.emitIntValue(NewOffset, 4); + if (Version > 4) { + while (Offset < Size) { + uint64_t HeaderSize = debugStrOffsetsHeaderSize(Data, Version); + assert(HeaderSize <= Size - Offset && + "StrOffsetSection size is less than its header"); + + uint64_t ContributionEnd = 0; + uint64_t ContributionSize = 0; + uint64_t HeaderLengthOffset = Offset; + if (HeaderSize == 8) { + ContributionSize = Data.getU32(&HeaderLengthOffset); + } else if (HeaderSize == 16) { + HeaderLengthOffset += 4; // skip the dwarf64 marker + ContributionSize = Data.getU64(&HeaderLengthOffset); + } + ContributionEnd = ContributionSize + HeaderLengthOffset; + Out.emitBytes(Data.getBytes(&Offset, HeaderSize)); + writeNewOffsetsTo(Out, Data, OffsetRemapping, Offset, ContributionEnd); + } + + } else { + writeNewOffsetsTo(Out, Data, OffsetRemapping, Offset, Size); } } diff --git a/llvm/test/tools/llvm-dwp/Inputs/merge_v5/notypes/ab.dwp b/llvm/test/tools/llvm-dwp/Inputs/merge_v5/notypes/ab.dwp new file mode 100644 index 00000000000000..a2388f0e37dff0 Binary files /dev/null and b/llvm/test/tools/llvm-dwp/Inputs/merge_v5/notypes/ab.dwp differ diff --git a/llvm/test/tools/llvm-dwp/Inputs/merge_v5/notypes/c.dwo b/llvm/test/tools/llvm-dwp/Inputs/merge_v5/notypes/c.dwo new file mode 100644 index 00000000000000..e58a4b0c34d0bb Binary files /dev/null and b/llvm/test/tools/llvm-dwp/Inputs/merge_v5/notypes/c.dwo differ diff --git a/llvm/test/tools/llvm-dwp/X86/merge.test b/llvm/test/tools/llvm-dwp/X86/merge.test index 0cf56bd858a04b..25a1315ee63f33 100644 --- a/llvm/test/tools/llvm-dwp/X86/merge.test +++ b/llvm/test/tools/llvm-dwp/X86/merge.test @@ -44,3 +44,8 @@ CHECK: Index Signature INFO ABBREV CHECK-DAG: [[DWOC]] [0x00000000[[#COFF]], 0x00000000[[#AOFF]]) [0x0000[[CAOFF]], 0x0000[[AAOFF]]) [0x00000000, 0x00000011) [0x00000000, 0x00000018) CHECK-DAG: [[DWOA]] [0x00000000[[#AOFF]], 0x00000000[[#BOFF]]) [0x0000[[AAOFF]], 0x0000[[BAOFF]]) [0x00000011, 0x00000022) [0x00000018, 0x00000028) CHECK-DAG: [[DWOB]] [0x00000000[[#BOFF]], 0x00000000[[#XOFF]]) [0x0000[[BAOFF]], 0x000000c3) [0x00000022, 0x00000033) [0x00000028, 0x0000003c) + +CHECK-LABEL: .debug_str_offsets.dwo contents: +CHECK: Contribution size = 24, Format = DWARF32, Version = 4 +CHECK: Contribution size = 16, Format = DWARF32, Version = 4 +CHECK: Contribution size = 20, Format = DWARF32, Version = 4 diff --git a/llvm/test/tools/llvm-dwp/X86/merge_v5.test b/llvm/test/tools/llvm-dwp/X86/merge_v5.test new file mode 100644 index 00000000000000..ea9ef030e5d05b --- /dev/null +++ b/llvm/test/tools/llvm-dwp/X86/merge_v5.test @@ -0,0 +1,48 @@ +RUN: llvm-dwp %p/../Inputs/merge_v5/notypes/c.dwo %p/../Inputs/merge_v5/notypes/ab.dwp -o - | \ +RUN: llvm-dwarfdump -v - | FileCheck --check-prefix=CHECK %s + +DWP from a DWO (c.dwo) and a DWP (ab.dwp, created from a.dwo and b.dwo) +Make sure the entries for A and B are updated correctly when read/processed from ab.dwp +a.cpp: + struct foo { }; + foo a; + +b.cpp: + struct bar { }; + void b(bar) { + } + +c.cpp: + typedef int baz; + baz c() { + } + +CHECK-LABEL: .debug_abbrev.dwo contents: +CHECK-LABEL: Abbrev table for offset: +CHECK: 0x0000[[CAOFF:.*]] +CHECK-LABEL: Abbrev table for offset: +CHECK: 0x0000[[AAOFF:.*]] +CHECK-LABEL: Abbrev table for offset: +CHECK: 0x0000[[BAOFF:.*]] + +CHECK: .debug_info.dwo contents: +CHECK: 0x[[#%.8x,COFF:]]: +CHECK-LABEL: Compile Unit: length = {{.*}}, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = +CHECK: 0x[[CAOFF]], addr_size = 0x08, DWO_id = 0x[[DWOC:.*]] (next unit at 0x[[#%.8x,BOFF:]]) +CHECK: [[#BOFF]]: +CHECK-LABEL: Compile Unit: length = {{.*}}, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = +CHECK: 0x[[BAOFF]], addr_size = 0x08, DWO_id = 0x[[DWOB:.*]] (next unit at 0x[[#%.8x,AOFF:]]) +CHECK: [[#AOFF]]: +CHECK-LABEL: Compile Unit: length = {{.*}}, format = DWARF32, version = 0x0005, unit_type = DW_UT_split_compile, abbr_offset = +CHECK: 0x[[AAOFF]], addr_size = 0x08, DWO_id = 0x[[DWOA:.*]] (next unit at 0x[[#%.8x,XOFF:]]) + +CHECK-LABEL: .debug_cu_index contents: +CHECK: Index Signature INFO ABBREV STR_OFFSETS +CHECK-DAG: [[DWOC]] [0x00000000[[#COFF]], 0x00000000[[#BOFF]]) [0x0000[[CAOFF]], 0x0000[[AAOFF]]) [0x00000000, 0x00000024) +CHECK-DAG: [[DWOB]] [0x00000000[[#BOFF]], 0x00000000[[#AOFF]]) [0x0000[[BAOFF]], +CHECK-DAG: [[DWOA]] [0x00000000[[#AOFF]], 0x00000000[[#XOFF]]) [0x0000[[AAOFF]], + +CHECK-LABEL: .debug_str_offsets.dwo contents: +CHECK: Contribution size = 32, Format = DWARF32, Version = 5 +CHECK: Contribution size = 24, Format = DWARF32, Version = 5 +CHECK: Contribution size = 28, Format = DWARF32, Version = 5 ``````````
molar commented 2 days ago

Ping @dwblaikie kindly requesting a review

dwblaikie commented 2 days ago

Thanks!

Generally I think we're moving away from checked in binary object files - could you make the test from checked in assembly instead (perhaps using @MaskRay's new https://llvm.org/docs/TestingGuide.html#elaborated-tests perhaps)

So if I'm understanding correctly - you're saying llvm-dwp did not work when an input was a DWP and was DWARFv5? huh, surprising oversight... oh, I guess not that many people use dwp files as input to dwp actions, so I can see how some people would've been working along fine with this state of affairs.

felipepiovezan commented 1 day ago

Generally I think we're moving away from checked in binary object files - could you make the test from checked in assembly instead (perhaps using @MaskRay's new https://llvm.org/docs/TestingGuide.html#elaborated-tests perhaps)

I think the dwarf YAML support should work for debug_str_offsets too?