llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

a missing `type` field in `DISubprogram` does not give a warning but a Segmentation fault #59471

Open amtoine opened 1 year ago

amtoine commented 1 year ago

hello there :wave: :yum:

i've followed your contributing guidelines to the "Crashing bugs" section, however i cannot go further in the bug report flow: i get an error which is not mentioned in the section :open_mouth: when i add -emit-llvm as given in the section, i get

> clang-15 -emit-llvm foo.main.ll foo.ll -o foo.elf
clang-15: error: -emit-llvm cannot be used when linking

i've tried to be as concise as possible, given the sizes of the files i manipulate, with simple examples, but also as complete as possible... i hope this is neither too large nor too small :relieved: please feel free to ask anything :wink:

a tiny bit of context

i am writing a compiler for the Oberon language :+1: this is a work in progress, but it is already able to take some oberon input, translate it to llvm IR and finally use clang to optimize the IR and generate some machine code binary for the host target :ok_hand:

EDIT

i've changed the example for an even simpler one

my issue

i have the following oberon source file

; foo.obn
MODULE foo;
    VAR k: INTEGER;
BEGIN
    k := 0
END foo.

the compiler turns it into the two following IR files, thanks to a new -g option i'm currently working on! i.e. the command is here ./oberon -g compile foo.obn

; foo.main.ll
target triple = "x86_64-unknown-linux-gnu"

declare i32 @foo.__main()

define i32 @main() {
  call i32 @foo.__main()
  ret i32 0
}

and

; foo.ll
target triple = "x86_64-unknown-linux-gnu"

!0 = !DIFile(
  filename: "foo.obn",
  directory: "/home/amtoine/.local/share/ghq/github.com/oberonforall/compiler",
  checksumkind: CSK_MD5,
  checksum: "f4e60ed3f396da2f9a168da6602e890c"
)

!2 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
!3 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!4 = !DIBasicType(name: "long", size: 64, encoding: DW_ATE_float)
!5 = !DIBasicType(name: "double", size: 64, encoding: DW_ATE_signed)

!6 = distinct !DIGlobalVariable(name: "k", scope: !1, file: !0, line: 2, type: !5, isLocal: false, isDefinition: true)
!7 = distinct !DIGlobalVariableExpression(var: !6, expr: !DIExpression())
@foo.k = dso_local global i32 0, align 4, !dbg !7

!8 = !{!7}

!1 = distinct !DICompileUnit(
  language: DW_LANG_C99,
  file: !0,
  producer: "clang version 15.0.5",
  isOptimized: false,
  runtimeVersion: 0,
  emissionKind: FullDebug,
  globals: !8,
  splitDebugInlining: false,
  nameTableKind: None
)
;!9 = !DISubroutineType()

!10 = distinct !DISubprogram(
  name: "main",
  scope: !0,
  file: !0,
  line: 3,
  scopeLine: 3,
  flags: DIFlagPrototyped,
  spFlags: DISPFlagDefinition,
  unit: !1,
  retainedNodes: !{}
)

define dso_local i32 @foo.__main() !dbg !10 {
  store i32 0, i32* @foo.k, align 4, !dbg !DILocation(line: 5, column: 3, scope: !10)

  ret i32 0, !dbg !DILocation(line: 5, column: 9, scope: !10)
}

!llvm.dbg.cu = !{!1}

!11 = !{i32 2, !"Debug Info Version", i32 3}
!llvm.module.flags = !{!11}

NOTE: all of the above is generated dynamically by the compiler :warning:

finally, the error i get is the following

> clang-15 foo.main.ll foo.ll -o foo.elf | clip
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.  Program arguments: /home/amtoine/.local/share/clang-15/bin/clang-15 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -mrelax-all --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name foo.ll -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -tune-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -fcoverage-compilation-dir=/home/amtoine/.local/share/ghq/github.com/oberonforall/compiler -resource-dir /home/amtoine/.local/share/clang-15/lib/clang/15.0.5 -fdebug-compilation-dir=/home/amtoine/.local/share/ghq/github.com/oberonforall/compiler -ferror-limit 19 -fgnuc-version=4.2.1 -fcolor-diagnostics -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/foo-3d58f3.o -x ir foo.ll
1.  Code generation
2.  Running pass 'Function Pass Manager' on module 'foo.ll'.
3.  Running pass 'X86 Assembly Printer' on function '@foo.__main'
 #0 0x0000561dea8042d3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x5a942d3)
 #1 0x0000561dea80223e llvm::sys::RunSignalHandlers() (/home/amtoine/.local/share/clang-15/bin/clang-15+0x5a9223e)
 #2 0x0000561dea80466f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f6820651a00 (/usr/lib/libc.so.6+0x38a00)
 #4 0x0000561deb540a5c llvm::DwarfCompileUnit::constructSubprogramScopeDIE(llvm::DISubprogram const*, llvm::LexicalScope*) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x67d0a5c)
 #5 0x0000561deb528ce5 llvm::DwarfDebug::endFunctionImpl(llvm::MachineFunction const*) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x67b8ce5)
 #6 0x0000561deb5112f1 llvm::DebugHandlerBase::endFunction(llvm::MachineFunction const*) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x67a12f1)
 #7 0x0000561deb4ffc08 llvm::AsmPrinter::emitFunctionBody() (/home/amtoine/.local/share/clang-15/bin/clang-15+0x678fc08)
 #8 0x0000561de977e920 llvm::X86AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) X86AsmPrinter.cpp:0:0
 #9 0x0000561de9db21cc llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x50421cc)
#10 0x0000561dea17a43b llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x540a43b)
#11 0x0000561dea181663 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x5411663)
#12 0x0000561dea17affa llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x540affa)
#13 0x0000561deaeeef83 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x617ef83)
#14 0x0000561deb2b0ffe clang::CodeGenAction::ExecuteAction() (/home/amtoine/.local/share/clang-15/bin/clang-15+0x6540ffe)
#15 0x0000561deb1f4e29 clang::FrontendAction::Execute() (/home/amtoine/.local/share/clang-15/bin/clang-15+0x6484e29)
#16 0x0000561deb16c2f6 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x63fc2f6)
#17 0x0000561deb2aca9a clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x653ca9a)
#18 0x0000561de8a782d2 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x3d082d2)
#19 0x0000561de8a7664a ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#20 0x0000561de8a76451 clang_main(int, char**) (/home/amtoine/.local/share/clang-15/bin/clang-15+0x3d06451)
#21 0x00007f682063c290 (/usr/lib/libc.so.6+0x23290)
#22 0x00007f682063c34a __libc_start_main (/usr/lib/libc.so.6+0x2334a)
#23 0x0000561de8a72dba _start (/home/amtoine/.local/share/clang-15/bin/clang-15+0x3d02dba)
clang-15: error: unable to execute command: Segmentation fault (core dumped)
clang-15: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 15.0.5
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/amtoine/.local/share/clang-15/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.

additional context

compiling without debugging information

a simpler ./oberon compile foo.obn gives me

; foo.main.ll
target triple = "x86_64-unknown-linux-gnu"

declare i32 @foo.__main()

define i32 @main() {
  call i32 @foo.__main()
  ret i32 0
}

and

; foo.ll
target triple = "x86_64-unknown-linux-gnu"

@foo.k = dso_local global i32 0, align 4

define dso_local i32 @foo.__main() {
  store i32 0, i32* @foo.k, align 4

  ret i32 0
}

and

> clang-15 foo.main.ll foo.ll -o foo.elf
> ./foo.elf

works just fine :yum:

clang installation

i've installed the latest available precompiled binary from here, inside my ~/.local/share/ directory and the version is

> clang-15 --version
clang version 15.0.5
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/amtoine/.local/share/clang-15/bin

:ok_hand:

amtoine commented 1 year ago

actually, i think i found the real bug here :smirk:

the generation of my compiler looks ok, the only thing missing was...

the type field in the DISubprogram debug node for the foo.__main function :scream:

here is the diff to apply to the above foo.ll

diff --git a/foo.ll b/foo.ll
index c9dddce..e3e348e 100644
--- a/foo.ll
+++ b/foo.ll
@@ -29,13 +29,15 @@ target triple = "x86_64-unknown-linux-gnu"
   splitDebugInlining: false,
   nameTableKind: None
 )
-;!9 = !DISubroutineType()
+!100 = !{!3}
+!101 = !DISubroutineType(types: !100)

 !10 = distinct !DISubprogram(
   name: "main",
   scope: !0,
   file: !0,
   line: 3,
+  type: !101,
   scopeLine: 3,
   flags: DIFlagPrototyped,
   spFlags: DISPFlagDefinition,

and i get no errors on

> clang-15 foo.main.ll foo.ll -o foo.elf

and the result i wanted with gdb, that is

> gdb --silent --ex "list" --ex "quit" foo.elf
Reading symbols from foo.elf...
1   MODULE foo;
2       VAR k: INTEGER;
3   BEGIN
4       k := 0
5   END foo.

hurray :tada: :star_struck:

llvmbot commented 1 year ago

@llvm/issue-subscribers-debuginfo

dwblaikie commented 1 year ago

Yeah, looks like the LLVM IR verifier could be extended to document/test this property to give a more reliable/understandable failure.

amtoine commented 1 year ago

Yeah, looks like the LLVM IR verifier could be extended to document/test this property to give a more reliable/understandable failure.

that would be awesome :star_struck:

an idea

for instance, when i remove a mandatory field, e.g. the file in DICompileUnit, clang screams

> clang-15 foo.main.ll foo.ll -o foo.elf
foo.ll:37:1: error: missing required field 'file'
)
^
1 error generated.

this is nice and clear :yum:

llvmbot commented 1 year ago

@llvm/issue-subscribers-good-first-issue

dwblaikie commented 1 year ago

Yeah - would probably be fairly easy to implement - though perhaps there's some hidden complications (like maybe it's common not to have the type on some DISubprograms, but not others?) - someone should give it a go and see what happens :)

amtoine commented 1 year ago

Yeah - would probably be fairly easy to implement - though perhaps there's some hidden complications (like maybe it's common not to have the type on some DISubprograms, but not others?) - someone should give it a go and see what happens :)

the type field of DISubprogram is a mirror of the signature of the function whose "scope" will be this very DISubprogram, right? does it make sense not to have the signature of the function? :open_mouth:

i'm faaar from an expert, so i might very likely miss something here :wink:

pogo59 commented 1 year ago

You're feeding textual IR to clang (.ll file, not a .bc file)? I'd put the blame on the IR parser, in that case; it should have opinions about what fields are required or optional, and reject it immediately.

amtoine commented 1 year ago

You're feeding textual IR to clang (.ll file, not a .bc file)?

yes it is automatically generated plain text IR so .ll plain text files instead of bitcode .bc files :+1:

I'd put the blame on the IR parser, in that case; it should have opinions about what fields are required or optional, and reject it immediately.

would be sensible :relieved:

pogo59 commented 1 year ago

On the plus side, that parser (at least as far as the debug-info metadata is concerned) is not too hard to navigate. I'd still consider it a good first issue if you wanted to tackle it. Probably you would rather get back to your own project, that's fair too.

amtoine commented 1 year ago

@pogo59 that's interesting :thinking:

i'll get back to my project for the time being, but i'm quite interested in having a look at that :yum:

kzhuravl commented 10 months ago

Looks like this has been quiet since the end of Dec. Assigning to @epilk, let me know if someone else is looking into this.

epilk commented 10 months ago

Setting the type field to REQUIRED in LLParser.cpp causes ~150 lit test failures, but I think it shouldn't be too bad to update them programmatically. Should we start enforcing this property? I guess the alternative is to allow type to be optional/nullable in DISubprogram; it seems like the DWARF emitter can tolerate having this field missing (after guarding the place where it's currently crashing with an if (SP->getType() != nullptr)). I'm totally new to debug info, so I'm not sure what the better option is!
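A rough sketch of what updating the affected tests programmatically might look like, in Python. Everything here is an assumption rather than what was actually done: the helper name `add_placeholder_types` is hypothetical, the empty `!DISubroutineType(types: !{})` node is only a placeholder signature, and the regex deliberately skips DISubprogram nodes that contain nested inline nodes.

```python
import re

# A DISubprogram node whose body has no parentheses outside string literals.
# Nodes with inline nested nodes are not matched and are left untouched.
SUBPROGRAM = re.compile(r'!DISubprogram\(((?:[^()"]|"(?:[^"\\]|\\.)*")*)\)')
STRING = re.compile(r'"(?:[^"\\]|\\.)*"')


def add_placeholder_types(ir_text):
    """Give every DISubprogram without a type field a placeholder type.

    Returns the patched text and the number of nodes that were changed.
    """
    # Pick a metadata id that is not used anywhere in the file.
    used = [int(n) for n in re.findall(r"!(\d+)", ir_text)]
    new_id = max(used, default=-1) + 1
    patched = 0

    def patch(match):
        nonlocal patched
        body = match.group(1)
        # Blank out string contents so a name containing "type:" cannot confuse us.
        if re.search(r"\btype:", STRING.sub('""', body)):
            return match.group(0)
        patched += 1
        separator = ", " if body.strip() else ""
        return f"!DISubprogram(type: !{new_id}{separator}{body})"

    result = SUBPROGRAM.sub(patch, ir_text)
    if patched:
        # Placeholder signature (an assumption): an empty types tuple.
        result = result.rstrip("\n") + f"\n!{new_id} = !DISubroutineType(types: !{{}})\n"
    return result, patched
```

The new field is put first in the node, which should be harmless because the fields are named. CHECK lines in the tests that match on the DISubprogram text would still need a manual look.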

dwblaikie commented 10 months ago

I guess maybe what I was thinking/alluding to in https://github.com/llvm/llvm-project/issues/59471#issuecomment-1349056404 was that possibly in some cases we legitimately produce DISubprograms without a type (maybe unprototyped functions in C?). If that were the case, maybe it'd show up in clang's test cases, rather than LLVM's - if we ran the verifier on clang's generated IR, which I believe we don't generally/as part of the normal flow of things.

So it might be worth trying to see if there's a way to turn on the IR verifier for clang's IRGen pipeline and run check-clang and see if anything shows up?

Looking at only the lit test failures in https://github.com/llvm/llvm-project/issues/59471#issuecomment-1824971184 - checking if any of those are fully generated from clang, or possibly hand or machine reduced? if they're all hand/machine reduced, they might not be representative, but if some are full IR produced by clang, with comments describing the original source - maybe check if Clang still produces IR that looks like that & then we might know more about when this situation is normal and try to conclude what's different about the normal/accepted cases, and the ones you came across? It might be that a null type is only invalid in certain situations - and maybe we could catch those in the verifier without breaking many llvm tests, and without breaking clang.

epilk commented 10 months ago

Ok, thanks! Running check-clang with a verifier that checks for null type fields causes no issues (looks like the verifier is run by default). I noticed a few failing lit testcases that seem to be clang output (1, 2), but I can't seem to convince (even old) versions of clang to ever generate a null type, so my best guess is that these tests are slightly hand-reduced. Unprototyped C functions still have a DISubroutineType. So it seems like the only place null type fields are ever actually generated are in lit test cases. ISTM that we could get away with enforcing non-null types.

I guess the next question is should we? It seems the DWARF writer can almost handle a null type, and the DWARF5 standard says the DW_AT_type attribute is only required for non-void returning subprograms (DWARF5 3.3.2). I'm happy to implement whichever; I guess I'm leaning towards requiring the type to be present since that seems to be the LLVM representation, I just don't have a good justification for that! Do you have a stronger opinion either way?

1: https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/WholeProgramDevirt/devirt-single-impl.ll#L43
2: https://github.com/llvm/llvm-project/blob/main/llvm/test/DebugInfo/MIR/X86/debug-call-site-param.mir#L111

pogo59 commented 10 months ago

the DWARF5 standard says the DW_AT_type attribute is only required for non-void returning subprograms (DWARF5 3.3.2).

Yeah, that has always been the case with regard to the DWARF description.

As far as the IR goes, I'd wonder how the various non-Clang front-ends to LLVM (Rust, Julia, flang, whatever) describe no-return-value functions; do they emit a type that's null, or not provide a type at all? (Does DIBuilder allow you not to provide a type? If DIBuilder makes you provide a type, then the textual IR should also require a type.)

dwblaikie commented 10 months ago

The difference is that the way we encode the debug info in the IR is that the type parameter of the DISubprogram is not the return type, but the whole function type. I think we carry a bunch of other stuff on that function type too - probably things like rvalue ref/lvalue overloading, const for const member functions, etc. (hmm, seems we do carry const there, but we redundantly encode rvalue-ness between the function type and a flag on the DISubprogram)

There's probably some tech debt that could be paid down in normalizing these things - but for now, I guess, making sure type is provided, and that it's a DISubroutineType, would probably be good.
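To make the "whole function type" point concrete, here is a small Python sketch of how a front-end might print the nodes for a signature. The types tuple lists the return type first, followed by the parameters, and a function with no return value uses null in the first slot. The helper name `subroutine_type_nodes` and the node numbering are made up for illustration.

```python
def subroutine_type_nodes(first_id, return_type, param_types):
    """Render the two metadata nodes that describe a function signature.

    return_type is a metadata reference such as "!3", or None for a
    function that returns nothing; param_types is a list of references.
    Two consecutive ids are used: first_id for the tuple and
    first_id + 1 for the DISubroutineType itself.
    """
    # The first element is the return type; null stands for "no return value".
    elements = ["null" if return_type is None else return_type, *param_types]
    joined = ", ".join(elements)
    tuple_node = f"!{first_id} = !{{{joined}}}"
    type_node = f"!{first_id + 1} = !DISubroutineType(types: !{first_id})"
    return [tuple_node, type_node]
```

For example, `subroutine_type_nodes(100, "!3", [])` gives the same two nodes as the fix earlier in this thread (`!100 = !{!3}` and `!101 = !DISubroutineType(types: !100)`), and the DISubprogram then points at the second one with `type: !101`.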

pogo59 commented 10 months ago

The difference is that the way we encode the debug info in the IR is that the type parameter of the DISubprogram is not the return type, but the whole function type.

Aha. In that case I agree it's reasonable to consider a null type to be broken metadata, and make it mandatory.

dwblaikie commented 10 months ago

the type field of DISubprogram is a mirror of the signature of the function whose "scope" will be this very DISubprogram, right? does it make sense not having the signature of the function?

Only place I could think of maybe we wouldn't have a subroutine type for a subprogram would be an unprototyped C function, but even then we still emit a subroutine type and use a flag (well, the absence of the prototyped flag) to specify that a function lacks a prototype. So, yeah, looking like subprograms should always have an associated subroutine type.

(I'm still not sure that clang does verify the IR when it's emitting it, but if you're pretty sure it does/that you've checked a fair amount of IR and nothing fails this verification - I'm OK with moving forward with making "subprograms must have types and they must be a subroutine type" a verifier check/failure)