ldc-developers / ldc

The LLVM-based D Compiler.
http://wiki.dlang.org/LDC
Other
1.21k stars 261 forks source link

Error: piece is larger than or outside of variable call void @llvm.dbg.value(metadata i64 %29, i64 0, metadata !1479, metadata !7942), !dbg !7923 #1933

Closed ximion closed 7 years ago

ximion commented 7 years ago

Using LDC 1.1 Beta 6, all packages from Debian Unstable.

When compiling a certain part of Vibe.d, LDC fails with

piece is larger than or outside of variable
  call void @llvm.dbg.declare(metadata %router.MatchGraphBuilder.TerminalTag* %.frame.sroa.3, metadata !106, metadata !400), !dbg !401
!106 = !DILocalVariable(name: "builder", scope: !105, file: !4, line: 18, type: !51)
!400 = !DIExpression(DW_OP_bit_piece, 64, 192)
LLVM ERROR: Broken module found, compilation aborted!

A minimal testcase for this bug (likely a regression since this worked before) is attached. testcase.tar.gz

This issue only happens when compiling with optimization.

kinke commented 7 years ago

Thanks for the nice small testcase! I don't have time for this though atm...

JohanEngelen commented 7 years ago

I cannot reproduce this on OS X. What LLVM version is LDC built with? (please paste the full output of ../bin/ldc2 --version)

thewilsonator commented 7 years ago

I also couldn't repro on OS X.

ximion commented 7 years ago

I am currently at work, but I have the same Debian system here, so except for the CPU the version output should be the same:

LDC - the LLVM D compiler (1.1.0):
  based on DMD v2.071.2 and LLVM 3.8.1
  built with LDC - the LLVM D compiler (1.0.0)
  Default target: x86_64-pc-linux-gnu
  Host CPU: ivybridge
  http://dlang.org - http://wiki.dlang.org/LDC

  Registered Targets:
    aarch64    - AArch64 (little endian)
    aarch64_be - AArch64 (big endian)
    amdgcn     - AMD GCN GPUs
    arm        - ARM
    arm64      - ARM64 (little endian)
    armeb      - ARM (big endian)
    bpf        - BPF (host endian)
    bpfeb      - BPF (big endian)
    bpfel      - BPF (little endian)
    cpp        - C++ backend
    hexagon    - Hexagon
    mips       - Mips
    mips64     - Mips64 [experimental]
    mips64el   - Mips64el [experimental]
    mipsel     - Mipsel
    msp430     - MSP430 [experimental]
    nvptx      - NVIDIA PTX 32-bit
    nvptx64    - NVIDIA PTX 64-bit
    ppc32      - PowerPC 32
    ppc64      - PowerPC 64
    ppc64le    - PowerPC 64 LE
    r600       - AMD GPUs HD2XXX-HD6XXX
    sparc      - Sparc
    sparcel    - Sparc LE
    sparcv9    - Sparc V9
    systemz    - SystemZ
    thumb      - Thumb
    thumbeb    - Thumb (big endian)
    x86        - 32-bit X86: Pentium-Pro and above
    x86-64     - 64-bit X86: EM64T and AMD64
    xcore      - XCore

EDIT: I can reproduce the bug on this system.

JohanEngelen commented 7 years ago

Reproduced with LLVM 3.8, the bug is fixed in LLVM 3.9.

Further simplified the testcase: ldc2 -O1 -g -release -c gh1933.d

struct A
{
}

A[] a;
void rebuildGraph()
{
    import std.algorithm;

    MatchGraphBuilder builder;
    uint process(size_t n)
    {
        foreach (t; builder.m_nodes[n].terminals)
            assert(a.canFind!(u => t.index));
        return 1;
    }
}

struct MatchGraphBuilder
{
    struct TerminalTag
    {
        uint index;
        string var;
    }

    struct Node
    {
        TerminalTag[] terminals;
    }

    Node[] m_nodes;
}
ximion commented 7 years ago

While LLVM 3.8 is default in Debian, we can compile LDC with LLVM 3.9 explicitly if that helps.

JohanEngelen commented 7 years ago

Yeah. (I don't think it is smart to spend our limited time on trying to figure out how to work around a bug in a (not so) old LLVM version)

kinke commented 7 years ago

I don't think it is smart to spend our limited time on trying to figure out how to work around a bug in a (not so) old LLVM version

Definitely not smart, especially for an issue only applying to optimized builds with debuginfos.

ximion commented 7 years ago

We now build LDC against LLVM 3.9 in Debian, and I can confirm that this resolves the issue.

Since we know about this bug and will not have a workaround, maybe the second best option is to raise the minimum required version of LDC or at lest print a very big warning when someone has an old LLVM. (I would go with making the build fail though, otherwise a message like this would get missed too easily).

kinke commented 7 years ago

Well it appears to be a LLVM bug, and it's not the only one and won't be the last. Dropping support for LLVM 3.5-3.8 just because of this is not an option; there are interesting LLVM forks lagging behind (e.g., Khronos' SPIR-V for OpenCL, which is based on LLVM 3.5 afaik), and maintaining the older versions hasn't gotten extremely tedious yet. LDC has enough issues :], one more due to LLVM won't change the world. I'm not even sure the crash occurs if LLVM assertions are disabled (in that case, at least the debugger may have a hard time...).

ximion commented 7 years ago

Still, it's a bug that people will run into and then will waste time to find solutions for... (wanting to compile Vibe.d with optimizations isn't a very unlikely event) That's a pretty unfortunate state.

Looks like the LDC build with bumped LLVM fails on ppc64el... Although it doesn't look like being related to LLVM, something else is going on... https://buildd.debian.org/status/fetch.php?pkg=ldc&arch=ppc64el&ver=1%3A1.1.0%2Bb6-2&stamp=1482261468

ximion commented 7 years ago

Bah, looks like even with LDC compiled against LLVM 3.9 this issue still persists, only the testcase I provided works. Looks like I need to fire Dustmite at the segment again...

JohanEngelen commented 7 years ago

blegh.

Much obliged.

JohanEngelen commented 7 years ago

btw, perhaps my comments were misleading, but I think it's still an open question whether this an LLVM bug or an LDC bug

ximion commented 7 years ago

Here is a new testcase - I only had time today to run Dustmite on the project again.

LDC - the LLVM D compiler (1.1.0):
  based on DMD v2.071.2 and LLVM 3.9.1
  built with LDC - the LLVM D compiler (1.1.0)
  Default target: x86_64-pc-linux-gnu
  Host CPU: haswell

testcase2.tar.gz

JohanEngelen commented 7 years ago

testcase2 also crashes with LLVM trunk

JohanEngelen commented 7 years ago

Minimized testcase a little:

struct MatchTree
{
    struct Tag
    {
    }

    struct Terminal
    {
        string varNames;
    }

    Tag[] m_terminalTags;
    Terminal term;

    void rebuildGraph()
    {

        MatchGraphBuilder builder;
        uint process()
        {
            auto aaa = m_terminalTags.length;
            foreach (t; builder.m_nodes)
            {
                import std.algorithm;

                auto bbb = term.varNames.countUntil(t.var);
                assert(m_terminalTags.canFind!(u => t.index));
            }
            return 0;
        }
    }
}

struct MatchGraphBuilder
{
    struct TerminalTag
    {
        size_t index;
        string var;
    }

    TerminalTag[] m_nodes;
}

ldc2 -c -g -O3 -release router.d

kinke commented 7 years ago

Note that the error can be suppressed via -disable-verify and compilation works fine.

The IR for router.MatchTree.rebuildGraph.process() diverges hugely between -O0 and -O1 (suffices to trigger the error). Specifically, all 4 remaining llvm.dbg.declare() intrinsic calls with -O1 use an undef alloca as 1st argument. And with -O0, there's no single llvm.dbg.value() call. What probably plays a role is that we have 2 nested frames here, one for rebuildGraph() (this and builder captured) and one for process() (loop variable t captured by lambda for canFind()).

We use a GEP as 1st arg when llvm.dbg.declare()ing t (2nd field in nested process() frame, after a pointer to the rebuildGraph() frame), and that's no alloca, and I recall some discussion about GEPs not being supported in a rather recent debuginfo PR. As a side note, for the first field of a frame, we use the frame alloca directly. So the way to go is probably using the frame alloca, but telling LLVM somehow else that an offset is to be applied.

JohanEngelen commented 7 years ago

relevant: https://github.com/ldc-developers/ldc/pull/1646

JohanEngelen commented 7 years ago

and: https://github.com/ldc-developers/ldc/pull/1598

JohanEngelen commented 7 years ago

Maybe I know what's going on. We are defining a debug location with an offset, but... LLVM/Dwarf wants the offset in bytes, and we give an offset as a struct field index do calculate the offset within the struct correctly, so I no longer understand what's going wrong.

  const auto idx = irLocal->nestedIndex;
  assert(idx != -1 && "Nested context not yet resolved for variable.");

  if (dwarfValue && global.params.symdebug) {
    gIR->DBuilder.OpOffset(dwarfAddr, val, idx);
  }

If I apply the same type of fix as #1646 , things no longer crash. I'll submit a PR so you can look at it, and to see if the CI tests pass.

kinke commented 7 years ago

Great, so it's exactly the other way around, GEPs all the way. :)

kinke commented 7 years ago

OpOffset() forwards to a version looking up the field offset in bytes. If it's not bits it wants, it may just be a LLVM bug/limitation wrt. Dwarf ops. Edit: Well, what we do here is push DW_OP_plus, <offset in bytes>. Not sure if the size needs to be adjusted as well (DW_OP_bit_piece).

JohanEngelen commented 7 years ago

It's the same bug that I worked around in #1646 , perhaps... documentation is very scarce :(

JohanEngelen commented 7 years ago

@ximion With the changes in #1963 , are you able to compile vibe.d as you originally wanted? Thanks for testing.

kinke commented 7 years ago

Should be fixed by #1963.

ximion commented 7 years ago

Yes, everything is working fine and Vibe.d compiles without issues. Thank you! :-)