llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

lld crash while linking certain files with (thin)LTO: linking module flags 'override-stack-alignment': IDs have conflicting values #60310

Open sztomi opened 1 year ago

sztomi commented 1 year ago

While upgrading our toolchain from 11.x to 15.0.7, I ran into a crash in lld. Immediately before the crash, this error is displayed:

LLVM ERROR: Function Import: link error: linking module flags 'override-stack-alignment': IDs have conflicting values in '/home/tamas/.conan_plex/.conan/data/x264/161-1086f45-26/plex/stable/package/3657df0fb8418322c0d0e70c61c2a8e673a57092/lib/libx264.a(api.o at 291080)' and 'libavcodec/libx264.o-libx264_encoder.o'

Everything is built with the same toolchain (although not necessarily with the same flags; I'm still digging). The printed stack trace is here: https://gist.github.com/sztomi/88bbdaaad52ce325418216b6aee83c7e

The minimal invocation (apart from -v) that reproduces the crash is this:

ld.lld -v -shared libx264.o-libx264_encoder.o -L. -lx264

I don't know how to minimize the input files, so I'm uploading them here in case they are useful: https://drive.google.com/file/d/1KOuPYxJzjFCeHQHudWg9zNJ9j6zOoCx2/view?usp=share_link

I built a debug version of lld, reproduced the crash, and got a core dump. This is the unpacked apport output that was collected during the crash, so it contains a couple of additional files that may be useful: https://drive.google.com/file/d/1Cu0pUE82ceTqzWkIggtaVAqmjKi-gdXP/view?usp=share_link

I realize this report is somewhat incomplete and hard to reproduce from these files alone, so any advice on what further information I can collect would be appreciated.

sztomi commented 1 year ago

I'm seeing a stack-alignment mismatch between x264 and ffmpeg:

x264:

nasm -I. -I. -DARCH_X86_64=1 -I./common/x86/ -f elf64 -DSTACK_ALIGNMENT=64 -DPIC -o common/x86/quant-a-8.o common/x86/quant-a.asm -DBIT_DEPTH=8 -Dprivate_prefix=x264_8

whereas ffmpeg seems to pass -mstack-alignment=16 (these are the defaults in each build; I'm not changing them). I will look into harmonizing them. I expect the crash to go away once they match, which would fix my problem, but I'm guessing lld should not crash on this in any case. I have found a mention of a somewhat similar issue where the participants were speculating about the same flag: https://github.com/HandBrake/HandBrake/issues/4650#issuecomment-1311377354

DavidTruby commented 1 year ago

Do you think this might be related to #59521? We see runtime crashes in the compiled binary there, rather than build-time crashes in lld, but it also seems to be due to LTO and stack alignment.

sztomi commented 1 year ago

@DavidTruby That's difficult to tell; unfortunately, I don't know the inner workings of LTO. My gut feeling is that it's not related, because this issue is a crash in lld itself whereas yours is a crash in the linked binaries (if I understand that correctly).

sztomi commented 1 year ago

I don't see it in the stack trace (it's probably inlined), but it looks like the error is coming from here in llvm/lib/Linker/IRMover.cpp:

    // If either flag has override behavior, handle it first.
    if (DstBehaviorValue == Module::Override) {
      // Diagnose inconsistent flags which both have override behavior.
      if (SrcBehaviorValue == Module::Override &&
          SrcOp->getOperand(2) != DstOp->getOperand(2))
        return stringErr("linking module flags '" + ID->getString() +
                         "': IDs have conflicting override values in '" +
                         SrcM->getModuleIdentifier() + "' and '" +
                         DstM.getModuleIdentifier() + "'");
      continue;
    } else if (SrcBehaviorValue == Module::Override) {
      // Update the destination flag to that of the source.
      overrideDstValue();
      continue;
    }

I'm guessing the error is new to me because of the change that made the stack alignment a module attribute, which the IRLinker now checks. What I don't know is whether it is really a problem to link code built with one stack alignment against code built with another. This used to work before the attribute existed.
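
To make the kind of check being discussed concrete, here is a minimal, self-contained sketch of how merging a per-module flag can reject mismatched values. It is a simplified model and not LLVM's actual API: `Behavior`, `Flag`, `ModuleFlags`, and `linkModuleFlags` are hypothetical names, only two behaviors are modeled, and the final branch assumes the flag is attached with "error" behavior, which the wording of the reported message suggests but this thread does not confirm.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical, simplified stand-ins for module flag behaviors.
// Only the two behaviors relevant to this discussion are modeled.
enum class Behavior { Error, Override };

struct Flag {
  Behavior behavior;
  std::uint64_t value;
};

using ModuleFlags = std::map<std::string, Flag>;

// Merge the flags of `src` into `dst`.
// Returns an error message on conflict, or std::nullopt on success.
std::optional<std::string> linkModuleFlags(ModuleFlags &dst,
                                           const ModuleFlags &src) {
  for (const auto &[id, srcFlag] : src) {
    auto it = dst.find(id);
    if (it == dst.end()) {
      // The flag only exists in the source: copy it over.
      dst.emplace(id, srcFlag);
      continue;
    }
    Flag &dstFlag = it->second;

    // If either flag has override behavior, handle it first.
    if (dstFlag.behavior == Behavior::Override) {
      if (srcFlag.behavior == Behavior::Override &&
          srcFlag.value != dstFlag.value)
        return "linking module flags '" + id +
               "': IDs have conflicting override values";
      continue;
    }
    if (srcFlag.behavior == Behavior::Override) {
      // Update the destination flag to that of the source.
      dstFlag = srcFlag;
      continue;
    }

    // Both flags use "error" behavior: differing values are rejected.
    if (srcFlag.value != dstFlag.value)
      return "linking module flags '" + id +
             "': IDs have conflicting values";
  }
  return std::nullopt;
}
```

In this model, merging a module whose `override-stack-alignment` is 64 into one where it is 16 returns the "conflicting values" message when both use the error behavior, while equal values merge cleanly. Whether a linker should treat such an error as fatal or recover from it is a separate question that the sketch does not address.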

Edit: rather than forcing a certain alignment on either ffmpeg or x264, I ended up reverting the change that makes the alignment a module attribute in our tree. So my issue is fixed, but I think this error should not lead to a crash in the first place (and, just looking at the code, I'm not sure why it does). Arguably the check is also too strict; at the very least there could be an opt-out.