Closed mvhooren closed 2 years ago
mentioned in issue llvm/llvm-bugzilla-archive#44700
Great! Thank you for checking this out Machiel.
Closing as fixed by 84217ad6611.
I must add that I do not use LLJIT but my own JIT implementation using ORC.
Hi Lang,
I've tested my repro case on Windows with both debug and release builds without issue, using the unmodified current master branch.
I have also tested the IR posted in http://bugs.#44337 by loading the IR from a file, adding the module to a dylib, then getting the address to the 'calculate' function and executing it. This also works without problems, although there is an access violation when executing the compiled IR due to the usage of @cachedValue. (store double %4, double* @cachedValue, align 8). Removing that line will produce valid output.
Machiel
Hi Machiel,
Can you confirm that 84217ad66115cc31b184374a03c8333e4578996f fixes this bug?
It looks like this might be related to http://bugs.#44337 -- I want to make sure I figure out what has been fixed and what's still broken.
-- Lang.
Setting priority back down. This isn’t ideal, but it’s not a release blocker: ORCv1 and MCJIT are still both in LLVM10.
....
Machiel — Could you see if this patch fixes your issue? Unfortunately I don’t have a windows machine to test execution of COFF objects.
— Lang.
Roger on it not being a release blocker. I'm not really familiar with the criteria for what should warrant a release blocker, sorry about that.
I have tested your patch and everything seems to work fine so great success! I've also verified that the issue still exists in unpatched ORC.
Setting priority back down. This isn’t ideal, but it’s not a release blocker: ORCv1 and MCJIT are still both in LLVM10.
Confirming the earlier analysis: __real@40000000 is (a COMDAT “any” symbol) for a constant pool entry that is created by MC. We’re failing when two materialization units create the same symbol late in the pipeline and both try to register it.
An ideal solution to this would involve teaching the JIT, libObject, and the JIT-linker about COMDAT symbols. That’s not going to happen for a while though.
In the mean time I have attached a patch for an possible workaround. This patch modifies defineMaterializing to support defining new weak symbols (we’ll want to take this part either way), then adds a hack to RTDyldObjectLinkingLayer to detect COFF objects, look for newly defined symbols, and then mark any newly defined symbols in COMDAT sections as weak.
Machiel — Could you see if this patch fixes your issue? Unfortunately I don’t have a windows machine to test execution of COFF objects.
— Lang.
I have marked this bug as a release blocker because it causes a crash when using ORC on Windows in a very trivial case.
Hi Lang,
Yes, on ORCv2 it prints the same error message to the console and then crashes shortly afterwards. My target tripple is x86_64-pc-windows-msvc.
The crash happens in RTDyldObjectLinkingLayer::onObjEmit when it tries to destroy a MemoryBuffer (the ObjBuffer).
Let me know if you need anything else. I could create a minimal repro case using the Kaleidoscope examples if you want.
Hi Machiel,
Oh -- this is awful. :)
Does it manifest with the same error on ORCv2? Or with a different error?
The relevant parts of RTDyldObjectLinkingLayer/RuntimeDyld's algorithm look something like this:
(1) scan the object file, noting weak definitions in a side table (2) find out which weak definitions we are responsible for (3) perform the link (4) if auto-claim is on, add all provided definitions to the definition set (5) register definitions
The issue likely has something to do with the fact that determine responsibility for materializing weak defs in (2), but don't claim it until (4). Though I would have expected this to lead to a missing definition error, rather than a duplicate.
We probably need to add a new API to MaterializationResponsibility: "defineMaterializingWeak". Whereas "defineMaterializing" generates an error on duplicates, "defineMaterializingWeak" would silently continue without adding duplicates to the symbol table.
Could you include the target triple that you're seeing in your test cases so that I can make sure I match your setup?
This bug is still reproducible with ORCv2
Extended Description
It happens when the same literal is defined twice, each in a different module and after the second module has been added to the RTDyldObjectLinkingLayer.
So for example in some module:
define float @someFunction() { entry: ret float 2.000000e+00 }
In some other module: define float @someOtherFunction() { entry: ret float 2.000000e+00 }
When materializing the second function, you will get this error:
JIT session error: Duplicate definition of symbol '__real@40000000'
The symbol names come from comdat symbols that are created in TargetLoweringObjectFileImpl.cpp TargetLoweringObjectFileCOFF::getSectionForConstant
For COFF, I have the OverrideObjectFlagsWithResponsibilityFlags and AutoClaimResponsibilityForObjectSymbols set to true on the RTDyldObjectLinkingLayer or my function symbols are not found at all.
The same code compiles and runs just fine on Linux/Elf, even with the COFF workarounds enabled there as well.
A workaround I found is to set HasCOFFComdatConstants to false in MCAsmInfoCOFF.cpp