Open llvmbot opened 5 years ago
I think some of the steps Andy is describing need to be implemented in LLVM's JIT code. I don't think you can take them all as a user. Unfortunately, I don't think anyone is actively working on making the JIT handle COFF right now.
So... All I can do is wait? :0 I'm not experienced enough to make the changes myself - as mentioned, I'm just a user of LLVM. Currently we work around this issue by using the clang compiler, but this will not help with object or library files that do not come from us.
Is there a way to manage this with LLVM as a 'user' - or do I have to change the LLVM source code?
Unfortunately it's been quite a while since I tried this, so I don't remember the details anymore. I recall making some changes in the Windows relocation processing, so if you look for those you may get some clues.
__ImageBase is set to the base address of the executable, so that (among other things) jump table offsets can be encoded compactly using ADDR32NB relocations.
E.g., if you dump the relocations in your text section, you see:
RELOCATIONS #6
    Offset    Type      Applied To  Symbol Index  Symbol Name
    0000000C  REL32     00000000    8             ?myInt@@3HA (int myInt)
    00000013  REL32     00000000    59            __ImageBase
    00000026  REL32     00000000    55            ??_C@_0L@KCGKBKCO@?$CFllu?4?$CJ?5?$CFi?6?$AA@ (`string')
    00000031  REL32     00000000    1C            printf
    00000038  REL32     00000000    8             ?myInt@@3HA (int myInt)
    00000045  ADDR32NB  00000000    2C            $LN22
    0000004C  ADDR32NB  00000000    2D            $LN23
    0000008A  REL32     00000000    8             ?myInt@@3HA (int myInt)
    00000090  REL32_1   00000000    B             ?Initialized@@3_NA (bool Initialized)
    000000A8  ADDR32NB  00000000    2E            $LN9
    000000AC  ADDR32NB  00000000    2F            $LN10
    000000B0  ADDR32NB  00000000    30            $LN11
    000000B4  ADDR32NB  00000000    31            $LN12
    000000B8  ADDR32NB  00000000    32            $LN13
    000000BC  ADDR32NB  00000000    33            $LN14
    000000C0  ADDR32NB  00000000    34            $LN15
    000000C4  ADDR32NB  00000000    35            $LN16
Here the __ImageBase and ADDR32NB relocations must be resolved in a consistent fashion, so that a sequence like the following works:
    lea rdi, OFFSET FLAT:__ImageBase
    movzx ecx, BYTE PTR $LN22@Initialize[rdi+rax] ; ADDR32NB
    mov edx, DWORD PTR $LN23@Initialize[rdi+rcx*4] ; ADDR32NB
    add rdx, rdi
    jmp rdx
Since you're not building an executable you need to emulate handling these sorts of executable-related relocations.
For example: ensure that all the constituent loadable parts of the object are placed within a 4 GB range (as they would be if they were part of an executable). Then resolve __ImageBase to the address of the lowest loaded part, and resolve each ADDR32NB as the delta between the relocation target and __ImageBase.
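The steps above can be sketched roughly as follows. This is only an illustration of the arithmetic, not LLVM code: `LoadedSection`, `pickImageBase` and `resolveAddr32NB` are hypothetical names, and the sketch assumes the JIT already knows where each section was loaded and that the addend is the value stored in the relocated field.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of one loaded section: where the JIT placed it and how big it is.
struct LoadedSection {
    std::uint64_t loadAddress;
    std::uint64_t size;
};

// Picks the lowest load address as the emulated __ImageBase.
// Returns false if there are no sections or if they do not all fit
// into a 4 GiB window starting at that address.
bool pickImageBase(const std::vector<LoadedSection>& sections, std::uint64_t& imageBase) {
    if (sections.empty())
        return false;
    std::uint64_t lowest = sections.front().loadAddress;
    std::uint64_t highestEnd = lowest + sections.front().size;
    for (const LoadedSection& s : sections) {
        lowest = std::min(lowest, s.loadAddress);
        highestEnd = std::max(highestEnd, s.loadAddress + s.size);
    }
    if (highestEnd - lowest > 0x100000000ULL)
        return false; // some offsets would not fit into 32 bits
    imageBase = lowest;
    return true;
}

// Value to store for an ADDR32NB relocation: the 32-bit offset of the
// target (plus addend) from the emulated image base.
std::uint32_t resolveAddr32NB(std::uint64_t target, std::uint64_t imageBase, std::uint32_t addend) {
    assert(target >= imageBase && "target must not lie below the image base");
    std::uint64_t delta = target - imageBase + addend;
    assert(delta <= 0xFFFFFFFFULL && "offset must fit into 32 bits");
    return static_cast<std::uint32_t>(delta);
}
```

With the same `imageBase` also returned for the `__ImageBase` symbol, the `add rdx, rdi` in the sequence above would reconstruct the absolute jump target from the stored 32-bit offset.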
I know there are people who work on Windows/COFF, and I know there are people who work on MCJIT, but I don't know if there's anyone who works on both. Adding a couple of random people though just in case someone else does happen to know.
Extended Description
Hello LLVM-Team,
I used the new LLVM 7 to write a small and simple JIT client, which loads bitcode files, JITs them and executes them. In this JIT process I also include some object files which were generated by Visual Studio 2017 - but sadly the resulting code crashes. I did some research and will try to explain what I've done and what my conclusions are.
1.) Generating the Visual Studio object file
All I do is simply compile the file "CM_Switch.cpp" - as it is attached to this report - and that's all. I use the following compile flags:
    /nologo /FAcs /Zc:wchar_t- /GS- /MT /W3 /O2 /I "....\include" /I "....\external\include" /D "WIN32" /D "_CRT_NON_CONFORMING_SWPRINTFS" /D "_CRT_NONSTDC_NO_DEPRECATE" /D "_CRT_SECURE_NO_WARNINGS" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /Fp"$(OutDir)%(Filename).pch" /Fo"$(OutDir)%(Filename).obj" /c $(ProjectName).cpp
2.) JIT Client
The JIT client first parses a .bc file that does not contain any code - I just compiled an empty .cpp document with clang and enabled generating a .bc file. So the .bc file is not empty, but has no executable code or anything. After this I locate the CM_Switch.obj file and add it via "addObjectFile":
    llvm::Expected<std::unique_ptr<llvm::object::ObjectFile>> preObj = llvm::object::ObjectFile::createObjectFile(ArBuf.get()->getMemBufferRef());
    refEngine->addObjectFile(llvm::object::OwningBinary<llvm::object::ObjectFile>(std::move(preObj.get()), std::move(ArBuf.get())));
When generating the executable code, the JIT client is asked to resolve some references and returns their addresses as they are. But executing the "Initialize2" function crashes the application.
Investigations: With the CM_Switch.cod file and a debugger I was able to locate the root of the problem! Assembly instructions like these:
    lea r8, OFFSET FLAT:__ImageBase
The problem comes from "OFFSET FLAT" which - as I understood - determines the offset of the current instruction to that reference, in this case "__ImageBase". But this is not handled correctly! When I pass an address for "__ImageBase", the application crashes at EXACTLY the address I passed. When I return 0xFF as the address, it crashes at address 0xFF; if I pass the address of __ImageBase, it crashes there. If I pass the address of a function, then this function actually gets executed. It seems to me that this code gets replaced with a jump, which is totally wrong.
That is all I can say.
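For reference, the `lea` against __ImageBase is covered by a REL32 relocation in the dump, which on x64 COFF is a 32-bit displacement relative to the byte following the relocated field (REL32_1 is relative to one byte further). Below is a minimal sketch of that arithmetic, assuming the caller supplies the address of the 4-byte field and the resolved symbol address; `computeRel32` and its parameters are hypothetical names, not an LLVM API. It only illustrates why a symbol resolved more than about 2 GB away from the code cannot be encoded directly; it does not claim to explain the crash described above.

```cpp
#include <cstdint>
#include <limits>

// Computes the 32-bit PC-relative displacement for a REL32-style relocation.
//   fieldAddress:  load address of the 4-byte field being patched
//   target:        resolved address of the symbol (e.g. the emulated __ImageBase)
//   addend:        value originally stored in the field
//   trailingBytes: 0 for REL32, 1 for REL32_1, and so on
// Returns false if the displacement does not fit into a signed 32-bit value.
bool computeRel32(std::uint64_t fieldAddress, std::uint64_t target,
                  std::int64_t addend, unsigned trailingBytes,
                  std::int32_t& displacement) {
    // Unsigned arithmetic wraps, so the difference can safely be reinterpreted as signed.
    std::uint64_t next = fieldAddress + 4 + trailingBytes;
    std::int64_t value =
        static_cast<std::int64_t>(target + static_cast<std::uint64_t>(addend) - next);
    if (value < std::numeric_limits<std::int32_t>::min() ||
        value > std::numeric_limits<std::int32_t>::max())
        return false; // target is out of range for a 32-bit displacement
    displacement = static_cast<std::int32_t>(value);
    return true;
}
```

A resolver that hands back arbitrary addresses "as they are" would have to make sure this check succeeds for every REL32 against a data symbol.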