Open llvmbot opened 5 years ago
Don't assume DSO-local also for Cygwin I made a patch according to Eli's suggestion in Comment 8. It seems to work for me.
echo "extern int a; int* f() { return &a; }" | clang -x c - -o - -S --target=x86_64-pc-mingw64
[...]
movq .refptr.a(%rip), %rax
[...]
.refptr.a:
.quad a
So it's using rip-relative addressing, but there's an extra level of indirection to allow it to point to an arbitrary address.
Make sure you're trying this with 8.0 or newer.
Thanks for the tip Eli, but I can't make it work.
When I try compiling with
clang --target=x86_64-pc-mingw64
on Linux clang or Cygwin clang, I get a 32-bit rip-relative address for external variables.
On Unix, clang currently doesn't really implement -mcmodel=medium
correctly: it essentially is just pretending you specified large
. That could be fixed, but it's not a high priority since almost nobody uses the medium code model, as far as I know.
Not exactly sure how that would apply to Windows, since the Unix medium
doesn't really do anything useful in a direct sense, but we could do something similar.
Your comment reminded me of what we currently do for MinGW. Sorry, I should have remembered this earlier. For those targets, we actually use a small
code model, but we have special handling for variables which we can't prove are DSO-local. You can try generating code with clang --target=x86_64-pc-mingw64
to see this. If we want cygwin to do the same thing, it's probably just a matter of going through the LLVM source code and replacing uses of isWindowsGNUEnvironment()
with isOSCygMing()
.
If you want to avoid unnecessary performance overhead in this setup, you can use ThinLTO (https://clang.llvm.org/docs/ThinLTO.html).
After a long debate on the cygwin mailing list and some reverse engineering I have come to the following conclusions:
Cygwin is using the medium memory model on Gcc and Clang (in 64 bit mode).
Cygwin is adding an extra link process when an EXE or DLL is loaded, in order to emulate the behavior of Linux shared objects. They call this pseudo-relocation, and it is undocumented. This emulation of shared objects includes the possibility to link directly to a variable in a different DLL. This requires a 64-bit address table (an extra GOT?). The medium model is necessary for using 64-bit addresses on external variables.
Cygwin works fine - and more efficiently - with a small memory model on any program that does not make direct access to a variable in a different DLL. The cygwin people are unwilling to reap this performance benefit.
Direct access to a variable in another DLL is arguably bad programming practice by modern standards, but it may occur in old Linux code. Cygwin is able to compile such old code.
The memory models work differently in Gcc and Clang. Gcc is using 64 bit address tables for external variables when compiling with a medium or large memory model. Clang is using 64 bit address tables on local static variables as well when compiling with a medium or large memory model. This includes floating point constants, string constants, array initializer tables, jump tables, and more. This is quite wasteful if not needed.
The small memory model works differently for different targets. It allows 32 bit absolute addresses on a Linux target, but not on a Windows or Mac target.
Can you please help clarify if the difference between Gcc and Clang for local static variables on the medium/large memory model is intended? If it is unintended, then Clang can be improved. If it is intended - and a good reason is given - then we can close this issue.
Whoever built the cygwin package must have modified the source code; there is no build option to change the default code model.
Currently, there isn't anyone in the LLVM community maintaining builds for cygwin. The most recent post I saw on llvm-dev is that LLVM doesn't build on a cygwin host. If someone wants to reach out to the LLVM community for help stabilizing the build for cygwin hosts/targets, I'd be happy to provide guidance, though.
Thanks for your comments.
The exact meaning of the various -mcmodel flags varies by target. Yes, apparently so, but this is undocumented. The SysV ABI says nothing about Windows, of course. And the Windows ABI says nothing about PIC.
There is no PIC concept in Windows. The COFF file format allows all addressing modes: 32-bit absolute, 32-bit relative, 32-bit image-relative (does not exist in ELF), 64-bit absolute, and more.
Compiling for PIC in Linux involves that all addresses go through a GOT or PLT. These don't exist in Windows.
I think it is a bad idea to turn off -mcmodel=medium
and large for Windows targets, because they may be useful for solving certain problems, and they definitely work.
The fact that -mcmodel=small
means something else in Windows needs to be documented.
Alternatively, I will propose to define a new value for mcmodel
that fits Windows. This model should limit the distance between code and static data to 2 GB so that relative addresses can be used, but allow locating a program at any address. This new memory model should not be called Windows, because it can be useful in both Linux, Mac OS, and Windows.
And what about Mac? I have not tried what -mcmodel=small
does with a Mac target.
Whether you decide to let -mcmodel=small
work differently for Windows targets - and document it - or define a new model, you need to tell the Cygwin people what to do. You cannot expect them to use the small model when the doc says that it allows 32 bit absolute addresses. Right now they are producing sub-optimal code with 64-bit addresses of all static data.
Also, looking briefly, there is no such thing as a "large" code model on Windows, because it's impossible to generate a binary that large anyway. https://docs.microsoft.com/en-us/cpp/build/x64-software-conventions?view=vs-2019 . We should probably reject -mcmodel=large on Windows targets, instead of using movabsq
.
The default code model is not something that can be configured with a build setting. Changing that would require directly patching the x86 backend.
In other words, -mcmodel=small works differently when a Windows target is specified.
Sort of.
The key difference is whether you're generating position-independent code. On Linux, PIC is disabled by default; you have to request it with -fPIC
or -fPIE
. On x86_64 Windows, -fPIC
is the default. IIRC, it's impossible to generate non-position-independent code on x86_64 Windows because the PE format doesn't support it.
The exact meaning of the various -mcmodel
flags varies by target. See https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf for a description of the "Small code model" and the "Small position independent code model" on x86-64. We could probably improve the clang documentation here.
Thanks for the comment Eli.
I cannot find a documentation of -mcmodel=small
for Clang. The Gcc documentation says:
-mcmodel=small
Generate code for the small code model: the program and its symbols must be linked in the lower 2 GB of the address space. Pointers are 64 bits. Programs can be statically or dynamically linked. This is the default code model.
This does not fit Win64, where program code is not guaranteed to be loaded below 2 GB (though it often is). This difference is seen when a static array is accessed. Linux code uses a 32-bit absolute address with a scaled index:
movl myarray(,%rax,4), %eax
While Windows needs to make a pointer:
leaq myarray(%rip), %rcx
movl (%rcx,%rax,4), %eax
I tried this.
clang --target=x86_64-pc-cygwin -mcmodel=small
This makes the correct code for Windows, where the array is addressed with a pointer. Clang with a Linux target and -mcmodel=small
uses the 32-bit absolute address method.
In other words, -mcmodel=small
works differently when a Windows target is specified. Is this difference documented? If not, who can blame the Cygwin people for setting -mcmodel=large
or medium?
Cygwin clang does the right thing when I set -mcmodel=small
The correct target for Cygwin should be --target=x86_64-pc-cygwin
, I think.
Address generation is controlled by the mcmodel flag;
-mcmodel=small
forces the small code model (rip-relative), -mcmodel=large
forces the large code model (movabsq
). The default code model should be small for all Windows targets, including cygwin, as far as I know.
Windows binaries of llvm and clang are available from llvm.org.
llvm.org doesn't distribute cygwin binaries. If the cygwin clang is behaving differently from the llvm.org clang binaries, you'll have to ask the maintainer of the cygwin package.
Extended Description
Cygwin Clang v. 5.0.1 is using 64-bit absolute addresses when accessing static data in 64-bit mode. This is inefficient because it requires an extra 10-bytes long instruction for loading an address into a register every time it needs to access static data.
Linux Clang v. 6.0.0 with
--target=x86_64-win64-windows
gives the more efficient relative addresses, as do all other compilers.Is this a Cygwin-only issue? I posted it to the cygwin mailing list, but they advised me to go here.
Test case:
Cygwin Clang assembly output:
Linux Clang assembly output with windows target: