Open kraj opened 5 years ago
Still crashing as of post-17 trunk: https://godbolt.org/z/7beb6xYrh I wonder if this is yet another instance of disabling SSE2 when its registers are required by x86_64 calling convention. Reduced by C-Reduce:
void print_diff_percent() {
__pr_out("%s%+7.2f%s%%", "", 999.99, "");
}
@llvm/issue-subscribers-backend-x86
CC @phoebewang
I wonder if this is yet another instance of disabling SSE2 when its registers are required by x86_64 calling convention.
I think so. GCC will still generate SSE2 instruction even though SSE2 is disabled. I think we just need a better diagnosis instead of crash directly.
CC @AaronBallman
Oh this is neat. https://godbolt.org/z/hTK96nWaT
Not asserting: -x c -Xclang -target-feature -Xclang -sse2
Asserting: -x c -Xclang -target-feature -Xclang -sse2 -Wno-implicit-function-declaration
So I think the issue is that the implicit function declaration is a K&R C function, so when we allow the definition to be implicitly declared, we trigger the backend assertion.
I wonder if the best approach here is to make it an error to attempt to disable sse2 (or any target features) that are required for ABI purposes for any given target? CC @MaskRay for opinions
This is likely a duplicate of #29774. I have a reply at https://github.com/llvm/llvm-project/issues/29774#issuecomment-1714348642
I do wish that we can reject -mno-sse
-mno-sse2
for x86-64, but the options are misused (and even by Linux kernel) and the projects may complain.
This is likely a duplicate of #29774. I have a reply at #29774 (comment) I do wish that we can reject
-mno-sse
-mno-sse2
for x86-64, but the options are misused (and even by Linux kernel) and the projects may complain.
I think people misusing the options is exactly why we should start to diagnose it (we've already gotten multiple reports from users about crashing, so this is something people are hitting repeatedly). Is there a way we can do a gentle introduction (like warn "this combination of options will be a hard error in Clang N") so that we can start moving the needle in the correct direction? Could @nickdesaulniers or @nathanchance help us with the Linux kernel side of things?
Could @nickdesaulniers or @nathanchance help us with the Linux kernel side of things?
I could help facilitate discussion with the x86 maintainers around this but without an equivalent way to ensure no FP code gets generated for the kernel, I suspect they won't be too receptive to killing this combination of options. I had to dig quite a bit to find out what change added -mno-sse -mno-sse2
to the x86_64 Makefile, as it is older than the kernel's git history... but here it is, with no much additional information to go off of :/ If a discussion would be helpful, I can kick it off if I have a list of people to rope in on the LLVM side.
I wonder if the best approach here is to make it an error to attempt to disable sse2 (or any target features) that are required for ABI purposes for any given target?
I do wish that we can reject -mno-sse -mno-sse2 for x86-64, but the options are misused (and even by Linux kernel) and the projects may complain.
The kernel's intent is to avoid having the compiler generate code using SIMD or FP registers. The Linux kernel generally does not use FP/SIMD as doing so adds overhead to pre-emption of kernel threads (and generally doesn't need FP support). There are a few exceptions.
So if clang were to reject -mno-sse -mno-sse2
for x86_64 while GCC doesn't AND while not providing the equivalent flag for "no FP/SIMD codegen please" then yeah that doesn't work for the Linux kernel's use case.
I wonder if this is yet another instance of disabling SSE2 when its registers are required by x86_64 calling convention. Reduced by C-Reduce:
If the user specifies every -mno- flag in the book, then tries to use floating point on x86, clang should diagnose that. That is the correct fix here IMO.
// $ clang -mno-sse2
void print_diff_percent() {
__pr_out("%s%+7.2f%s%%", "", 999.99, "");
^ error: calling convention requires sse2 for argument of type double, but -mno-sse2 set.
}
If the user specifies every -mno- flag in the book, then tries to use floating point on x86, clang should diagnose that. That is the correct fix here IMO.
I agree, but I don't think Clang has the information it needs to be able to diagnose this in general. Making it a backend diagnostic would perhaps be a good option if we had reasonable source location fidelity, but AFAIK, that's still an unsolved issue in LLVM.
The part I keep running around in my head is that -target x86_64 -mno-sse2
isn't a valid combination for C or C++; it's a language dialect. There is no such thing as x86-64 without SSE2, but there is still a valid need for x86-64 that doesn't generate fp or simd instructions. -m
flags are used for machine-dependent functionality and -f
flags are for machine independent functionality; but this need splits the middle in some ways which makes it awkward. But I can see the logic of -target x86_64 -mno-sse2
being the right way to tell the compiler "I want to target x86-64 but without these kinds of instructions". The alternative would be adding language dialect flags that are almost the same thing as what our existing -m
flags already do, but making it more the frontend's concern than the backend's.
uftrace
also uses -mno-sse2
to prevent compiler from using FP/SIMD registers.
The use case is to capture FP/SIMD register values when it traces functions. Basically uftrace
(actually libmcount.so
which is preloaded for the target binary) intercepts mcount
calls and it can capture (original) function's arguments.
For FP/SIMD arguments, it needs to read the register value when it records the trace data. It may save all register values (including FP/SIMD) at the beginning of mcount
and read the saved values instead, but it doesn't do that for a performance reason. Instead libmcount
doesn't use any FP/SIMD types internally (and hopefully not in a handful of functions in libc/libdl/libpthread it calls) and only accesses the FP/SIMD registers in inline asm for recording. So it can skip saving and restoring FP/SIMD registers for every function call.
Extended Description
attached sample makes clang crash