Open Quuxplusone opened 7 years ago
Bugzilla Link | PR35508 |
Status | REOPENED |
Importance | P enhancement |
Reported by | Martin O'Riordan (martino@theheart.ie) |
Reported on | 2017-12-02 20:33:07 -0800 |
Last modified on | 2017-12-18 02:11:51 -0800 |
Version | 5.0 |
Hardware | PC All |
CC | codeman.consulting@gmail.com, dimitry@andric.com, efriedma@quicinc.com, llvm-bugs@lists.llvm.org, matze@braunis.de, paul_robinson@playstation.sony.com |
Since you are assigning a null pointer to b, the operation "int c = *b" is
undefined, and basically everything after that is undefined. So clang decides
to optimize the whole loop away.
If you really want to dereference the null pointer anyway (which will crash at
runtime), try using "volatile int *" instead.
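The reduced test case itself is not quoted in this thread, so here is a hedged reconstruction, inferred from the mangled signature, the quoted statements and the unoptimised assembly posted here rather than copied from the report. The name func_defined and its second parameter are assumptions, added only so that there is a variant which is safe to run.

```cpp
#include <cassert>

// Hypothetical reconstruction of the reported function (not the verbatim source).
// Never call this: b is null, so the first load is undefined behaviour,
// and an optimiser is allowed to delete the whole loop.
void func(int *z) {
    int *b = nullptr;
    for (int j = 0; j < 16; ++j) {
        int c = *b;   // null pointer dereference: undefined behaviour
        c -= *z++;
        *b++ = c;
    }
}

// Assumed well-defined variant: the same loop, but b points at storage
// supplied by the caller, so every access is valid.
void func_defined(int *z, int *b) {
    assert(z != nullptr && b != nullptr);
    for (int j = 0; j < 16; ++j) {
        int c = *b;
        c -= *z++;
        *b++ = c;
    }
}
```

Only func_defined should ever be called. Declaring b as "volatile int *" in func, as suggested above, would be expected to keep the accesses in the generated code, so the program faults at run time instead of having the loop removed.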
You're absolutely right - I should've spotted that as being a UB issue.
Thanks, MartinO
In my version of Clang I get the following:
Z:\Code>clang --version
clang version 5.0.0 (https://github.com/llvm-mirror/clang.git adf03c776be767d5d56fdb20c56ec4f6df0b33e6) (https://github.com/llvm-mirror/llvm.git 7bfd7c00d76359356c3572222f33b03931972c9f)
Z:\Code>clang -S -O1 nocode.cpp -o nocode2.s
Z:\Code>type nocode2.s
.text
.def "?func@@YAXPEAH@Z";
.scl 2;
.type 32;
.endef
.globl "?func@@YAXPEAH@Z"
.p2align 4, 0x90
"?func@@YAXPEAH@Z": # @"\01?func@@YAXPEAH@Z"
# BB#0:
ud2
Which makes sense and seems correct at first glance with the undefined
instruction replacing the incorrect pointer code. However, I get a completely
different result following a run through LLC at O1:
Z:\Code>clang -O0 nocode.cpp -emit-llvm -c
Z:\Code>llc -filetype=asm nocode.bc -O1 -o nocode1.s -x86-asm-syntax=intel
Z:\Code>type nocode1.s
.text
.intel_syntax noprefix
.def "?func@@YAXPEAH@Z";
.scl 2;
.type 32;
.endef
.globl "?func@@YAXPEAH@Z"
.p2align 4, 0x90
"?func@@YAXPEAH@Z": # @"\01?func@@YAXPEAH@Z"
.Lcfi0:
.seh_proc "?func@@YAXPEAH@Z"
# BB#0:
sub rsp, 24
.Lcfi1:
.seh_stackalloc 24
.Lcfi2:
.seh_endprologue
mov qword ptr [rsp + 16], rcx
mov qword ptr [rsp + 8], 0
mov dword ptr [rsp + 4], 0
cmp dword ptr [rsp + 4], 15
jg .LBB0_3
.p2align 4, 0x90
.LBB0_2: # =>This Inner Loop Header: Depth=1
mov rax, qword ptr [rsp + 8]
mov eax, dword ptr [rax]
mov dword ptr [rsp], eax
mov rax, qword ptr [rsp + 16]
lea rcx, [rax + 4]
mov qword ptr [rsp + 16], rcx
mov ecx, dword ptr [rsp]
sub ecx, dword ptr [rax]
mov dword ptr [rsp], ecx
mov rax, qword ptr [rsp + 8]
lea rdx, [rax + 4]
mov qword ptr [rsp + 8], rdx
mov dword ptr [rax], ecx
inc dword ptr [rsp + 4]
cmp dword ptr [rsp + 4], 15
jle .LBB0_2
.LBB0_3:
add rsp, 24
ret
.seh_handlerdata
.text
.Lcfi3:
.seh_endproc
Same at O2 or O3.
In this case b is stored as a local stack variable at [rsp+8], followed by
j at [rsp+4] and finally c at [rsp]. The value of *z is unknown, but the load
into c is an obvious null pointer dereference. For some reason the LLC build
doesn't recognize this as a null pointer dereference and generates the rest of
the loop.
Now my question is: shouldn't the assembly output of llc -O1 on bitcode be the
same as clang -O1 on the .cpp file? On Windows, at least, it isn't. That
seems like an actual bug to me; in theory we should get nearly identical code
from both methods of doing this. Even at -O3, LLC doesn't reduce to ud2.
My opinion on this particular report is mixed: by reducing to ud2 we lose the
ability for a null pointer dereference exception to be thrown, and hence the
debugging info for the problem, even though it's obvious. I think that's at
least worth looking at. I remember a discussion about it before, but if the
code is invalid, why not just error out at compile time when the null pointer
dereference is hit and diagnose it there, rather than create an executable
where the problem can no longer be easily solved?
In addition, clang -O1 -m32 produces the following instead:
Z:\Code>type nocode5.s
.text
.def @feat.00;
.scl 3;
.type 0;
.endef
.globl @feat.00
@feat.00 = 1
.def "?func@@YAXPAH@Z";
.scl 2;
.type 32;
.endef
.globl "?func@@YAXPAH@Z"
.p2align 4, 0x90
"?func@@YAXPAH@Z": # @"\01?func@@YAXPAH@Z"
# BB#0:
nop # avoids zero-length function
.def _main;
.scl 2;
.type 32;
.endef
.globl _main
.p2align 4, 0x90
...
Now that situation is just completely incorrect, as the NOP is executed and the
debugger breaks in main with an invalid dereference of argv** because the
function it was calling just falls through into it. Even the ud2 is better
than that, and I think this is an actual bug if it's still happening in ToT. My
Windows build isn't up to date at the moment, but I'm reopening because of that
last example.
BTW just for comparison, MSVC's clang-cl and regular cl are both able to debug
this correctly and do not reduce the loop. In debug mode the null pointer is
caught when b is dereferenced; in release without a PDB the null pointer
exception is thrown on "c -= *z++;", presumably because of the lack of proper
debug info, but it still breaks in that loop.
Either way it's clear from the debug where the general problem is; the clang
output at -m32 gives only a hint since func is just a nop.
I know all of that is nitpicky, but at least the ud2 makes it obvious that there's a problem and that it's in func(). The NOP is just plain incorrect output IMO.
Clang's -O1 and llc's -O1 are not the same. llc uses an optimization
pipeline defined by LLVM, while Clang constructs its own pipeline.
So, seeing different output is not unexpected. It does seem odd that
llc did not optimize away the loop and therefore the function; that
might be worth investigating.
The emitted 'ud2' instruction is something that happens when you have
a Windows-target compilation: it replaces 'unreachable' with a trap
instead of optimizing it away completely.
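As a rough source-level illustration of the difference between an 'unreachable' and a trap, the sketch below uses the GCC/Clang builtins __builtin_unreachable and __builtin_trap. The function names and values are hypothetical, and the instructions actually emitted depend on the target and compiler version.

```cpp
// Hypothetical example: the caller promises v is 0, 1 or 2.
// Reaching __builtin_unreachable() is undefined behaviour, so the
// optimiser may assume it never happens and emit nothing for that path.
int weight(int v) {
    switch (v) {
    case 0: return 10;
    case 1: return 20;
    case 2: return 30;
    }
    __builtin_unreachable();
}

// Same mapping, but an out-of-range value hits an explicit trap.
// On x86 this builtin is typically lowered to a ud2 instruction, so a
// bad call stops inside this function instead of running whatever
// code happens to follow it.
int weight_checked(int v) {
    switch (v) {
    case 0: return 10;
    case 1: return 20;
    case 2: return 30;
    }
    __builtin_trap();
}
```

For in-range arguments the two functions behave identically; they differ only in what is left behind for the path that should never run.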
Why optimize away the function without emitting an error? The function
has undefined behavior... but only if it is actually called! Clang
cannot emit an error here, because the UB might not occur, and Clang
does not have enough knowledge to say definitively one way or the other.
And LLVM cannot be sure where the 'unreachable' was introduced, as it
might have come from a previous pass that did some analysis to show
that a particular path happens to be unreachable in a particular context.
This might be due to optimization transforms and not reflect true C/C++
undefined behavior in the source program.
I agree it would be lovely if LLVM had better diagnostics in this area.
But even if it did, they would never be perfect, and so static analysis
and the sanitizers are your best friends here.
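To make the "only if it is actually called" point concrete, here is a minimal sketch with hypothetical function names. Whether load has undefined behaviour depends entirely on the argument passed at run time, which is why the compiler cannot reject it outright.

```cpp
// Well defined for any valid pointer, undefined behaviour for a null one.
// The compiler sees nothing wrong with this function in isolation.
int load(const int *p) {
    return *p;
}

// A defensive variant (an assumption for illustration): the null case is
// handled explicitly, so no call can reach a null dereference.
int load_or(const int *p, int fallback) {
    return p ? *p : fallback;
}
```

Building with Clang's -fsanitize=undefined is one way to have a null load reported at run time, and 'scan-build' runs the static analyzer over a build.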
(In reply to Paul Robinson from comment #6)
> Why optimize away the function without emitting an error? The function
> has undefined behavior... but only if it is actually called! Clang
> cannot emit an error here, because the UB might not occur, and Clang
> does not have enough knowledge to say definitively one way or the other.
> And LLVM cannot be sure where the 'unreachable' was introduced, as it
> might have come from a previous pass that did some analysis to show
> that a particular path happens to be unreachable in a particular context.
> This might be due to optimization transforms and not reflect true C/C++
> undefined behavior in the source program.
>
> I agree it would be lovely if LLVM had better diagnostics in this area.
> But even if it did, they would never be perfect, and so static analysis
> and the sanitizers are your best friends here.
So, this is definitely not a Clang problem, and 'scan-build' does catch it.
Programmers who routinely use the static analysers would have detected and
corrected the problem long before I saw it reported as a bug.
I fully accept that this is UB and that what actually happens is decidedly
undefined :-)
However, the UB I got was very, very strange. The code that was being compiled
was also using '-ffunction-sections', so each function was in its own section.
When the function 'func' was called, the function that was alphabetically next
is what was actually called, and it looked extremely like the type of bugs you
get when a C++ virtual function is called with a corrupt or out-of-date Vtable,
or other indirect call related problem.
In this particular example, the MBB (in LLVM) after optimisation has an empty
sequence of BBs, so the function entry-point is emitted, but there is nothing
to follow.
What I am looking into (in our target) is to detect an empty BB list, and emit
a simple function return (or suppress the label emission), give a diagnostic
(words TBD), and exit with a non-zero exit code. This will allow the
programmer to realise that something is not right.
But while the silent emission of an orphaned label is technically a valid
interpretation of UB (and I'm as guilty as any of taking advantage of UB), the
fact remains that it is mind-bogglingly complicated for the programmer to
realise what has caused it. It is also important that the programmer should be
able to diagnose UB artefacts without excessive investigation.
The simple test case I provided was distilled from nearly 10,000 lines of code,
so even for the compiler writer it was a complicated problem to identify. I
ended up reducing it, but the UB still caught me out.
In LLVM, I can think of no legal circumstance where the list of BBs for a
function would ever be empty, so detecting this and emitting an error message
of some kind seems like a reasonable thing to do.
So now it moves from UB (Undefined Behaviour) to QoI (Quality of Implementation).
MartinO
Also note that llc does not add any bitcode optimization passes; that's a job
for the opt tool.
Emitting a ud2 or equivalent would make sense; falling off the end of a function is very confusing to debug.
We really don't want to emit warnings based on optimizations. They can be useful; the LLVM optimizer is better at finding undefined behavior than the clang static analyzer. But the LLVM optimizer is not a static analysis tool; emitting warnings leads to issues when we break someone's -Werror build, or emit a false positive, or emit a warning which is technically correct but completely incomprehensible to anyone who isn't a compiler developer.
Thanks for all the comments, I learn something new every day. I had always thought that llc ran the same passes as opt or clang. I'll keep this in mind in the future while trying to repro things like this. I still don't believe LLC should be producing this incorrect code though.
I do agree with Eli Friedman's point that Clang isn't an OPT diagnostic tool. Visual Studio and other compilers don't complain about this either; that's a job for debugging, and some things picked up by static analyzers compile into code that works fine.
So yes, I'd agree that the LLC production of a NOP'd function should be looked into but not much further. LLC isn't used often in actual compilations so it should be low priority.
What I eventually did for this was to emit 'abort' when the function contains no instructions, so if called, instead of random code being executed, it immediately terminates and they can use normal debugging tools to figure out why.
I took your advice, and have not produced a compile-time diagnostic.
Thanks, MartinO
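The approach described above can be modelled with a toy emitter. This is only a sketch under stated assumptions: ToyFunction and emit are hypothetical names, not LLVM types or APIs, and the "call abort" text stands in for whatever the real target emits.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a function after code generation (not an LLVM type).
struct ToyFunction {
    std::string label;
    std::vector<std::string> instructions;
};

// Emits the label followed by the body. If the body is empty, a call to
// abort is emitted instead, so the label is never left orphaned and a
// caller cannot fall through into whatever follows it.
std::string emit(const ToyFunction &f) {
    assert(!f.label.empty());
    std::string out = f.label + ":\n";
    if (f.instructions.empty()) {
        out += "    call abort\n";
        return out;
    }
    for (const std::string &insn : f.instructions) {
        out += "    " + insn + "\n";
    }
    return out;
}
```

No diagnostic is produced at compile time in this model; the empty function simply terminates the program when it is called, which keeps the failure inside the function that caused it.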