Open Quuxplusone opened 6 years ago
Bugzilla Link | PR37682 |
Status | NEW |
Importance | P enhancement |
Reported by | Greg Bedwell (greg.bedwell@sony.com) |
Reported on | 2018-06-04 14:12:17 -0700 |
Last modified on | 2019-11-18 14:18:50 -0800 |
Version | trunk |
Hardware | PC Windows NT |
CC | anastasis.gramm2@gmail.com, aprantl@apple.com, dblaikie@gmail.com, ditaliano@apple.com, ftee@flametop.co.uk, jeremy.morse.llvm@gmail.com, llvm-bugs@lists.llvm.org, paul_robinson@playstation.sony.com, vsk@apple.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also | PR38777 |
I don't think it's feasible to assert that the debugger show updated values for 'total' within this loop, because it amounts to a demand that LICM never happen. As applied to this loop, the whole point of LICM is to sink the store.
Even though "total" comes in as a reference I think it would be fair to set the location of total to the register throughout the loop.
I think this would be misleading. Inspecting "total" should show the user the value pointed-to by the "total" pointer. This is the only way print a correct value when "total" is aliased.
At the very least I'd expect to see a "variable not visible due to optimization" type message.
This might create scenarios where the variable is never presented as available, even though the debugger is perfectly able to inspect it, because hoisting/sinking occurs frequently.
I also think this would be misleading, because users expect any inspection of "total" to show the value pointed-to by the "total" pointer.
That seems eminently reasonable so I think we'll have to resolve this as it is, but it still leaves me uneasy. Putting myself in the shoes of some user who has written this loop and is trying to debug it, but doesn't know/care anything about compiler optimizations, the debugger is just plain wrong.
This is an unfortunate case where we're comparing pretty unfavourably with the Microsoft compiler in terms of quality of the optimized debugging experience. Looking into it more closely it seems the reason that they do so well is because they have a store inside the loop (tested with VS2017 at /Ox).
It's a shame that there's no way in DWARF (... I assume ... paging Paul ;) ) to represent this sort of situation where we know where the value according to the semantics of the user code being stepped through, but that it has not yet been stored to the location of the pointer. Feature request for a future standard?
I'll put this in the bucket of things to assess the impact of disabling for -Og mode. For reference, GCC 5.4.0 has the same -O2 behaviour as clang here, but the value of total is visible on every iteration with -Og.
> It's a shame that there's no way in DWARF (... I assume ... paging Paul ;) )
to represent this sort of situation where we know where the value according to
the semantics of the user code being stepped through, but that it has not yet
been stored to the location of the pointer.
Drive-by comment without having read the context: DWARF has
DW_OP_implicit_pointer:
> An optimizing compiler may eliminate a pointer, while still retaining the
value that
> the pointer addressed. DW_OP_implicit_pointer allows a producer to describe
this
> value.
> The DW_OP_implicit_pointer operation specifies that the object is a pointer
> that cannot be represented as a real pointer, even though the value it would
> point to can be described.
> Within the loop, variable 'total' now lives in $r10 and is only stored
> back to the address in $rdx after the loop terminates.
That's not really stating the situation accurately. 'total' is a reference,
not an int; it's just that syntactically there's no difference within the
loop. If you declared 'total' as 'int *' I don't think the confusion would
occur.
The *value* of the pointed-to variable is being computed in the loop, and
not actually stored until it terminates. There are lots of cases where we
might be computing something and not actually update a variable until later.
As Adrian points out, in DWARF v5 we could describe this more accurately.
In earlier versions, we cannot, because there's no way to describe a
pointer to a register.
FTR, I am sympathetic to the idea that '-Og' should avoid code-motion
optimizations. Not just for cases like this, but because it messes with
single-stepping something awful.
(In reply to Paul Robinson from comment #4)
> > Within the loop, variable 'total' now lives in $r10 and is only stored
> > back to the address in $rdx after the loop terminates.
>
> That's not really stating the situation accurately. 'total' is a reference,
> not an int; it's just that syntactically there's no difference within the
> loop. If you declared 'total' as 'int *' I don't think the confusion would
> occur.
>
Right. I'm trying to come at this problem by putting aside all of my knowledge
of compiler internals and just think about the experience of the average user
who doesn't care to drop down to stepping through disassembly and just expects
source level stepping to work. In this case, from the user point of view I can
happily step through the loop and see some of my variables being updated, but
not this one in particular. With my other compiler for my other platform I can
see all my variables being updated on each iteration. Therefore I prefer to
debug on that other platform.
I guess my question at this stage is whether:
a) would it be desirable
b) would it be feasible
to have the debugger report something like "the actual value of the reference
is 0, but I've currently got a value of 7 sitting in a register waiting to be
stored there"?
(In reply to Greg Bedwell from comment #5)
> I guess my question at this stage is whether:
> a) would it be desirable
> b) would it be feasible
> to have the debugger report something like "the actual value of the
> reference is 0, but I've currently got a value of 7 sitting in a register
> waiting to be stored there"?
It would obviously improve the debugging experience in this kind of situation,
but it would take more expressive power than DWARF has to describe it.
(In reply to comment #1)
> > Even though "total" comes in as a reference I think it would be fair to
> > set the location of total to the register throughout the loop.
>
> I think this would be misleading. Inspecting "total" should show the user
> the value pointed-to by the "total" pointer. This is the only way print a
> correct value when "total" is aliased.
>
(In reply to comment #1)
> I also think this would be misleading, because users expect any inspection
> of "total" to show the value pointed-to by the "total" pointer.
(In reply to comment #4)
> That's not really stating the situation accurately. 'total' is a
> reference, not an int; it's just that syntactically there's no difference
> within the loop. If you declared 'total' as 'int *' I don't think the
> confusion would occur.
I've been thinking about this recently, and the more I think about it, the more
I'm not sure that I agree with this entirely. My concern comes down to that
fact that regardless of the underlying compiler implementation it's a
reference, not a pointer:
"It is neither a pointer to the object, nor a copy of the object. It is the
object." - https://isocpp.org/wiki/faq/references#overview-refs
So, from a pure user perspective, I still think that there's an argument to be
made that for a few instructions within the loop the location of 'total' (and
any aliases of it) is a register, not an address. If total had come into the
function as int* I'd agree entirely with the above points (but still arguing in
favour of a way to represent 'pending' values to the debugger).
It may well be that we have to accept this as a limitation of the LLVM
representation, but I'm leaning towards disagreeing with the assessment that it
would be incorrect behaviour to temporarily bind the debug location of total to
the register, as opposed to reasonable behaviour that may be not feasible to
properly model.
Hmmm.
If 'total' was a global 'int' variable instead of an 'int&' parameter,
and the generated code was effectively the same (e.g. when compiled as
PIC), I think there would be less disagreement about the appropriateness
of identifying the location as in the register, for the course of those
few instructions.
So, I'm becoming more convinced that Greg has a point, and the problem
is mainly that we can't describe it correctly prior to DWARF v5.