llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.17k stars 12.03k forks source link

[Diagnostic request] Lifetime of temporaries in range-based for loop #31646

Open theres-waldo opened 7 years ago

theres-waldo commented 7 years ago
Bugzilla Link 32299
Version unspecified
OS All
CC @chandlerc,@dwblaikie,@DougGregor,@hfinkel,@zygoloid

Extended Description

Core Issue #​1498 [1] / EWG Issue #​120 [2] concerns the treatment of temporaries in the range expression of a range-based for loop.

An example, taken from the EWG issue, is:

std::vector vec; for (int val : vec | boost::adaptors::reversed | boost::adaptors::uniqued) { // Do stuff with val }

Here, while the temporary

vec | boost::adaptors::reversed | boost::adaptors::uniqued

has its lifetime extended by the reference it's bound to (specified in the lowering of the range-based for loop), the intermediate temporary

vec | boost::adaptors::reversed

does not, resulting in references to it inside the retained temporary becoming dangling, and the code exhibiting undefined behaviour.

One of the proposed resolutions for this issue is to extend the lifetimes of all temporaries that appear in the range expression for the duration of the loop.

While there has been no consensus for this (or any other) resolution so far, the issue was discussed at the recent WG21 meeting in Kona, and Chandler Carruth suggested that clang could be helpful and issue a diagnostic in cases where it would extend the lifetime of additional temporaries if the resolution were adopted, to warn users about the potential lifetime error.

I would like to request that such a diagnostic be implemented. I believe that the current behaviour (silent lifetime error leading to undefined behaviour) is a significant "gotcha" for users of range-based for loops, and a diagnostic along these lines would be very helpful.

[1] http://open-std.org/JTC1/SC22/WG21/docs/cwg_closed.html#1498 [2] http://cplusplus.github.io/EWG/ewg-active.html#120

dwblaikie commented 7 years ago

To try to restate, my point was that we could warn for every temporary created in a range for which doesn't have its lifetime extended. These are the set of temporaries whose lifetime would change with the proposed change to the language. We might restrict this to temporaries with non-trivial destructors or some such where the lifetime change is quite noticable.

I have no reason to believe this changes Dave's assessment though. The intent was not to provide a useful warning to find bugs in code, but to indicate that compilers could inform users about possible semantic changes in upcoming standards by detecting the condition that would change.

OK - thanks for reiterating :)

So it seems like this isn't probably a warning that should be shipped unless the language feature is accepted (or is it already accepted) - but implemented out of tree & some analysis/data gathering done with it to inform the language design?

chandlerc commented 7 years ago

This seems like it'd have a pretty high false-positive rate. Perhaps Chandler had some other details in mind that aren't captured here

That's possible. Chandler, my apologies if my report didn't capture your suggestion accurately. Could you clarify what you had in mind?

I'm not sure, but I think so?

To try to restate, my point was that we could warn for every temporary created in a range for which doesn't have its lifetime extended. These are the set of temporaries whose lifetime would change with the proposed change to the language. We might restrict this to temporaries with non-trivial destructors or some such where the lifetime change is quite noticable.

I have no reason to believe this changes Dave's assessment though. The intent was not to provide a useful warning to find bugs in code, but to indicate that compilers could inform users about possible semantic changes in upcoming standards by detecting the condition that would change.

I actually think the observation that this would fire for a lot of innocuous code speaks to why this semantic change may not be the correct direction for the standard: do we really want every temporary to have its lifetime extended in this way?

I increasingly think something like giving the user a tool to help do this themselves would be better, along the lines of what Thomas has proposed:

for (auto var = GetTemporaryVar(GetWombat()); auto x : var.method()) { ... }

This gives the programmer the tools to address the problem but in a more precise way.

theres-waldo commented 7 years ago

This seems like it'd have a pretty high false-positive rate. Perhaps Chandler had some other details in mind that aren't captured here

That's possible. Chandler, my apologies if my report didn't capture your suggestion accurately. Could you clarify what you had in mind?

hfinkel commented 7 years ago

This seems like it'd have a pretty high false-positive rate. Perhaps Chandler had some other details in mind that aren't captured here - but the notion of warning on anything used by reference in a range-for expression being warning worthy doesn't seem like it'd be good to me. (maybe implementation experience will show otherwise - this is only my best/first guess)

Lots of functions take parameters by reference that they don't include references to in their output... but maybe few enough are used in range-for loops as to be worth a second look? Not sure.

I share this concern. We might experiment with an attribute?

dwblaikie commented 7 years ago

This seems like it'd have a pretty high false-positive rate. Perhaps Chandler had some other details in mind that aren't captured here - but the notion of warning on anything used by reference in a range-for expression being warning worthy doesn't seem like it'd be good to me. (maybe implementation experience will show otherwise - this is only my best/first guess)

Lots of functions take parameters by reference that they don't include references to in their output... but maybe few enough are used in range-for loops as to be worth a second look? Not sure.

hfinkel commented 7 years ago

Is there any reason this should be specific to for loops? Doesn't this also come up for reference function arguments and other places where we lifetime-extend temporaries but the expression used to form those temporaries involve other temporaries that aren't lifetime extended?