cplusplus / CWG

Core Working Group
24 stars 7 forks source link

CWG1953 [intro.memory] Unsequenced accesses within the same storage are not undefined behavior when different scalar objects are used (unsequenced union access) #621

Open Eisenwave opened 1 week ago

Eisenwave commented 1 week ago

Reference (section label): [intro.memory]

Issue description

union U { int x, y; } u;
(u.x = 1, 0) + (u.y = 2, 0);

The latter statement makes two unsequenced modifications which target the same storage within u but are not the same memory location by definition. Therefore, the statement is well-defined, but it should not be.

Suggested resolution

Update [intro.memory] paragraph 3 as follows:

A memory location is a set of elements which occupy overlapping storage, where each element is either an object of scalar type that is not a bit-field or a maximal sequence of adjacent bit-fields all having nonzero width.
languagelawyer commented 1 week ago

A memory location is a set of elements which occupy overlapping storage, where each element is either an object of scalar type

What is the point of this change? Out-of-lifetime objects do not occupy storage anyway

Eisenwave commented 1 week ago

@languagelawyer the point is that it would be nonsensical to support this. The example above is essentially a data race; why should we suddenly require that to be well-defined just because the user is writing to two union members in the same storage instead of a single object?

Remember that unsequenced operations in the abstract machine may happen simultaneously, not just in an unspecified order. Modifying two union members simultaneously would ignore implementation issues like torn writes and probably quite a lot of optimizer assumptions.

Only one union member being active doesn't make it any more implementable to write to two members simultaneously and expect a well-defined result. Note that assigning an inactive member begins the lifetime and thus the storage has to be there, and since both assignments happen simultaneously, there simultaneously needs to be storage for x and y.

Or, another thought: if the change is pointless, then is x or y active after the latter statement? Since the assignments are unsequenced, I claim that ... uh ... I guess both live at the same time now? It doesn't really work, at least not if you change one to float or something.

Eisenwave commented 1 week ago

Out-of-lifetime objects do not occupy storage anyway

Also to elaborate on this and what happens in the example, since x and y are not within their lifetime initially, both (u.x = 1, 0) and (u.y = 2, 0) would simultaneously (because the expressions are unsequenced), implicitly begin the lifetimes of x and y ([class.union.general] p5).

This contradicts [intro.object] p10, which requires that x and y do not overlap. It could be argued that the example is already undefined behavior by simply pointing to this "bad stuff can't happen" rule, but that seems like a pretty crazy way for us to express that, especially given that it's quite logically complex:

frederick-vs-ja commented 1 week ago

It's more confusing to me to add "a set of elements which occupy overlapping storage". I don't think this makes sense when two memory regions inexactly overlap.

Looking at [intro.execution] p10 (emphasis mine):

[...] If a side effect on a memory location ([intro.memory]) is unsequenced relative to either another side effect on the same memory location or a value computation using the value of any object in the same memory location, and they are not potentially concurrent ([intro.multithread]), the behavior is undefined. [...]

I think it would make more sense to say "the same memory location or two overlapping memory locations" instead.

frederick-vs-ja commented 1 week ago

I feel that this issue partially overlaps with CWG1953. Perhaps we should augment CWG1953 to handle this. @jensmaurer

The example above is essentially a data race;

It is not per [intro.races] p21. It's unclear to me why signal handling is considered able to cause in-thread data race, which was decided in N3910 (resolving CWG1441). I don't think we should add a new kind of in-thread data race.

The intent for the UB specified in [intro.execution] p10 is not very clear to me, either. It's definitely practical to treat such unsequenced operations in the same way as indeterminately sequenced operations. I guess the "real" intent is allowing compilers to infer that if there're two such unsequenced operations, then the two affected memory locations don't overlap.

Since the assignments are unsequenced, I claim that ... uh ... I guess both live at the same time now?

This can't be correct due to the nature of union. Some operation should be lifetime-ending, but we can hardly tell which one is. Given the beginning and ending of lifetime are unsequenced, perhaps there should be a kind of UB like that specified in [intro.execution] p10. CWG2863 is possibly related but the current proposed resolution isn't helpful.

Eisenwave commented 1 week ago

@frederick-vs-ja

I think it would make more sense to say "the same memory location or two overlapping memory locations" instead.

Yeah, it does seem a bit better to target the problem in [intro.execution] instead of [intro.memory].

The intent for the UB specified in [intro.execution] p10 is not very clear to me, either. It's definitely practical to treat such unsequenced operations in the same way as indeterminately sequenced operations. I guess the "real" intent is allowing compilers to infer that if there're two such unsequenced operations, then the two affected memory locations don't overlap.

That's the intent from what I can see as well. It would break some fundamental assumptions such as if you have (x + 2) + (x + 2), this could be simplified to (x * 2 + 4). If x can be modified between these expressions (through modification of another union member of the same type (see [class.mem.general] p28), the optimization may be illegal, depending on when this indeterminately sequenced write to x actually takes place.

In essence, doing an unsequenced modification through a union member of the same type cheats the system and bypasses the usual restriction that you can't read/write or write/write an object simultaneously.

It's unclear to me why signal handling is considered able to cause in-thread data race [...]

Because signals may be raised in a way that tears writes. Consider a 32-bit architecture which modifies a long long using two separate mov instructions. When an interrupt occurs between these two instructions, and if the signal handler modifies that long long, you end up with two interleaved modifications to the same object, causing the same kind of gibberish value that you may see in a multi-threaded data race:

unsigned long long x;
x = 0xff'ff'ff'ff'ff'ff'ff'ff; // signal raised while this is done
// ...
void signal_handler() { x = 0; }

In this example, x may have the value 0xff'ff'ff'ff'00'00'00'00 after signal_handler exits.

Allowing multiple unsequenced modifications to the same memory (via two union members) essentially comes with the same problem.

jensmaurer commented 5 days ago

We have two rules for conflicting accesses: One in [intro.execution] p10 for the strictly sequential case (single expression evaluation) and one in [intro.races] p21 (with the definition of "conflict" from [intro.races] p2) for the parallel and signal handler cases.

For both CWG1953 and the implicit object creation during union member assignment, we want the sequential and the parallel situation to be undefined behavior. Also, in general, we want undefined behavior to be explicit rather than the absence of specification.

That means a change to the definition of "memory location" seems in order. Maybe talking about the storage comprising the value representation of the objects in question would help.