Open Eisenwave opened 3 months ago
FWIW this feels like yet another big fish and could possibly be blocked by https://github.com/cplusplus/papers/issues/592
It's also worth noting that the authors of P1839 disagree on whether the following (related) example is well-defined:
void increment_first_byte(int* n) {
auto* a = reinterpret_cast<char*>(n);
++(*a);
}
If we had clearly defined semantics for when an object is modified through a different type, I'm inclined to agree with Krystian that this is okay.
However, as things stand, I don't believe we have any semantics for this case, and that's a defect that extends beyond the scope of P1835 because it's not about object representations in the case of signed
/unsigned
.
Surely you meant P1839 (there doesn't seem to be a published P1835).
The language lawyer interpretation of the read case is that you get the stored value of the object, which, in this case, is 0. If the value is negative, then http://eel.is/c++draft/expr.pre#4 applies, and it's UB. I think this is wrong, and we ought to say that you get the value with the same value representation. I don't understand the rationale for "unspecified conversion". To me, that's the same as saying you get an unspecified value in all cases.
In the write case, current wording similarly appears to imply that, if the value being written is also a valid value of the actual type of the existing object, then that value is stored. If the value being written is not a valid value of that type, then it's UB by omission. Again, I think we ought to say that you just get the value with the same value representation.
Your interpretation "0u is not a valid value for int
" contradicts current wording such as http://eel.is/c++draft/basic.fundamental#6 "Further, each value has the same representation in both types" which implies that two objects of different type can actually have a common value.
Again, I think we ought to say that you just get the value with the same value representation.
I'm tiptoeing around such a solution because while it makes sense for signed/unsigned integers, it needs to work for accessing int
through char
glvalues as well, and those don't necessarily have the same object representation.
I'm just getting us from UB by omission to unspecified. Not sure with what wording this could really be fixed up, if any. Perhaps this needs a paper.
Your interpretation "0u is not a valid value for int" contradicts current wording such as http://eel.is/c++draft/basic.fundamental#6 "Further, each value has the same representation in both types" which implies that two objects of different type can actually have a common value.
Hmm yeah, fair point. I think the current wording doesn't strictly attach a type to a value for fundamental types, but it's more mathematical. 0
is literally the same value no matter which integral type or floating-point type stores zero. Negative numbers are a good example of where it breaks though; I'll update the issue.
There's also a pretty good chance that this is NAD, everything is working as intended, and you are not permitted to type pun int
as unsigned
except for where the value is the same.
However, if so, we should have some examples that demonstrate that, such as:
int p = 0;
auto up = *reinterpret_cast<unsigned*>(&p); // OK, up has value 0
int n = -1;
auto un = *reinterpret_cast<unsigned*>(&n); // undefined behavior
The "char, unsigned char, std::byte" case in [basic.lval] p11 is the object representation case. Let's not try to fix this here, but defer to P1839.
For the signed/unsigned case, I think that's clearly an (isolated) hole in the specification: We say access is allowed, but we don't say what the result is. For sure after we moved to a mandatory two's complement representation, it seems abundantly clear that you get the congruent value per [basic.fundamental] p3 and p5. Nothing unspecified here.
Wording suggestions welcome.
I guess the following resolution may be wanted. Given that the static and dynamic types of a scalar object is always the same, perhaps we shouldn't say "dynamic type" in [conv.lval] p3.4 and [expr.ass] p2.
Let's not try to fix this here, but defer to P1839.
I guess this is OK, but it's somehow weird to me to just skip the object representation parts.
Modify [conv.lval] as indicated
(3.4) - Otherwise, let
U
andV
be the cv-unqualified version ofT
and the type of the object, repectively. ~t~The object indicated by the glvalue is read ([defns.access]), and the prvalue result is
- the value contained in the object ~is the prvalue result~ if
U
is same asV
,- otherwise, the value whose value representation ([basic.types.general]) is same as the value contained in the object if
U
is a signed or unsigned version ofV
,- otherwise, the value of the first element of the object representation of the object.
If the result is an erroneous value ([basic.indet]) and the bits in the value representation are not valid for the object's type, the behavior is undefined.
Modify [expr.ass] as indicated
- In simple assignment (
=
), letU
andV
be the cv-unqualified version of the type of the left operand and the object referred to by it, repectively, and letx
be the result of the right operand. ~t~The object referred to by the left operand is modified ([defns.access]). ~by replacing its value with the result of the right operand.~ IfU
is same asV
, the value of the object is replaced withx
. Otherwise, ifU
is a signed or unsigned version ofV
, the value of the object is replaced with theV
value whose value representation ([basic.types.general]) is same asx
. Otherwise, the first element of the object representation of the object is replaced withx
.
The "char, unsigned char, std::byte" case in [basic.lval] p11 is the object representation case. Let's not try to fix this here, but defer to P1839.
It seems that P1839 has completely stopped though previously I think it's already accepted as DR in C++23. See https://github.com/cplusplus/papers/issues/592. So what happened?
The "char, unsigned char, std::byte" case in [basic.lval] p11 is the object representation case. Let's not try to fix this here, but defer to P1839.
It seems that P1839 has completely stopped though previously I think it's already accepted as DR in C++23. See cplusplus/papers#592. So what happened?
This is not really the right forum to discuss this, but basically the paper was reviewed by CWG, and certain issues were found that have no known resolution, which are described in the last section of the paper.
CWG2901
Reference (section label): [basic.lval]
Issue description
An object of type
int
is type-accessible through a glvalue of typeunsigned
.However, [conv.lval] p3.4 states:
We obviously require a prvalue of
unsigned
type for the initialization ofy
, and the-1
value contained in theint
object has the wrong type and and cannot be represented asunsigned
.Similarly, [expr.ass] p2 states:
This cannot be done because it would require storing a
UINT_MAX
value ofunsigned
type in an object of typeint
.Suggested resolution
In [basic.lval] p11, add the following sentence: