cplusplus / CWG

Core Working Group

[basic.memobj] incoherent rules regarding the arrays containing out-of-life elements #416

Open sergey-anisimov-dev opened 11 months ago

sergey-anisimov-dev commented 11 months ago

Full name of submitter (unless configured in github; will be published with the issue): Sergey Anisimov

Reference (section label): [basic.memobj]

Issue description: If a portion of the storage that an array (a) of unsigned char or std::byte occupies is reused by another object (o), a is said to provide storage for o. In other words, the regions of storage they simultaneously occupy may overlap. However, this does not appear to be the case for the elements of a: while a is said to provide storage for o (and, consequently, o is nested within a), no additional rules (to my knowledge) are supplied to allow the elements of a and o to coexist. It thus seems reasonable to assume that, in accordance with [basic.life#1.5] and [intro.object#9], the lifetime of all of the affected elements simply ends. With that in mind, consider the example that currently supplements one of the aforementioned clauses (recomposed for clarity):

template<typename ...T>
struct AlignedUnion {
  alignas(T...) unsigned char data[max(sizeof(T)...)];
};

void f() {
  AlignedUnion<int, char> au;
  new (au.data) char() /* #1 */;
  new (au.data /* #2 */ + /* #3 */ 1) char();
}

At #1 an object of type char gets nested within the array under au.data. The array-to-pointer conversion invoked by supplying au.data as the placement argument of the new-expression yields a pointer to the first unsigned char subobject (for the sake of brevity, let's call it c). As per the reasoning above, the array itself continues to live, but c dies here: its storage has just been reused. Thus, at #2 the very same array-to-pointer conversion doesn't yield a pointer to c, but at most (since the storage has actually been already reused: this might also be an invalid pointer value altogether) a pointer to the region of storage* that c used to occupy during its lifetime. In turn, this pointer is immediately supplied as an operand to an addition expression, which expects a pointer to an array element (which the operand no longer is), making #3 undefined behavior.
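For readers who want to try the example, here is a minimal compilable sketch of it. It makes two assumptions not present in the excerpt: std::max over an initializer list stands in for the unqualified max, and the function name nested_chars_ok is hypothetical. Whether the expression au.data + 1 is well-defined is exactly what this issue disputes; the sketch only shows what mainstream implementations are expected to accept.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <new>

template <typename... T>
struct AlignedUnion {
  // Assumption: std::max replaces the unqualified max used in the excerpt.
  alignas(T...) unsigned char data[std::max({sizeof(T)...})];
};

// Hypothetical helper: performs the two placement-new steps and reads the
// nested objects back through the pointers returned by the new-expressions.
bool nested_chars_ok() {
  AlignedUnion<int, char> au;
  char* first = ::new (au.data) char('a');       // #1: reuses the storage of data[0]
  char* second = ::new (au.data + 1) char('b');  // #2 (decay) and #3 (addition)
  assert(first != second);
  return *first == 'a' && *second == 'b';
}
```

Reading through first and second avoids relying on the array elements themselves, whose lifetime status is the open question.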

* The definition for the array-to-pointer conversion unconditionally states that a pointer to the first element is returned (the element being the corresponding array subobject). This issue is written under the assumption that it is a stand-alone oversight, akin to the one with "null glvalues" (and thus requires clarifications on its own). Another family of conditions inherently exists, under which an array is alive while its elements may very well not be: implicit object creation. All arrays are of implicit-lifetime types, while the same is not necessarily true for their elements (a note exists to accentuate this very possibility). Last but not least, an array element can be destroyed by an explicit (pseudo-)destructor invocation. In all cases, this would constitute a direct contradiction with [basic.life#6] and serve as yet another example of malignant interchangeable use of the notion of objects themselves and the means of referring to them.
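The implicit-object-creation and explicit-destruction cases mentioned above can be sketched as follows. This is an illustration under assumptions: std::string is picked as an element type that is not an implicit-lifetime type, the function name build_two_strings is hypothetical, and the comments describe the reading of the standard discussed in this issue rather than a settled interpretation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <string>

// Hypothetical helper: returns the combined length of two strings that were
// constructed inside malloc'ed storage.
std::size_t build_two_strings() {
  void* raw = std::malloc(sizeof(std::string) * 2);
  if (raw == nullptr) return 0;
  // Since C++20, malloc may implicitly create a std::string[2] array object
  // here, but not its elements: std::string is not an implicit-lifetime type.
  auto* p = static_cast<std::string*>(raw);
  std::string* first = ::new (p) std::string("ab");
  // p + 1 is pointer arithmetic within an array whose second element is not
  // alive yet.
  std::string* second = ::new (p + 1) std::string("cde");
  std::size_t total = first->size() + second->size();
  // Explicit destruction: the array stays alive while its elements do not.
  std::destroy_at(second);
  std::destroy_at(first);
  std::free(raw);
  return total;
}
```

This is the same pattern that allocator-based containers rely on, which is why the status of pointers to out-of-life elements matters in practice.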

Apart from pointer arithmetic, out-of-life elements may make deletion of dynamic arrays containing them impossible to manage, as already shown in #415.

jensmaurer commented 11 months ago

For the provides-storage cases, it seems we should consider the array to be the object representation. P1839 wants to clean up in that area.

languagelawyer commented 11 months ago

> at #2 the very same array-to-pointer conversion doesn't yield a pointer to c

This is not true

> This issue is written under the assumption that it is a stand-alone oversight

The assumption is wrong. Array-to-pointer conversion yields a pointer to the first element whatever its lifetime is. Even whatever the lifetime of the array itself is.

sergey-anisimov-dev commented 11 months ago

> Array-to-pointer conversion yields a pointer to the first element whatever its lifetime is.

Well, it's exactly the contradiction this issue is raised about. How can there be a pointer to something which no longer exists? Are you implying that objects proceed to normatively exist in an "out-of-life" state indefinitely, even after their storage was reused, and it's possible to observe their identity in such a state, @languagelawyer? Because I'm almost certain that this will break in a million other places. In particular, wouldn't this directly contradict [basic.life#4] as well?

languagelawyer commented 11 months ago

> How can there be a pointer to something which no longer exists?

It exists, just not within its lifetime.

> Are you implying that objects proceed to normatively exist in an "out-of-life" state indefinitely, even after their storage was reused

Why not?

> Because I'm almost certain that this will break in a million other places.

Not sure

> In particular, wouldn't this directly contradict [basic.life#4] as well?

How can «existing» be a property that holds only during the lifetime, if an object can be created (become existing?) while not alive?

(Not sure what [basic.life]/4 even means; I'd say it can be removed with no effect)

sergey-anisimov-dev commented 11 months ago

First of all, I'd say it's noteworthy that, under this reasoning, storage reuse or switching the active union member would continue to stack up objects on top of each other indefinitely, if I understood it correctly. E.g.

int i;
new(&i) int;
/* ... */

union
{ int i, j; } u;
u.j = 42;
/* ... */

would introduce one additional object for each instance of storage reuse performed, as I have no recollection of a mechanism that purges ("rogue") objects from existence entirely (currently we just cease concerning ourselves with them, since those (... almost?) can't be observed). Afaik, the only place where this is required to hold at present (disregarding the context of this discussion) is [intro.object#2], on the matter of subobject registration (which is, I'd argue, contradictory on its own and also in need of revision, but that is a topic for another time, I guess).
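A runnable sketch of the two cases, with initial values added so that the results can be observed. The function name reuse_and_switch is hypothetical, and the comments about how many objects "exist" describe the stacking reading under discussion, not something a program can observe.

```cpp
#include <cassert>
#include <new>

// Hypothetical helper combining both snippets; returns a value read from the
// objects that are within their lifetime at the end.
int reuse_and_switch() {
  int i = 1;
  // Storage reuse: the lifetime of the first int ends and a second int begins.
  // Under the stacking reading, both objects now "exist" at this address.
  int* p = ::new (&i) int(2);

  union { int i, j; } u;
  u.i = 1;   // u.i becomes the active member
  u.j = 42;  // u.j becomes active; the lifetime of u.i ends

  assert(*p == 2);
  return *p + u.j;
}
```

Only the most recently created object at each location is read, so the result does not depend on which reading of the standard is adopted.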

So, let's just proceed with it and assume that the objects do actually pile up: the decay should still function correctly. Should it, though? Consider the following snippet:

int a[42] /* #1 */;
new(a) int /* #2 */;
/* #3 */;

At #1 an array object is created and brought to life by means of a definition; the same goes for all of its element subobjects. Let's designate the element subobject referred to by a[0] at this point as o1. At #2 the storage under o1 gets reused in order to emplace a new object. As per [intro.object#2], this new object (o2) becomes the element [0] subobject of the array under a, without either destroying o1 itself or eliminating its subobject relation with the array in question, both of which we need in order to make pointer arithmetic work later (since we chose this line of reasoning). So, naturally, at this point there exist two element subobjects at position [0], granted, only one of which (o2) is within its lifetime. But, as we previously assumed, lifetimes are of no significance in this regard, thus begs itself a question: "What a would decay into at #3?" =)
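The snippet can be turned into a small observable sketch. The struct DecayResult and the function replace_first_element are hypothetical names, and the initial values are added for illustration. The sketch can only show that the decayed pointer and the pointer to o2 represent the same address; it cannot tell whether the decayed pointer formally designates o1 or o2, which is the question posed above.

```cpp
#include <cassert>
#include <new>

// Hypothetical result type: what can be observed after the replacement.
struct DecayResult {
  bool same_address;  // does the decayed pointer compare equal to the pointer to o2?
  int value;          // value read through the pointer returned by placement new
};

DecayResult replace_first_element() {
  int a[42] = {};              // #1: the array and its elements (o1 at [0]) begin to live
  int* o2 = ::new (a) int(7);  // #2: the storage of o1 is reused for o2
  int* decayed = a;            // #3: array-to-pointer conversion after the reuse
  assert(o2 != nullptr);
  return {static_cast<void*>(decayed) == static_cast<void*>(o2), *o2};
}
```

The value is deliberately read through o2 rather than through a[0], so the sketch stays clear of the disputed wording.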

Personally, I doubt that this constitutes an even remotely sane approach to formalizing the intention: these "rogue" objects are introduced solely for the purpose of preserving the original wording, which imho turns the whole situation backwards ("expression -> intent" instead of "intent -> expression") and interferes with the reader's attempts at understanding it (this is not the first discussion on the matter I'm reading/participating in; all the previous ones were inconclusive, and this one seems to be going in the very same direction). What can be useful at times like these, when a system is specified without any formal verification, is at least an attempt to imagine this system (the abstract machine, in this case) expressed in code, say in the very same C++, as an ordinary programming project. I, for example, experience near-revulsion at the thought of allowing the resources at the very core of the project to leak uncontrollably simply in order for the code to even compile. I find fundamental C++ concepts to be quite mature and strong in general, and it hurts beyond words to see them wasted on shallow execution. The specification is here to play a certain part - I would very much like to respect its performance and use it to the fullest, but instead, regretfully, I quite often find myself stumbling over rough edges such as, I'd argue, the one in question. What's the point of having elaborate and rigorous schematics of a complex system if one can't trace them down to constituent atoms that make sense?..

@languagelawyer, this last part by no means was intended to be offensively directed at you or anything of that nature. Just a (somewhat emotional, perhaps) attempt to share the perspective and ask, whether such "minor issues" are even considered as significant, or if some "intuitive layer" is to be expected/required from the readers, enough for the questions such as these not to pose real normative problems, I guess?

frederick-vs-ja commented 11 months ago

> But, as we previously assumed, lifetimes are of no significance in this regard, thus begs itself a question: "What a would decay into at #3?" =)

I think the rule for transparent replaceability ([basic.life] p8) can make such ambiguity ignorable - we can say a still decays into "the pointer to o1", but the result is immediately transformed to "the pointer to o2".

> a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object
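For comparison, here is a minimal sketch of the uncontroversial case of that rule, where a complete object is replaced by an object of the same type. The type Point and the function read_through_old_pointer are hypothetical. std::launder is shown as the explicit alternative for situations where transparent replaceability might not apply.

```cpp
#include <cassert>
#include <new>

// Hypothetical type used only for this illustration.
struct Point {
  int x;
  int y;
};

int read_through_old_pointer() {
  Point p{1, 2};
  Point* old = &p;           // pointer obtained before the replacement
  ::new (&p) Point{10, 20};  // a complete object replaced by one of the same type
  assert(old == &p);
  int via_old = old->x;                   // per the quoted rule, old now refers to the new object
  int via_launder = std::launder(&p)->y;  // explicit request for the object now at that address
  return via_old + via_launder;
}
```

The array-element case discussed in this thread differs in that the replaced object is a subobject.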

languagelawyer commented 11 months ago

@sergey-anisimov-dev I know that it is not specified which pointer to which first element is produced

languagelawyer commented 11 months ago

> I think the rule for transparent replaceability ([basic.life] p8) can make such ambiguity ignorable - we can say a still decays into "the pointer to o1", but the result is immediately transformed to "the pointer to o2".

I've thought the rule speaks about existing pointers, i.e. objects containing the pointer values. The values are replaced by new ones.

sergey-anisimov-dev commented 11 months ago

> I think the rule for transparent replaceability

Transparent replaceability only kicks in when a complete object is being replaced which is not the case here.

> I know that it is not specified which pointer to which first element is produced

Then it's still an issue, I guess.

frederick-vs-ja commented 11 months ago

> > I think the rule for transparent replaceability
>
> Transparent replaceability only kicks in when a complete object is being replaced which is not the case here.

Oh, I haven't realized that CWG2677 is still open.

sergey-anisimov-dev commented 11 months ago

> Oh, I haven't realized that CWG2677 is still open.

Again, I think it would be better not to rely on this "object stacking" at all (and to remove the possibility of it happening completely): imho it's very counter-intuitive and excessive, and looks like an oversight rather than a meaningful expression of intent.

Just imagine what is intended to be the norm here: objects stack on top of each other indefinitely, are needed in this manner solely for the purpose of plugging some wording loopholes, and, when that actually happens, "work" only by interfacing with more loopholes. In particular, is it really the intention that replacing array elements stacks up those "leaking"/"rogue" elements, and that array-to-pointer conversion works (or rather "will perhaps work") because, whichever of those elements the array handle actually decays into, the result "corrects" itself into the alive one via a particular obscure instance of transparent replaceability, provided an alive one actually exists (and if it doesn't, then it doesn't matter anyway)?

Are there any other sources regarding the intended requirements for pointer arithmetic/comparison? Perhaps some discussions reflected the original intention better in this instance?