cplusplus / CWG

Core Working Group
23 stars 7 forks source link

CWG2759 [class.mem] [[no_unique_address]] in common initial sequence (layout and ignorability) #356

Closed leni536 closed 9 months ago

leni536 commented 1 year ago

Full name of submitter: Lénárd Szolnoki

Reference: [class.mem]

The common initial sequence of two standard-layout struct ([class.prop]) types is the longest sequence of non-static data members and bit-fields in declaration order, starting with the first such entity in each of the structs, such that ...

  • either both entities are declared with the no_unique_address attribute ([dcl.attr.nouniqueaddr]) or neither is, ...

There are two issues with the referenced wording.

Layout issue

Description

Itanium ABI does not conform to the definition of common initial sequence. In various cases it puts corresponding members at different offsets, which makes the behavior of read access through inactive member non-conforming.

Examples

  1. Empty members of same and distinct types:
struct A {};
struct B {};

struct C {
  [[no_unique_address]] A a;
  [[no_unique_address]] B b;
};

struct D {
  [[no_unique_address]] A a1;
  [[no_unique_address]] A a2;
};

static_assert(offsetof(C, b) == offsetof(D, a2));

Both gcc and clang fail the static_assert, b is at offset 0 in C, a2 as at offset 1 in D:

https://godbolt.org/z/YTdbPc4Ez

  1. Overlapping on tail padding and POD for the purpose of layout:
class A1 {
    private:
    int i;
    char ch;
};

class B1 {
    public:
    [[no_unique_address]] A1 a;
    char ch;
};

class A2 {
    public:
    int i;
    char ch;
};

class B2 {
    public:
    [[no_unique_address]] A2 a;
    char ch;
};

static_assert(offsetof(B1, ch) != offsetof(B2, ch));

This is also reported for the Itanium ABI at https://github.com/itanium-cxx-abi/cxx-abi/issues/108 .

The issue here is the A2 is "POD for the purpose of layout" (Itanium ABI term), while A1 is not. Objects of type A2 are laid out so they can not overlap with other objects, even if they are potentially overlapping.

Suggested resolution

While the second example is rather specific to the Itanium ABI, in my opinion the first example can't be made conforming in a satisfactory way that actually makes use of the attribute. In particular it's desirable that sizeof(C) == 1 but sizeof(D) has to be at least 2.

My suggestion is to not allow members declared with [[no_unique_address]] to be part of a common initial sequence at all. I think this use case is not well-motivated and the common initial sequence rule is mostly there for C compatibility. As C structs can't have [[no_unique_address]] members, disallowing them in common initial sequence does not hurt C compatibility.

Ignorability issue

Description

The wording only considers if a member is syntactically declared with or without a [[no_unique_address]] attribute, without considering whether that attribute is implemented or not.

MSVC does not implement the attribute and considers the following two classes layout compatible, which is observable through the is_layout_compatible type trait:

struct E {};

struct A {
    E e;
    int i;
};

struct B {
    [[no_unique_address]] E e;
    int i;
};

static_assert(
    std::is_layout_compatible_v<A, B>
);

https://godbolt.org/z/PqGjEqGPW

Suggested resolution

[[no_unique_address]] should only be considered for common initial sequence when the attribute is not ignored.

Suggested wording

My combined suggested wording based on the suggested resolution of both issues:

The common initial sequence of two standard-layout struct ([class.prop]) types is the longest sequence of non-static data members and bit-fields in declaration order, starting with the first such entity in each of the structs, such that ...

  • either both entities are declared with the no_unique_address attribute ([dcl.attr.nouniqueaddr]) or neither is, ...
  • if has-attribute-expression is not 0 for the no_unique_address attribute then neither entity is declared with the no_unique_address attribute, ...
jensmaurer commented 1 year ago

CWG2759