cplusplus / CWG

Core Working Group

[dcl.init.ref] Specify which form of direct initialization is used to direct initialize the reference and also make reference initialization more consistent #596

Open ranaanoop opened 1 month ago

ranaanoop commented 1 month ago

Full name of submitter: Anoop Rana

Reference (section label): [dcl.init.ref]

Link to reflector thread (if any): https://stackoverflow.com/questions/78867403/reference-initialization-using-list-initialization-accepted-by-gcc-and-msvc-but

Issue description:

There are two main issues described here. The first is that dcl.init.ref#5.4.1.sentence-2 doesn't specify whether the reference is direct-initialized using () or {}. This matters because the form X&& ref{z}; is accepted by three compilers, while X&& r = z; is rejected by all four. The second issue is that X&& r = z; and X&& ref{z}; should be made consistent, with both ill-formed. To make the issue clearer, consider the following example: Demo


struct X
{
    public:
    X(){}
};
struct Y : X 
{

};
struct Z
{
    public:
        operator const Y () const
        {
            return {};
        }
};
int main()
{
    Z z;
    // X&& r = z; // #1: all four compilers reject this
    X&& ref{z};   // #2: Clang: rejects, GCC: OK, MSVC: OK, EDG: OK
}

First, let's see why #1 is almost ill-formed under the current wording. I say "almost" because [dcl.init.ref#5.4.1] doesn't specify whether the () or the {} form is used for the direct-initialization. This might make a difference, as already seen with X&& ref{z};, which is accepted by three compilers but not by Clang.

So to understand #1 we go to dcl.init.ref:

A reference to type “cv1 T1” is initialized by an expression of type “cv2 T2” as follows:

  • [...]
  • [...]
  • [...]
  • Otherwise, T1 shall not be reference-related to T2.
    • If T1 or T2 is a class type, user-defined conversions are considered using the rules for copy-initialization of an object of type “cv1 T1” by user-defined conversion ([dcl.init], [over.match.copy], [over.match.conv]); the program is ill-formed if the corresponding non-reference copy-initialization would be ill-formed. The result of the call to the conversion function, as described for the non-reference copy-initialization, is then used to direct-initialize the reference. For this direct-initialization, user-defined conversions are not considered.

This means that the result of the conversion function, which is of type const Y, will be used to direct-initialize r.

So the initialization process is repeated. This time we first go to direct-initialization:

If the destination type is a reference type, see [dcl.init.ref].

So we come back to dcl.init.ref and see that none of the bullet points is applicable to the initialization of the reference r from const Y.

In particular, 5.1 is not applicable because the reference is not lvalue reference.

Similarly 5.2 is not applicable.

5.3 is also not applicable.

5.4 is not applicable because X is reference-related to const Y (X is a base class of Y).

So none of the bullet points is applicable to the direct-initialization of r by const Y, and so X&& r = z; is ill-formed.
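
The two steps of this analysis can be probed with standard type traits. The sketch below reuses X, Y and Z from the example above and assumes that std::is_convertible<From, To>, which models copy-initialization of a To from an expression of type From, is a fair stand-in for X&& r = z;. The const X& cases are added only for contrast and are not part of the original example.

```cpp
#include <type_traits>

struct X { X() {} };
struct Y : X {};
struct Z {
    // Returns a *const* prvalue, as in the original example.
    operator const Y() const { return {}; }
};

// Second step in isolation: binding a reference to the conversion result.
// declval<const Y>() is an xvalue rather than a prvalue; the cv-qualification
// check is the same for both kinds of rvalue.
static_assert(!std::is_convertible<const Y, X&&>::value,
              "X&& cannot bind to a const Y rvalue: it would drop const");
static_assert(std::is_convertible<const Y, const X&>::value,
              "const X& can bind to a const Y rvalue");
static_assert(std::is_convertible<Y, X&&>::value,
              "X&& can bind to a non-const Y rvalue");

// The whole initialization, starting from an lvalue of type Z (case #1).
static_assert(!std::is_convertible<Z&, X&&>::value,
              "X&& r = z; is ill-formed");
static_assert(std::is_convertible<Z&, const X&>::value,
              "const X& r = z; is well-formed");
```

The only difference between the rejected and the accepted cases is whether the reference would have to drop the const of the const Y result.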

The important point is that we went to dcl.init#general-16.2 rather than to dcl.init#general-16.1. However, dcl.init.ref#5.4.1.sentence-2 currently doesn't say that the {} form of direct-initialization is not used.

We can solve this first issue by changing "direct-initialize" to "direct-non-list-initialize", or by saying directly that the () form is used instead of {}.

Additionally, there might be a second issue (one of consistency): X&& r = z; is not allowed, but X&& ref{z}; is. If that is indeed an issue, it should also be fixed by additional wording changes, apart from what is suggested here.

Suggested Resolution

Change dcl.init.ref#5.4.1.sentence-2 as shown below (the added words are "using direct-non-list-initialization"):

The result of the call to the conversion function, as described for the non-reference copy-initialization, is then used to direct-initialize the reference using direct-non-list-initialization. For this direct-initialization, user-defined conversions are not considered.

t3nsor commented 1 month ago

I retracted my previous comment.

I don't know if we want to set a precedent of saying "direct-non-list-initialization" everywhere we mean that, because there are probably other places that would need to be changed. Instead maybe we should say somewhere that to direct-initialize an object or reference from an expression E means to initialize it as if its initializer were ( E ).

t3nsor commented 1 month ago

For this particular example, I don't know if it matters whether direct-list-initialization is used to initialize the reference in the last step. If you used direct-list-initialization, you would fall through to [dcl.init.list]/3.9 which just tells you to use direct-initialization anyway.

t3nsor commented 1 month ago

How about in this example:

struct S {
    operator double();
};
int&& r{S()};

Here, the choice of whether to use list-initialization for the last step affects whether it's well-formed, because the conversion from double to int would be narrowing in a list-initialization context.

EDG and GCC accept it. Clang and MSVC reject.
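
To see why the choice of form matters here, it helps to separate the two ingredients. The sketch below reuses S from the example above and adds a hypothetical detection trait, brace_initializable, which is not part of the standard library. It checks only the uncontroversial parts: the copy-initialization form binds, and a plain double to int conversion is narrowing inside braces. The contested declaration is left as a comment because the compilers disagree on it.

```cpp
#include <type_traits>
#include <utility>

struct S {
    operator double();
};

// Hypothetical helper: true when `T{std::declval<U>()}` is a valid expression.
template <class T, class U, class = void>
struct brace_initializable : std::false_type {};

template <class T, class U>
struct brace_initializable<T, U, decltype(void(T{std::declval<U>()}))>
    : std::true_type {};

// Without braces, S -> double -> int is an ordinary implicit conversion,
// so the reference can bind to the resulting temporary.
static_assert(std::is_convertible<S, int&&>::value,
              "int&& r = S(); is well-formed");

// For a plain object, parentheses allow double -> int ...
static_assert(std::is_constructible<int, double>::value,
              "int i(d); is well-formed");
// ... but inside braces the same conversion is narrowing.
static_assert(!brace_initializable<int, double>::value,
              "int{d} is ill-formed: narrowing");

// int&& r{S()};   // contested: EDG and GCC accept, Clang and MSVC reject
```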

jensmaurer commented 1 month ago

This example very much looks like a narrowing situation to me. It should be ill-formed.

ranaanoop commented 1 month ago

@jensmaurer Are you referring to #1, #2, or both? I think both should be ill-formed. Note that to clarify that #1 is ill-formed, we will need to specify which form of direct-initialization (( ) or { }) is used, even though both might eventually lead to the same conclusion. The second issue is that #2 should also be ill-formed (as you said, assuming you were referring to #2), but it is not clear whether it is.

ranaanoop commented 1 month ago

@t3nsor Yes, the alternative is to specify the ( E ) form of direct-initialization directly, as mentioned in the original issue (there without the E).

jensmaurer commented 1 month ago

I'm referring to @t3nsor's latest example:

struct S {
    operator double();
};
int&& r{S()};

ranaanoop commented 1 month ago

@jensmaurer Yes, that is why I included it in the new issue. I agree that it should also be ill-formed.

jensmaurer commented 1 month ago

What does the current wording say about #2 ? Is that well-formed?

ranaanoop commented 3 weeks ago

@jensmaurer #2 which is (X&& ref{z};) seems to be well-formed as explained below.

First, X&& ref{z}; is list-initialization, so we reach dcl.init.list#3:

List-initialization of an object or reference of type cv T is defined as follows:

  • [...]
  • [...]
  • Otherwise, if T is a reference type, a prvalue is generated. The prvalue initializes its result object by copy-list-initialization from the initializer list. The prvalue is then used to direct-initialize the reference. The type of the prvalue is the type referenced by T, unless T is “reference to array of unknown bound of U”, in which case the type of the prvalue is the type of x in the declaration U x[] H, where H is the initializer list.

So first we need to check whether the copy-list-initialization of the result object from the initializer list {z} is valid. In other words, we need to check the validity of X resultObject = {z}; before moving further.

So we go to dcl.init#general-16:

The semantics of initializers are as follows. The destination type is the cv-unqualified type of the object or reference being initialized and the source type is the type of the initializer expression. If the initializer is not a single (possibly parenthesized) expression, the source type is not defined.

  • If the initializer is a (non-parenthesized) braced-init-list or is = braced-init-list, the object or reference is list-initialized ([dcl.init.list]).

So we again move to dcl.init.list, this time to check the validity of X resultObject = {z};:

List-initialization of an object or reference of type cv T is defined as follows:

  • [...]
  • [...]
  • Otherwise, if T is a class type, constructors are considered. The applicable constructors are enumerated and the best one is chosen through overload resolution ([over.match], [over.match.list]). If a narrowing conversion (see below) is required to convert any of the arguments, the program is ill-formed.

This means that X's constructors(including the implicit copy ctor etc) will be enumerated for the argument z. Now we move to over.match:

Overload resolution selects the function to call in seven distinct contexts within the language:

  • [...]
  • [...]
  • invocation of a user-defined conversion for copy-initialization of a class object ([over.match.copy]);

Now, we go to over.match#copy:

Under the conditions specified in [dcl.init], as part of a copy-initialization of an object of class type, a user-defined conversion can be invoked to convert an initializer expression to the type of the object being initialized. Overload resolution is used to select the user-defined conversion to be invoked.

Assuming that “cv1 T” is the type of the object being initialized, with T a class type, the candidate functions are selected as follows:

  • [...]
  • When the type of the initializer expression is a class type “cv S”, conversion functions are considered. The permissible types for non-explicit conversion functions are T and any class derived from T. When initializing a temporary object ([class.mem]) to be bound to the first parameter of a constructor where the parameter is of type “reference to cv2 T” and the constructor is called with a single argument in the context of direct-initialization of an object of type “cv3 T”, the permissible types for explicit conversion functions are the same; otherwise there are none.

In both cases, the argument list has one argument, which is the initializer expression.

This means that the conversion function Z::operator const Y() const can be used to convert the initializer expression z to an rvalue, which will then be used as the argument for one of X's ctors.

The important thing is that only the implicit copy ctor X::X(const X&) is viable here for the rvalue obtained as a result of the conversion function. The implicit move ctor X::X(X&&) is not viable because its parameter is a non-const rvalue reference.


Note that if you change the move ctor's parameter to be const X&& (demo), then both the copy ctor and the move ctor will be viable for the rvalue obtained from the conversion operator, but then the move ctor X::X(const X&&) will be a better match than the copy ctor X::X(const X&) because the argument (the result of the conversion operator) is an rvalue.
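
The viability and ranking rules used in the last two paragraphs can be illustrated with ordinary overloads instead of constructors. In the sketch below, W, implicit_set, const_move_set, make_const_prvalue and w are hypothetical names that do not appear in the thread. W stands in for X, and each overload set stands in for a pair of copy and move constructors.

```cpp
struct W {};

// Mirrors the implicit constructors X(const X&) and X(X&&).
constexpr int implicit_set(const W&) { return 1; }  // "copy"
constexpr int implicit_set(W&&)      { return 2; }  // "move"

// Mirrors the variation where the move ctor takes const X&&.
constexpr int const_move_set(const W&)  { return 1; }  // "copy"
constexpr int const_move_set(const W&&) { return 2; }  // "move"

// A const prvalue, like the result of operator const Y().
constexpr const W make_const_prvalue() { return W{}; }

// A const lvalue, for contrast.
constexpr W w{};

// W&& cannot bind to a const rvalue, so only the "copy" overload is viable.
static_assert(implicit_set(make_const_prvalue()) == 1,
              "only const W& is viable for a const rvalue");

// Both overloads are viable; binding an rvalue reference to an rvalue wins.
static_assert(const_move_set(make_const_prvalue()) == 2,
              "const W&& is the better match for a const rvalue");

// An lvalue can only bind to the lvalue reference.
static_assert(const_move_set(w) == 1,
              "const W& is the only match for an lvalue");
```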

jensmaurer commented 3 weeks ago

The cross-reference in dcl.init.list points to [over.match.list], so I'd go there, not to over.match.copy.

ranaanoop commented 3 weeks ago

@jensmaurer We would have to take both [over.match] and [over.match.list] into account, as dcl.init#list-3.7 (which was applied here) cross-references both of them. By the way, I only omitted mentioning [over.match.list] explicitly in my previous comment because, even after [over.match.list], we would still have to take [over.match] into account.

jensmaurer commented 3 weeks ago

The list you quoted from [over.match] is introductory fluff; it should never be used to select a subclause of [over.match]. Again, please use [over.match.list] in your analysis, because that's what the normative intent is. We always use the rules for list-initialization overload resolution when an initializer list is involved.

ranaanoop commented 3 weeks ago

@jensmaurer I've already used over.match.list in my analysis. I've just not written it explicitly. In particular, [over.match.list] says:

When objects of non-aggregate class type T are list-initialized such that [dcl.init.list] specifies that overload resolution is performed according to the rules in this subclause or when forming a list-initialization sequence according to [over.ics.list], overload resolution selects the constructor in two phases:

  • [...]
  • Otherwise, or if no viable initializer-list constructor is found, overload resolution is performed again, where the candidate functions are all the constructors of the class T and the argument list consists of the elements of the initializer list.

Note the emphasis on "the argument list consists of elements of initializer list" which here will mean that the argument list consists of z which is exactly why in my original analysis I wrote:

This means that X's constructors(including the implicit copy ctor etc) will be enumerated for the argument z.

I thought it was clear that z would be the argument, so I did not quote [over.match.list] explicitly, although I used it in the analysis. It should now be clear why z was taken as the argument there.

Anyways, I've now added a link there in the original comment so that future readers can see why z was used as argument.

ranaanoop commented 3 weeks ago

@jensmaurer There also seems to be a Clang bug reported for this last year. The report mentions different behavior for different C++ versions. The example there is almost the same as our #2.

t3nsor commented 3 weeks ago

I think your analysis is correct. All implementations do accept X x = {z}; and so [dcl.init.list]/3.10 effectively tells us to bind the reference to a temporary that is copy-initialized in the same manner as that hypothetical x.

Basically every time you add a surrounding pair of braces, you enable one additional user-defined function to be called. For example X x = {{z}}; could call 3 user-defined conversions in sequence to convert z to X.

With // #1, you only get to call one user-defined function and the reference must bind to the result, so it's ill-formed.

t3nsor commented 3 weeks ago

The interesting thing about // #1 is that it used to be well-formed before CWG1604 changed the rule from "bind to a copy-initialized temporary" to "bind to the result of the conversion". The purpose of CWG1604 was to eliminate an unnecessary additional copy in the era before guaranteed copy elision. I think we could probably make a useful tweak to the rules here: when you call a conversion function that returns a prvalue in [dcl.init.ref]/5.4.1, adjust the prvalue's cv-qualification to match that of the referenced type.
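
As a rough illustration that the cv-qualification of the returned prvalue is the only thing standing in the way of // #1, the sketch below compares Z with a variant whose conversion function returns a non-const Y. Z2 is a hypothetical type that does not appear in the thread, and std::is_convertible is used as a stand-in for copy-initialization of the reference. This does not implement the proposed tweak; it only shows the case the tweak would bring // #1 in line with.

```cpp
#include <type_traits>

struct X { X() {} };
struct Y : X {};

struct Z {   // as in the original example: yields a const Y prvalue
    operator const Y() const { return {}; }
};

struct Z2 {  // hypothetical variant: yields a non-const Y prvalue
    operator Y() const { return {}; }
};

// X&& cannot bind to the const Y result, so #1 is rejected.
static_assert(!std::is_convertible<Z&, X&&>::value,
              "X&& r = z; is ill-formed when the result is const Y");

// With a non-const result, the reference binds to the Y prvalue directly.
static_assert(std::is_convertible<Z2&, X&&>::value,
              "X&& r = z2; is well-formed when the result is Y");
```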

t3nsor commented 3 weeks ago

There's also the interesting fact that direct-initialization of a reference to T (using parentheses) never uses the rules for direct-initialization of a T object. So, while X x(z); is well-formed (though Clang rejects it, as reported in the bug linked above) the same is not true of X&& x(z). This curiosity was noted during discussion of CWG2709. I don't know why it's like that, but it is what it is. Changing it would require a paper and implementation experience.

ranaanoop commented 3 weeks ago

@t3nsor Note that Clang rejects X x = {z}; (Demo), just as it rejects X&& ref{z};. I think Clang has a problem with X x = {z};, which is why it also rejects X&& ref{z};: the validity of X x = {z}; is part of the validity of X&& ref{z};.

t3nsor commented 3 weeks ago

Oh yes, you're right. But all compilers accept const X& r = z, so there is no rationale for Clang to reject X x = {z};.

t3nsor commented 3 weeks ago

No, wait, I see what the problem is now. It's similar to the issue Richard pointed out: Clang doesn't like X x = {z}; because it thinks the move constructor is also viable. In other words Clang considers there to be an implicit conversion sequence from z to X&& even though the corresponding copy-initialization X&& r = z; would be ill-formed.

ranaanoop commented 3 weeks ago

I also thought this, but then I looked at Clang's error, which suggests that it doesn't consider the move ctor viable:

note: candidate constructor not viable: no known conversion from 'Z' to 'X &&' for 1st argument
7 | X(X&&);
  | ^ ~~~

It also says the same about the copy ctor not being viable. If the error had said something like "ambiguous conversion", we could have concluded that Clang also considers the move ctor to be viable. Maybe the error message is faulty as well.

Note that if we explicitly declare/define the copy ctor without declaring/defining the move ctor then clang starts accepting it as well. Demo

struct X
{
    public:
    X(){}
    X(const X&) // user-declared copy ctor, no move ctor declared
    {
    }
};
X x = {z}; // now Clang accepts this

t3nsor commented 4 days ago

I'm going to assume that's just a problem with Clang's diagnostics.

So, anyway, the issue of whether there's an implicit conversion sequence from z to X&& is a Clang bug. As Richard points out, the standard is not incredibly clear, but on the other hand, he also says: "It seems pretty clear to me that the move constructor should not be viable in this example." Really, the wording issue here is just a particular case of [CWG2525].

Going back to X&& r{z}, that means it should be accepted; Clang only rejects it because of the aforementioned bug.

So we only have two new issues here:

  1. Do we want to say "direct-non-list-initialize" or do we want to say ( E ) or whatever...
  2. Do we want to reverse a possibly unintentional effect of CWG1604 as mentioned here https://github.com/cplusplus/CWG/issues/596#issuecomment-2306048747