CWG2679 [over.best.ics.general]/2: Definition of implicit conversion sequence #212

Open leni536 opened 1 year ago

leni536 commented 1 year ago

Full name of submitter: Lénárd Szolnoki

Reference (section label): [over.best.ics.general]/2

Issue description

There are several issues with the wording of [over.best.ics.general]/2:

Implicit conversion sequences are concerned only with the type, cv-qualification, and value category of the argument and how these are converted to match the corresponding properties of the parameter.

Problem 1: Implicit conversion sequences are not concerned with whether the argument is a null pointer constant

From the wording it follows that implicit conversion sequences are not concerned with whether an argument is a null pointer constant. Under this reading, a function call like foo(0) should not find void foo(void*) viable, as there is no implicit conversion sequence from a prvalue of type int to void *.

This conflicts with implementations in major compilers:

void foo(void *, int); // 1
void foo(short, short); // 2

void bar() {
    foo(0, 1); // calls first overload
}

https://godbolt.org/z/E8zME93cc
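
The role of the null pointer constant can be isolated with a smaller compile-time check. This is a minimal sketch, assuming a C++11 or later compiler; the tag types, the take overload set, and the zero helper are hypothetical names introduced only for illustration. The overload chosen is read off the return type, so nothing needs to be defined or executed.

```cpp
#include <type_traits>

struct PointerOverload {};   // hypothetical tag: returned by the void* overload
struct EllipsisOverload {};  // hypothetical tag: returned by the fallback

PointerOverload  take(void*);
EllipsisOverload take(...);

int zero();  // hypothetical helper: a prvalue of type int that is not a literal

// The literal 0 is a null pointer constant, so take(void*) is viable and
// beats the ellipsis conversion.
static_assert(std::is_same<decltype(take(0)), PointerOverload>::value,
              "literal 0 converts to void*");

// A prvalue of type int that is not a null pointer constant has no implicit
// conversion to void*, so only the ellipsis overload is viable.
static_assert(std::is_same<decltype(take(zero())), EllipsisOverload>::value,
              "a non-literal int does not convert to void*");
```

Both arguments have the same type, cv-qualification, and value category, yet they select different overloads, which is the property the quoted sentence does not account for.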

Problem 2: The description of conversion from the argument to the parameter is imprecise

The wording describes that each of cv, type and value category is converted to match the "corresponding property" of the parameter (so cv, type and ... parameters don't have value category). I don't think the standard defines how cv, type and value category are converted individually. It particularly does not make much sense for value category.

Problem 3: The paragraph only covers arguments that have cv-qualification, type and value category

Some arguments are not expressions. Initializer-lists lack all three of these properties.

An id-expression to a function or an address of an id-expression to a function may not have a definitive type, due to overload sets.
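
Both kinds of argument can be shown in a short sketch, assuming a C++11 or later compiler. Point, h, take_fn, take_point, and the tag types are hypothetical names introduced for illustration only.

```cpp
#include <type_traits>

struct Point { int x; int y; };  // hypothetical aggregate

int    h(int);                   // hypothetical overload set
double h(double);

struct FromOverloadSet {};
struct FromBracedList {};

FromOverloadSet take_fn(int (*)(int));
FromBracedList  take_point(Point);

// Neither decltype(h) nor decltype({1, 2}) is well-formed: the overload set
// has no single type, and a braced-init-list is not an expression at all.
// Both are nevertheless usable as arguments, because the conversion is
// determined from the parameter type.
static_assert(std::is_same<decltype(take_fn(h)), FromOverloadSet>::value,
              "h is resolved against the parameter type int (*)(int)");
static_assert(std::is_same<decltype(take_point({1, 2})), FromBracedList>::value,
              "{1, 2} list-initializes the Point parameter");
```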

Edits

  1. I edited the description to include other problems with the wording besides the null pointer constant one. I deleted my suggested resolution, as it does not tackle all of these problems.

ecatmur commented 1 year ago

Since [over.ics.list] is in this clause, we should probably cover arguments of the form initializer-list as well.

I'm slightly worried that [over.ics.ref]/4 doesn't consider null pointer constants; clearly void*&& is expected to bind to literal 0 and not to a prvalue int with value 0.

Scanning [conv], I think that null pointer constants are all we have to worry about from that clause.
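
The reference-binding case can be checked the same way. This is a sketch only, assuming a C++11 or later compiler; sink, zero, and the tag types are hypothetical names.

```cpp
#include <type_traits>

struct BoundToReference {};  // hypothetical tag
struct Fallback {};          // hypothetical tag

BoundToReference sink(void*&&);
Fallback         sink(...);

int zero();  // hypothetical helper: an int prvalue that is not a literal

// The literal 0 initializes a temporary void*, to which the rvalue
// reference then binds.
static_assert(std::is_same<decltype(sink(0)), BoundToReference>::value,
              "void*&& binds to the literal 0");

// An ordinary int prvalue cannot initialize a void*, so the reference
// overload is not viable.
static_assert(std::is_same<decltype(sink(zero())), Fallback>::value,
              "void*&& does not bind to a non-literal int");
```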

Completeness: well, MSVC gets this wrong, but everyone else correctly finds it ambiguous:

struct A;
void foo(A);
void foo(A&);
void bar(A& a) {
    foo(a);
}

Accessibility: fine, compilers ignore it and error later (or not). Ambiguity: fine, handled specially.

Virtualness: MSVC incorrectly accepts the following, but at least it resolves the overload correctly. Everyone else correctly refuses the pointer-to-member to pointer-to-virtual-base-member conversion:

struct B { int i; };
struct D : virtual B {};
void f(int D::*);
void f(...);
void g(int B::*p) {
    f(p);
}

chilippso commented 1 year ago

The example code favors the first overload because the 2nd argument (integer literal -> int -> Identity -> Exact Match) does not need to be converted to short, as it would be for the second overload (Integral conversions -> Conversion); i.e. it has a better rank.

If you explicitly cast the 2nd argument to short, the second overload gets selected because of the identity match for the 2nd parameter. There is of course no ambiguity: the first overload (with an int as second parameter) would require an integral promotion, which has rank Promotion, and the identity match with short is still better (Exact Match).

If you change the 2nd parameter of the second overload from short to int, you will get an ambiguity, because the conversion sequences for the first argument are indistinguishable: converting 0 to short and converting 0 to void* are both conversions and have the same rank. The outcome depends only on the second parameter.

If all shorts were changed to int, the second overload would be selected, since it would be an identity conversion (Exact Match) for both parameters. If the first overload changed its second parameter from int to short, there would be ambiguity again, since the compiler cannot tell which conversion is "better": converting the first argument to void* or the second to short (both have rank: Conversion).

Therefore I do not see that this is an issue or related to "null-pointers".
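
The unambiguous cases in the rank analysis above can be checked at compile time. This is a minimal sketch, assuming a C++11 or later compiler; the tag types and the all_int overload set are hypothetical names, and the ambiguous variants are left out because an ambiguous call cannot be named in a plain static_assert.

```cpp
#include <type_traits>

struct FirstOverload {};   // hypothetical tag for the void* overloads
struct SecondOverload {};  // hypothetical tag for the arithmetic overloads

// The overload set from the issue description.
FirstOverload  foo(void*, int);
SecondOverload foo(short, short);

// Argument 1 is rank Conversion for both overloads; argument 2 is an
// Exact Match only for the first overload.
static_assert(std::is_same<decltype(foo(0, 1)), FirstOverload>::value,
              "int literal prefers the int parameter");

// With a short second argument, the second overload has the Exact Match
// and the first needs a Promotion.
static_assert(std::is_same<decltype(foo(0, static_cast<short>(1))),
                           SecondOverload>::value,
              "short argument prefers the short parameter");

// Variant with every short replaced by int.
FirstOverload  all_int(void*, int);
SecondOverload all_int(int, int);

static_assert(std::is_same<decltype(all_int(0, 1)), SecondOverload>::value,
              "two Exact Matches beat a Conversion");
```

Note that every one of these outcomes presupposes that foo(void*, int) is viable for the argument 0 in the first place, which is the part that depends on 0 being a null pointer constant.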

ecatmur commented 1 year ago

So I would word [over.best.ics.general]/2 like this:

Other than when the argument is a null pointer constant or an initializer list (see below), implicit conversion sequences are concerned only with [...]

As for [over.ics.ref]/4, I think the null pointer constant conversion is subsumed by [over.ics.ref]/2, so I don't think a change is needed.

ecatmur commented 1 year ago

@leni536 you might want to change the issue title - it shows as "[section.label] Proposed issue title #212". The edit button is on the top right, next to the green "New issue" button.

leni536 commented 1 year ago

So I would word [over.best.ics.general]/2 like this:

Other than when the argument is a null pointer constant or an initializer list (see below), implicit conversion sequences are concerned only with [...]

Well, my other gripe with the original wording is that it completely mischaracterizes how conversions happen. It describes that each of cv, type and value category is converted to match the "corresponding property" of the parameter (so cv, type and ... parameters don't have value category). I don't think the standard defines how cv, type and value category are converted individually. It particularly does not make much sense for value category.

What actually happens is roughly like this:

  1. If the argument is an expression (as opposed to an initializer-list for example), then certain properties are stripped from the expression for further analysis (for example if the argument is a bit-field, it's not treated as a bit-field for further analysis).
  2. Copy initialization is considered from the argument to the type of the parameter, without considering accessibility of any user-defined conversions, and whether the selected user-defined conversion is deleted.
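
The second step can be illustrated with a deleted conversion function. The following is a sketch under stated assumptions rather than a statement of what the wording requires: it assumes a C++11 or later compiler, and Src, Plain, h, can_call_h, and the tag types are hypothetical names.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical source type whose only conversion to int is deleted.
struct Src {
    operator int() const = delete;
};
struct Plain {};  // hypothetical type with no conversion to int at all

struct ViaInt {};
struct ViaEllipsis {};

ViaInt      h(int);
ViaEllipsis h(...);

// Detects whether the complete call h(T) is well-formed.
template <class T, class = void>
struct can_call_h : std::false_type {};
template <class T>
struct can_call_h<T, decltype(void(h(std::declval<T>())))> : std::true_type {};

// Plain has no conversion sequence to int, so h(...) is selected and the
// call is well-formed.
static_assert(can_call_h<Plain>::value, "falls back to the ellipsis");

// For Src the deleted conversion function still yields a conversion
// sequence, which beats the ellipsis. h(int) is selected and the call is
// then ill-formed, instead of falling back to h(...).
static_assert(!can_call_h<Src>::value, "deleted conversion is still chosen");
```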

Anyway, my point is that based on this reasoning I don't find your suggested wording satisfactory, as it leaves the original imprecise wording regarding the conversion.

Maybe this can be treated as a separate issue though, but it's too much in the same area for me to ignore.

ecatmur commented 1 year ago

Hm, yeah, AIUI a parameter, not being an expression, doesn't have a value category (it has a type, which may be reference-qualified as well as cv qualified). Perhaps you might edit the original issue description to include your "other gripe"? It's fine to present two closely-related issues, especially as a single wording change would address both; CWG will split them up if they feel appropriate.
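
A short sketch of that distinction, assuming a C++11 or later compiler (bind, lvalue_int, prvalue_int, and the tag types are hypothetical names): the value category belongs to the argument expression, and what it is matched against is the reference qualification of the parameter type.

```cpp
#include <type_traits>
#include <utility>

struct LvalueRefParam {};  // hypothetical tag
struct RvalueRefParam {};  // hypothetical tag

LvalueRefParam bind(int&);
RvalueRefParam bind(int&&);

int& lvalue_int();   // hypothetical helper: the call is an lvalue
int  prvalue_int();  // hypothetical helper: the call is a prvalue

// Same argument type (int) in all three calls; only the value category
// of the argument expression differs.
static_assert(std::is_same<decltype(bind(lvalue_int())), LvalueRefParam>::value,
              "an lvalue binds to int&");
static_assert(std::is_same<decltype(bind(prvalue_int())), RvalueRefParam>::value,
              "a prvalue binds to int&&");
static_assert(std::is_same<decltype(bind(std::declval<int>())), RvalueRefParam>::value,
              "an xvalue binds to int&&");
```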

Being cheeky, perhaps you could also include my point about initializer-list arguments not having type, cv or value category.

leni536 commented 1 year ago

Yes, I will edit the description to list all the issues with [over.best.ics.general]/2.

In the meantime I found implementation divergence for a weird corner case (assuming Itanium and MSVC x86_64 ABIs):

struct S {
    long i:16;
};

void foo(char); // 1
void foo(int); // 2

void bar() {
    foo({(short)0}); // calls 2, promotion is better match than conversion
    foo({S{}.i}); // calls 2?
}

https://godbolt.org/z/MK9xWWPTe

So the argument is an initializer-list with a single bit-field element. The argument is not an expression, so [over.best.ics.general]/2 may or may not apply. S{}.i as a bit-field promotes to int ([conv.prom]/5), which would be a better match than conversion to char.

If a proposed wording treats initializer-lists separately from expressions, then it should potentially deal with this issue.
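
Outside of overload resolution, the bit-field promotion itself can be observed directly. This is a hedged sketch, assuming GCC or Clang with a 32-bit int and a C++11 or later language mode; other implementations may report a different promoted type.

```cpp
#include <type_traits>

struct S {
    long i : 16;
};

// decltype of the member itself yields the declared type.
static_assert(std::is_same<decltype(S::i), long>::value,
              "the declared type of the bit-field is long");

// Unary + applies the integral promotions. A 16-bit bit-field fits in int,
// so the result is int and not long (assumption: int is at least 17 bits).
static_assert(std::is_same<decltype(+S{}.i), int>::value,
              "the bit-field promotes to int");
```

The divergence shown above arises from whether overload resolution ranks the argument by its declared type or by this promoted type.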

Other expressions I would like to explore are id-expressions to functions or address of functions, which don't have a definitive type.

So in my mental model the possible arguments are from three distinct categories: expressions (excluding overload sets), initializer-lists and overload sets.

ecatmur commented 1 year ago

Oh dear. There's divergence even without introducing initializer-lists:

struct S {
    long i:1;
};
void foo(signed char);
void foo(int);
void bar() {
    foo(S{}.i);
}

Either [over.best.ics.general]/2 applies, and this is ambiguous (gcc, MSVC), or [conv.prom]/5 applies, and foo(int) is preferred (clang).

jensmaurer commented 1 year ago

The bit-field issue is the subject of CWG2485.

The wording in [over.best.ics.general] p2 will be changed by CWG2525; please reconsider the issues presented herein in light of that resolution.

jensmaurer commented 1 year ago

CWG2679

chilippso commented 1 year ago

I still don't agree that (current) [over.best.ics.general]/2

claims that only the type, cv-qualification, and value category of the argument are relevant in forming an implicit conversion sequence.

[over.best.ics.general]/2 states:

Implicit conversion sequences are concerned only with the type, cv-qualification, and value category of the argument and how these are converted to match the corresponding properties of the parameter.

Looking up the definition of concerned adjective from the Oxford Advanced Learner's Dictionary there are 4 meanings for "concerned":

  1. worried and feeling concern about something/somebody
  2. affected by something; involved in something
  3. concerned with something: interested in something; dealing with something
  4. be concerned to do something (formal): to think it is important to do something

At this point, I think we can all agree that the first and fourth definitions are not applicable in our case, since the paragraph is neither about feelings nor about "being concerned to do something". That leaves the second and third definitions for discussion. Note that the dictionary itself places the emphasis on the third definition (concerned with something).

According to my perception, this third definition / meaning is also the definition and intentional meaning for this paragraph. If we substitute "concerned" with that definition / meaning, it would state:

Implicit conversion sequences are interested only in the type, cv-qualification, and value category of the argument and how these are converted to match the corresponding properties of the parameter.

This makes a huge difference to the meaning of the sentence with the second definition / meaning, which would state:

Implicit conversion sequences are affected only by the type, cv-qualification, and value category of the argument and how these are converted to match the corresponding properties of the parameter.

Being concerned with only "one thing" (or more things) does not mean that you have to restrict yourself to only that one thing. It might just mean being interested in only a particular thing, such as converting an argument that has a type, a cv qualification, and a value category to match a corresponding parameter, without paying attention to additional constraints.

And that is obviously the point of and the reason for the following Note [1]:

Other properties, such as the lifetime, storage class, alignment, accessibility of the argument, whether the argument is a bit-field, and whether a function is deleted, are ignored. So, although an implicit conversion sequence can be defined for a given argument-parameter pair, the conversion from the argument to the parameter might still be ill-formed in the final analysis.

The crux of the matter is the logical AND combination together with "these". The conversion sequence is interested only in the type, cv qualification and value category and how the argument together with and based on its type, cv qualification and value category can be transformed to match the parameter. This restriction is a relaxation of constraints, since it does not take into account possible additional aspects that could limit this conversion due to e.g. "lifetime, storage class, alignment, accessibility of the argument, whether the argument is a bit-field, and whether a function is deleted". It is somehow "including but not limited to" the type, cv qualification and value category.

So if there is anything to change at this point, I would suggest replacing "these are" with "the argument is", as follows:

Implicit conversion sequences are concerned only with the type, cv-qualification, and value category of the argument and how the argument is converted to match the corresponding properties of the parameter.

At least this is the conclusion I (and obviously most compilers) come to.


By the way, and despite the fact that it does not necessarily say the truth and nothing but the truth, ChatGPT also confirms:

Question:

What is the difference between "being only interested in some things" and "being interested only in some things"?

Answer:

There is a difference in emphasis between "being only interested in some things" and "being interested only in some things."

"Being only interested in some things" means that the person is interested in a limited number of things and nothing else.

"Being interested only in some things" means that the person is interested in those specific things, but they may also be interested in other things. The focus is on the specific things that the person is interested in, rather than the exclusivity of their interests.

For example, "I am only interested in sports and video games" means that the person is not interested in anything else. On the other hand, "I am interested only in sports and video games" means that the person is interested in those two things, but they may also be interested in other things.

Question:

Does this also hold true if "being interested in" is substituted with "being concerned with"?

Answer:

Yes, the same distinction between emphasis on exclusivity versus specificity applies when using "being concerned with" instead of "being interested in."

For example, "I am only concerned with sports and video games" means that the person is not concerned with anything else. On the other hand, "I am concerned only with sports and video games" means that the person is concerned with those two things, but they may also be concerned with other things.

In both cases, the use of "only" serves to limit the scope of what the person is interested in or concerned with. When "only" is placed before the verb, it emphasizes the exclusivity of the person's interests or concerns. When "only" is placed after the verb, it emphasizes the specificity of the person's interests or concerns.

ecatmur commented 1 year ago

The crux of the matter is the logical AND combination together with "these". The conversion sequence is interested only in the type, cv qualification and value category and how the argument together with and based on its type, cv qualification and value category can be transformed to match the parameter. This restriction is a relaxation of constraints, since it does not take into account possible additional aspects that could limit this conversion due to e.g. "lifetime, storage class, alignment, accessibility of the argument, whether the argument is a bit-field, and whether a function is deleted". It is somehow "including but not limited to" the type, cv qualification and value category.

If the Standard means "including but not limited to" it should say so.

At least this is the conclusion I (and obviously most compilers) come to.

The compilers are not following the clear meaning of this paragraph, as indeed it conflicts with the rules succeeding.

By the way, and despite the fact that it does not necessarily say the truth and nothing but the truth, ChatGPT also confirms:

The reasoning that ChatGPT is parroting applies to the placing of "only" before the object of a transitive verb versus between the object and a relative clause. The wording in question appears superficially similar to such cases, but in fact "concerned with" is a particle verb, so there is no relative clause here to be qualified by "only".

jensmaurer commented 1 year ago

This is getting a bit... tangential, it seems.

I think it's obvious that this paragraph says the wrong thing for 0 literals, initializer-lists and (arguably) overload sets. Does CWG2679 adequately reflect the concerns here?

ecatmur commented 1 year ago

It's good as it stands, but I think Lénárd would also point out that:

leni536 commented 1 year ago

On 10 January 2023 15:55:26 GMT, Ed Catmur @.***> wrote:

It's good as it stands, but I think Lénárd would also point out that:

  • "how these are converted" could be read as implying that the properties are converted individually and one at a time, whereas in fact the conversion is holistic; and

  • in any case, a parameter does not have a value category (value category is a property of an expression https://wg21.link/basic.lval#def:value_category), so "how [the type, cv-qualification, and value category of the argument] are converted to match the corresponding properties of the parameter" is nonsensical in that regard.

Indeed, thank you. I didn't have time to properly follow up yet.

Cheers, Lénárd