llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.93k stars 11.94k forks source link

clang rejects valid `(T())(args...)` function call with pack expansion #64926

Open Eisenwave opened 1 year ago

Eisenwave commented 1 year ago

Code to Reproduce

https://godbolt.org/z/Ef9P8fhP6

template<typename T, typename... Args>
void f(Args... args) {
    (T())(args...);
}

Output

<source>: In function 'void f(Args ...)':
<source>:3:19: error: expected ')' before '...' token
    3 |         (T())(args...);
      |              ~    ^~~
      |                   )

Explanation

(T())(args...) should be parsed as a:

This is how (T())() is interpreted (minus the pack expansion of course), and the same interpretation should apply to (T())(args...).

RealLitb commented 1 year ago

I agree with the above interpretation about (T())(args...), because args... is a call argument list. Wanna focus on (T())() instead:

There was once a discussion on C++ std core about this case

struct A {
   operator int();
   void foo() {
      operator int();
   }
};

Why operator int(); is not parsed as an invalid block-scope declaration of a function with the name operator int. As far as I can see, while many commented, no core issue was created. Now in the case above, where we could have (T())();, why is it not a declaration as well? Of a function with the name T, that illegally returns another function and that misses a return type?

template<typename T, typename... Args>
void f(Args... args) {
    (T())();
}

The intent seems clear, but I'm not sure whether we have the correct formal rules in the C++ spec.

Eisenwave commented 1 year ago

Now in the case above, where we could have (T())();, why is it not a declaration as well? Of a function with the name T, that illegally returns another function and that misses a return type?

It's not a function declaration because as you've said yourself, it's missing a return type.

T(); // not a function declaration
int T(); // function declaration

it's really that simple, and the same rule applies to (T())(), which is exactly the same "ambiguity". There is no actual ambiguity because only one parsing would be valid.

cpplearner commented 1 year ago

Today a function declaration without a specified return type is a nodeclspec-function-declaration, which is not a block-declaration and thus not a declaration-statement, so such a declaration cannot appear within a compound-statement.

RealLitb commented 1 year ago

Today a function declaration without a specified return type is a nodeclspec-function-declaration, which is not a block-declaration and thus not a declaration-statement, so such a declaration cannot appear within a compound-statement.

Thanks, that's what I've been missing!

llvmbot commented 1 year ago

@llvm/issue-subscribers-clang-frontend

cor3ntin commented 1 year ago

Per [dcl.ambig.res]/p2, anything that looks like a type-id is considered one. T() can be a function returning a T and taking no parameter, so it is initially considered as such. From there, (T())(args...) is a cast expression, and that does not allow for pack expansion. It is however not a valid cast expression, so it should not be treated as one.

So, there is probably a bug in clang there, after much thinking

(I edited this mensage, my initial analysis was quite wrong)

RealLitb commented 1 year ago

Per [dcl.ambig.res]/p2, anything that looks like a type-id is considered one. T() can be a function returning a T and taking no parameter, so it is initially considered as such. From there, (T())(args...) is a cast expression, and that does not allow for pack expansion. It is however not a valid cast expression, so it should not be treated as one.

So, there is probably a bug in clang there, after much thinking

(I edited this mensage, my initial analysis was quite wrong)

I don't understand this comment. Wasn't it established already that Clang has a bug in marking it ill-formed? As your link says, the resolution applies only for ambiguities. Unless you can show that the syntax allows it to be a cast expression, the linked text is not relevant for this example.

cor3ntin commented 1 year ago

I don't understand this comment. Wasn't it established already that Clang has a bug in marking it ill-formed? As your link says, the resolution applies only for ambiguities. Unless you can show that the syntax allows it to be a cast expression, the linked text is not relevant for this example.

(T())(foo) can be a cast expression, so we try that first. (T())(foo...) can't be, but we only know that once we reach ..., at which point we need to reparse the whole thing as a postfix expression.

https://reviews.llvm.org/D158718

cor3ntin commented 1 year ago

I had a chat with @AaronBallman and we realize this is https://github.com/cplusplus/papers/issues/1376, and the approach taken in the PR is wrong. The wording to intend T() to always be a type-id even if the expression would be invalid.

As noted by WG21 this is less than useful, however disambiguating packs does not help the other cases such as (int()) + 5, so it seems it is clearly the wrong approach.

I'll close the review, and this is not a bug in clang but rather a potential defect in C++ (although that seems debated).

RealLitb commented 1 year ago

I had a chat with @AaronBallman and we realize this is cplusplus/papers#1376, and the approach taken in the PR is wrong. The wording to intend T() to always be a type-id even if the expression would be invalid.

As noted by WG21 this is less than useful, however disambiguating packs does not help the other cases such as (int()) + 5, so it seems it is clearly the wrong approach.

I'll close the review, and this is not a bug in clang but rather a potential defect in C++ (although that seems debated).

Why is this https://github.com/cplusplus/papers/issues/1376 ? In the linked page, there is an ambiguity, because "+5" is an expression that can be casted to a type using "(type) expression" (and it can also be an addition "0+5"). Therefore the "T()" in that example can be a type-id in its syntactic context.

But as established above, that's not true for "(T())(args...)". Here, the "(args...)" is not simply "invalid" (as to cause semantic analysis to fail after parsing), but that thing cannot even be parsed as an "invalid cast". The "T()" in this case cannot be a type-id in its context.

Am I missing something obvious?

cor3ntin commented 1 year ago

@zygoloid Do you have an opinion?