GabrielDosReis / ipr

Compiler-neutral Internal Program Representation for C++
BSD 3-Clause "New" or "Revised" License

How to represent `if (++x; int z = f(x)) body;` (how to deal with init-statement) #103

Closed GorNishanov closed 2 years ago

GorNishanov commented 3 years ago

C++17 added an optional init-statement to if and switch (the classic for always had one), and C++20 extended it to the range-based for:

if ( init-statement_opt condition ) statement
switch ( init-statement_opt condition ) statement
for ( init-statement condition_opt ; expression_opt ) statement
for ( init-statement_opt for-range-declaration : for-range-initializer ) statement

The motivation was to allow an extra declaration before the condition but, because existing grammar elements were reused :-), it also ended up allowing arbitrary expression statements to appear there.
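For readers less familiar with the feature, here is a minimal sketch of the two init-statement alternatives in an if statement. The function `f` and the wrapper names are placeholders invented for illustration; only the `if (++x; int z = f(x))` shape comes from the issue title.

```cpp
#include <cassert>

// Placeholder for the f(x) used in the issue title.
inline int f(int x) { return x * 2; }

// init-statement is a simple-declaration: the motivating use case.
inline int with_declaration(int x) {
    if (int y = x + 1; int z = f(y))
        return z;   // y and z are both in scope in the branch
    return -1;
}

// init-statement is an expression-statement: also accepted by the grammar.
inline int with_expression(int x) {
    if (++x; int z = f(x))
        return z;   // z is computed from the already incremented x
    return -1;
}

// Small usage check of both forms.
inline void init_statement_usage() {
    assert(with_declaration(1) == 4);
    assert(with_expression(1) == 4);
    assert(with_expression(-1) == -1);  // x becomes 0, so z == 0 and the branch is skipped
}
```

Both forms compile as C++17; the second one is the case the rest of this thread is trying to represent.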

What is the proper way to represent `if (++x; int z = f(x)) body;` in IPR?

If_then(Expr, Stmt)

What should the Expr argument to the If_then constructor be?

During our discussions, we considered changing Region to be an expression (not a Node, as it is today) and making a region a sequence of statements. A region could then act as the condition in such an if statement.

Xazax-hun commented 3 years ago

I'd like to leave here a curious example (or even abuse) of the new syntax, that IPR also needs to be able to represent.

    if (struct X { operator bool() { return true; }} x; x)
        ...

GabrielDosReis commented 3 years ago

@GorNishanov - could you summarize here the analysis you presented to me and the conclusion you arrived at? Thanks!

GorNishanov commented 3 years ago

I think my initial suggestion of making a region the condition of an if statement is sub-optimal and user-hostile. :-) An IPR user who examines the condition() of an If_stmt should encounter one of:

1) an expression, as in `if (x < 5) ...`
2) a declaration, as in `if (int x = f()) ...`
3) an Expr_list*(++i; int j = i), as in `if (++i; int j = i) ...`

Regions are created as needed and can be discovered by querying the home/lexical region. They don't need to be promoted to the level of a condition of an if statement.

Now, for case 3, I wrote Expr_list* to indicate something similar to, but not exactly, an Expr_list, because Expr_list does not have the correct type and value for this case. The type of an Expr_list is a Product of the types of all the elements of the list, and the value of an Expr_list is a list of the values of all the elements. For case 3, we need a list or a pair whose type is the type of the last element and whose value is the value of the last element.
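The difference between the two semantics can be illustrated with an analogy in plain C++ (this is not IPR code, and the function names are invented for the illustration): a tuple keeps every element in its type and value, while the built-in comma operator keeps only the last operand.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Product-like result: every element contributes to the type and the value,
// which is how the comment above describes Expr_list.
inline std::tuple<int, int> as_product(int i) {
    int first = ++i;
    int second = i * 2;
    return std::make_tuple(first, second);
}

// Sequence-like result: the first operand is evaluated for its side effect
// only; the type and the value are those of the last operand.
inline int as_sequence(int i) {
    return (++i, i * 2);
}

static_assert(std::is_same<decltype(as_sequence(0)), int>::value,
              "the sequence has the type of its last element");

// Small usage check contrasting the two results.
inline void product_vs_sequence_usage() {
    assert(std::get<0>(as_product(1)) == 2);
    assert(std::get<1>(as_product(1)) == 4);
    assert(as_sequence(1) == 4);
}
```

Case 3 needs the second behavior, which is why a plain Expr_list is not a fit.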

Possible candidates are:

- Stmt_list: allows an arbitrary number of elements, but restricts its elements to statements (due to the chosen name).
- Semicolon (binary): limited to just two operands, but allows a generalized Expr as each operand. It is also more precise in targeting this use case.

My current leaning is towards an operator Semicolon.

GorNishanov commented 3 years ago

After a discussion, we are proceeding with an ipr::Semicolon expression (not a classic expression), with the caveat that it is meant to be used only inside conditional expressions.

Xazax-hun commented 3 years ago

To summarize some offline (from the point of view of GitHub) discussions:

GabrielDosReis commented 3 years ago

@GorNishanov , @Xazax-hun -

a declaration, as in if (int x = f())

To declare x, you create a subregion of the Region that contains the if statement: you use it to declare_var(), and the resulting Var is the condition of the if statement. Note that the consequence and alternative branches of the if statement would need that region as their enclosing() region. You don't need that intervening region if the condition does not introduce any declaration.
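The region layout described above can be pictured with a toy model. Everything in the `toy` namespace below is a hypothetical stand-in written for this illustration; it borrows the words subregion, declare_var and enclosing from the comment but is not the IPR API and makes no claim about its real signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace toy {   // hypothetical model, not the IPR library
    struct Region;

    struct Var {
        std::string name;
        const Region* home;   // region in which the variable is declared
    };

    struct Region {
        const Region* enclosing = nullptr;
        std::vector<std::unique_ptr<Var>> vars;
        std::vector<std::unique_ptr<Region>> subregions;

        Region* make_subregion() {
            subregions.push_back(std::make_unique<Region>());
            subregions.back()->enclosing = this;
            return subregions.back().get();
        }

        Var* declare_var(std::string name) {
            vars.push_back(std::make_unique<Var>(Var{std::move(name), this}));
            return vars.back().get();
        }
    };

    struct If_stmt {
        const Var* condition;        // the declared variable acts as the condition
        const Region* then_region;   // region of the consequence branch
    };

    // Models `if (int x = f()) { ... }` appearing directly in `outer`.
    inline If_stmt model_if_with_declaration(Region& outer) {
        Region* cond_region = outer.make_subregion();   // intervening region
        Var* x = cond_region->declare_var("x");
        Region* then_region = cond_region->make_subregion();
        return If_stmt{x, then_region};
    }

    // Small usage check of the resulting nesting.
    inline void region_usage() {
        Region outer;
        If_stmt s = model_if_with_declaration(outer);
        assert(s.condition->home->enclosing == &outer);
        assert(s.then_region->enclosing == s.condition->home);
    }
}
```

In this model the branch region is nested inside the region that holds `x`, which is in turn nested inside the region containing the if statement. When the condition declares nothing, the intervening region would simply be omitted.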

I will address the horror in the init-statement abomination in a separate comment.

GabrielDosReis commented 3 years ago

To add to the summary of @Xazax-hun: the issue is that, in the grammar production selection-statement, the sub-production init-statement has a reasonable alternative (the simple-declaration alternative) and an abomination (the expression-statement alternative). The combination of the simple-declaration and condition grammar productions is well understood and makes logical sense as a SuchThat node. On the other hand, the combination of the expression-statement and condition grammar productions forms a programming horror, which I am inclined to represent as Grammar_reuse_horror. The last piece of the puzzle is how this works with range-based for. The SuchThat node class can't be used in that context, and I think we should resist the temptation of reflecting grammar horrors into the semantics representation as is -- no grammar production is safe while the committee is in session.

BjarneStroustrup commented 3 years ago

"No man's life, liberty, or programming language is safe while the committee is in session" (apologies to Mark Twain).

On 6/2/2021 1:01 AM, Gabriel Dos Reis wrote:

no grammar production is safe while the committee is in session.

GabrielDosReis commented 2 years ago

Resolution: all these constructs

if ( init-statement_opt condition ) statement
switch ( init-statement_opt condition ) statement
for ( init-statement_opt for-range-declaration : for-range-initializer ) statement

where the init-statement is present are to be represented by nodes of the ipr::Where interface, where the main() expression is the corresponding construct without the init-statement and the attendant() expression is the init-statement. Furthermore, when the init-statement introduces bindings (e.g. the attendant() is of type ipr::Scope), the actual implementation node class is ipr::impl::Where. The other alternative is handled by the implementation node class ipr::impl::Where_no_decl.
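As a rough picture of this decomposition, here is a toy model applied to the example from the issue title. The `toy::Where` struct, its string fields, and the helper functions are assumptions made for illustration; the real ipr::Where exposes main() and attendant() as nodes, and its actual declaration is not reproduced here.

```cpp
#include <cassert>
#include <string>

namespace toy {   // hypothetical model, not the IPR library
    // The two parts of the split are kept as source text only.
    struct Where {
        std::string attendant;     // the init-statement
        std::string main;          // the construct without the init-statement
        bool attendant_declares;   // does the init-statement introduce bindings?
    };

    // Decomposition of `if (++x; int z = f(x)) body;` as the resolution describes it.
    inline Where title_example() {
        return Where{"++x;", "if (int z = f(x)) body;", false};
    }

    // Which implementation class the resolution assigns to a given split.
    inline std::string node_class(const Where& w) {
        return w.attendant_declares ? "ipr::impl::Where"
                                    : "ipr::impl::Where_no_decl";
    }

    // Small usage check on the title example.
    inline void where_usage() {
        Where w = title_example();
        assert(w.attendant == "++x;");
        assert(node_class(w) == "ipr::impl::Where_no_decl");
    }
}
```

With an init-statement such as `int y = g();` the flag would be true and, per the resolution, the node would be an ipr::impl::Where instead.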