p4lang / p4c

P4_16 reference compiler
https://p4.org/
Apache License 2.0
680 stars 444 forks source link

newtype declarations and functionsDeclarations will conflict once toplevel functions are supported #1336

Closed SantiagoBautista closed 6 years ago

SantiagoBautista commented 6 years ago

Let us consider the two following p4programs.

Example 1:

typedef bit id;

type id<type> (type x) {
  return x;
}

Example 2:

typedef bit id;

type id id2;

With p4-16 1.1 grammar, both are syntactically correct.

Both have the two-tokens-sequence type id where id is a TYPE_IDENTIFIER and type is the TYPE token; but in one case this corresponds to a nonTypeName followed by a name; and in the other case this corresponds to the type keyword of a newtype declaration. So, when the parser has read the TYPE token and looks ahead to TYPE_IDENTIFIER, it could either reduce a nonTypeName out of TYPE, of shift to recognize a newtype declaration.

Hence, there is a conflict.

This conflict is not yet observable in the compiler, as toplevel functions are not yet implemented. Hence for now, the compiler rejects the first example for the wrong reason: it doesn't expect the opening parenthesis of a function definition.

But when support for toplevel function declarations is added, this conflict will arise.


If it is unclear, let me explain why the definition of function id is syntactically correct in example 1. The first time type appears, it specifies the return type of the function, and this is allowed because the return type can be a nonTypeName, as it can be a type parameter that has not been parsed yet, like here. Then there is id, that will be parsed as a TYPE_IDENTIFIER because of the previous typedef declaration; but that is ok because function names are allowed to be any name, so in particular a typeName. The second time type appears it is a type parameter, that can be a nonTypeName, so in particular type. The third time type appears, it is a typeRef, more specifically a typeName, since it was promoted as a type name by the parsed type parameter. The rest of the function declaration is less surprising.

SantiagoBautista commented 6 years ago

@mbudiu-vmw , this is the conflict I talked to you about in #1335 .

I found it while developing a P4 interpreter. By the way, this shows why having a light-weight p4-interpreter will be useful once it is finished: being easier to modify than a compiler, an interpreter can help to detect problems in the language changes even before we try to modify the compiler.

SantiagoBautista commented 6 years ago

@jnfoster , I believe you could find this conflict interesting.

mihaibudiu commented 6 years ago

The grammar checked-in already contains functions at the top level. Look for functionDeclaration.

SantiagoBautista commented 6 years ago

Yes, I know. What I am saying is that any attempt to transform the current grammar into an LR(1) parser will produce a shift/reduce conflict.

SantiagoBautista commented 6 years ago

Toplevel functions are supported by the current grammar, but not by the current compiler implementation: if you look for functionDeclaration in the compiler implementation you will see that currently they can only appear in objDeclaration (line 389) but not in top-level declarations (line 270).

Hence, the conflict I describe is already present in the grammar, but will not be observable in the implementation until toplevel function declarations are supported. That is what I meant.

mihaibudiu commented 6 years ago

I actually had tested this in a branch, but this was before merging the type PR. Do you have a proposal on how to fix this?

SantiagoBautista commented 6 years ago

I can think of two solutions:

mihaibudiu commented 6 years ago

Let's try the first solution. We don't control all the programs that people wrote.

SantiagoBautista commented 6 years ago

I guess there is no reason to keep this issue open.

SantiagoBautista commented 6 years ago

I just realized that there is another solution for this problem that might be better: In the typeOrVoid non-terminal, replace nonTypeName by IDENTIFIER. The advantage is that this way the TYPE keyword can be put back with the other keywords into nonTypeName , and can hence be used as a nonTypeName in more places in a program. The drawback is that the keywords apply, key, actions and state could not be used as return type for functions when using them as type variables.

What do you think?

mihaibudiu commented 6 years ago

So you make the grammar prettier but reject more programs? I prefer to optimize the opposite way.

SantiagoBautista commented 6 years ago

There are some programs that are rejected by the solution we took and that are not with this other solution: the ones that want to use type as a nonTypeName (for example as the name of an extern, as a type argument, or as an lvalue).

So both solutions reject programs, and the question is which set of programs are more likely to be written.

jnfoster commented 6 years ago

OTOH, using "apply", "key", etc. as a type variables seems mostly useful for entries to the Obfuscated P4_16 Programming Contest :-)

On Thu, Jun 14, 2018 at 1:23 PM, Mihai Budiu notifications@github.com wrote:

So you make the grammar prettier but reject more programs? I prefer to optimize the opposite way.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/p4lang/p4c/issues/1336#issuecomment-397373738, or mute the thread https://github.com/notifications/unsubscribe-auth/ABwi0nLWTgKJiLzGsm3FOLE71TBlLWZiks5t8pwGgaJpZM4UlK7U .

mihaibudiu commented 6 years ago

In practice neither of them will be probably written.

SantiagoBautista commented 6 years ago

Both solutions reject programs that will probably (and hopefully) never be written; and one of them makes the grammar prettier (which can avoid future problems) :)

mihaibudiu commented 6 years ago

The feel free to submit a PR with these changes and we'll take a look.

SantiagoBautista commented 6 years ago

Did at #1344 and p4-spec#652

SantiagoBautista commented 6 years ago

1344 and p4-spec#652 was merged, so I will close this issue :)