p4lang / p4-spec

Apache License 2.0
175 stars 80 forks source link

Question: P4-16 headerUnionDeclaration: definition / appearance #286

Closed SmartHoneybee closed 6 years ago

SmartHoneybee commented 7 years ago

The definition of derivedTypeDeclaration in p4-16/spec/P4-16-spec.mdk mentions headerUnionDeclaration twice, the definition in p4-16/spec/grammar.mdk only once. Why?

In p4-16/spec/P4-16-spec.mdk headerUnionDeclaration is defined as:

headerUnionDeclaration
    : optAnnotations HEADER_UNION name { structure.declareType(*$3); }
      '{' structFieldList '}'          { $$ = new IR::Type_Union(@3, *$3, $1, *$6); }
    ;

but in p4-16/spec/grammar.mdk as:

headerUnionDeclaration
    : optAnnotations HEADER_UNION name
      '{' structFieldList '}'
    ;

I'm guessing that the latter is the "correct definition". Is that correct?

Thanks.

jnfoster commented 7 years ago

Indeed. That looks like a typo (the actual Bison code got copied into the Markdown).

SmartHoneybee commented 7 years ago

Thank you for responding.

I found some other potential typos while studying the grammar and would appreciate feedback if my interpretation is right.

mihaibudiu commented 7 years ago

We should fix these bugs. I will submit a PR to correct them. Thank you.

jafingerhut commented 7 years ago

Thanks, @mbudiu-vmw for the fixes. There is one remaining occurrence of "derivedTypeDeclarationame" in file grammar.mdk that should be replaced with "derivedTypeDeclaration".

SmartHoneybee commented 7 years ago

Awesome, thank you for the timely responses.

When is the next P416 specification release scheduled? (I.e. when will those changes become part of a published specification version?)

SmartHoneybee commented 7 years ago

Actually, I just stumbled over something else: in section 6.3.1. name is define as:

name
    : nonTypeName
    | TYPE
    ;

but the P4 grammar defines it as:

name
    : nonTypeName
    | TYPE
    | ERROR
    ;
mihaibudiu commented 7 years ago

I believe that as soon as the change is merged a new version is pushed automatically. Since these are just bug-fixes we don't need a more complicated process.

jnfoster commented 7 years ago

That's not quite right. We allow the repository to be updated automatically when any of the key committers merge a pull request, but the official spec is only updated by hand. We could consider having a pointer to the "live" version of the spec, but the current web page only lists version 1.0.0.

http://p4lang.github.io/p4-spec/

I'd like to release a minor typo/bugfix version later this summer. There are a few other fixes in the works. Perhaps we'll do so after those converge?

-N

On Thu, Jun 15, 2017 at 9:42 AM, Mihai Budiu notifications@github.com wrote:

I believe that as soon as the change is merged a new version is pushed automatically. Since these are just bug-fixes we don't need a more complicated process.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/p4lang/p4-spec/issues/286#issuecomment-308798608, or mute the thread https://github.com/notifications/unsubscribe-auth/ABwi0qoS56YMeTCFgFxRSAfAjpOcg6kqks5sEV8OgaJpZM4N1NhE .

SmartHoneybee commented 7 years ago

A new patch/minor version later this summer does sound good.

Maybe something else to include:

entry
 : keysetExpression ':' actionRef optAnnotations ';'

in section 12.2.1.4. versus the P4 grammar:

entry
    : optAnnotations keysetExpression ':' actionRef ';'
SmartHoneybee commented 7 years ago

entryList in entriesList does not exists. (I guess this is supposed to be entriesList as well.)

By the way: fell free to close if you think all issues addressed here are fixed.

Thanks for your work.

jafingerhut commented 7 years ago

If someone felt inclined to do so, it would be nice if there were a program that extracted all of the fragments of the grammar within P4Grammar begin / P4Grammar end lines in the P4-16-spec.mdk, and compared it to the contents of grammar.mdk, reporting any differences. Such a program would need to be able to allow the grammar rules to be in different orders, and still report no problems if the same set of productions are in both places.

cc10512 commented 7 years ago

Or make it work with the file fragments include in madoko.

mihaibudiu commented 7 years ago

It looks to me like @SmartHoneybee is doing exactly this, by hand. The golden standard for the grammar is really the compiler implementation. Unfortunately the corresponding file in the compiler source tree frontends/parsers/p4/p4parser.ypp is much more complicated due to semantic actions, and it also has some additional experimental features currently not part of the official spec.

jafingerhut commented 7 years ago

@mbudiu-vmw I agree that a simple tool to auto-compare the compiler's grammar and grammar.mdk would not be easy -- I did a tkdiff / Emacs ediff between the compiler version and grammar.mdk and found the experimental things you refer to in the compiler, but it does make ignoring the C/C++ code in the Bison rules easier to ignore visually.

@cc10512 I didn't know about Madoko's file fragments feature. Looks useful for something like this. Of course it would be more robust to use the BEGIN/END 'named' fragment variant of that feature, since line numbers would be too easy to break them all when grammar.mdk changes.

cc10512 commented 7 years ago

It does have a BEGIN/END feature to mark sections. Unfortunately they will show up in the overall grammar.mdk, which you don't want. Also because of the typesetting as code, it is not that easy. Ideally, one would have a grammar_markup.mdk that defines the fragments which get included and is a base to generate a clean version of grammar.mdk.

ChrisDodd commented 7 years ago

Fragments can be nested, so you would have a WholeGrammar "fragment" that you use when you want the whole grammar with all the fragment markers stripped out.

On Fri, Jun 23, 2017 at 11:44 AM, Calin Cascaval notifications@github.com wrote:

It does have a BEGIN/END feature to mark sections. Unfortunately they will show up in the overall grammar.mdk, which you don't want. Also because of the typesetting as code, it is not that easy. Ideally, one would have a grammar_markup.mdk that defines the fragments which get included and is a base to generate a clean version of grammar.mdk.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/p4lang/p4-spec/issues/286#issuecomment-310743907, or mute the thread https://github.com/notifications/unsubscribe-auth/AD4c8Sd3uIlY-1jm5MiywfJUAJFCNjMQks5sHAemgaJpZM4N1NhE .

SmartHoneybee commented 7 years ago

Yes, these errors where found 'by hand' (that is while reading the specification).

The statement that p4c (That is the compiler you, @mbudiu-vmw, are referring to, right?) defines the standard and not the specification unsettles me. Why publish something called language specification if what really defines the language is still considered 'alpha-quality'? </bike-shedding>

Since I think that my difficulties are caused by differences between P4-16-spec.mdk and grammar.mdk (please do correct me if you disagree). Here a first attempt to find such differences:

#!/usr/bin/env python3
"""
Comparing `P4-16-spec.mdk` with `grammar.mdk` rule by rule.
"""

import re
import difflib

SPECIFICATION = 'P4-16-spec.mdk'
GRAMMAR = 'grammar.mdk'

def main():
    """ Loops over rules in each P4Grammar section. """
    grammar_end = re.compile('^~ End P4Grammar$')
    grammar_start = re.compile('^~ Begin P4Grammar$')
    grammar_new_rule = re.compile('^[A-z0-9]+$')
    is_grammar = False
    p4_gram = list()
    p4_spec = list()
    p4_issues = list()
    with open(SPECIFICATION, 'r') as specification, open(GRAMMAR, 'r') as grammar:
        p4_gram = grammar.readlines()
        p4_gram = p4_gram[1:len(p4_gram)-1]
        for line in specification:
            if is_grammar:
                if grammar_end.match(line):
                    is_grammar = False
                    p4_issues.append(compare_first_definition(p4_spec, p4_gram))
                    continue
                elif grammar_new_rule.match(line) and len(p4_spec) > 1:
                    p4_issues.append(compare_first_definition(p4_spec, p4_gram))
                    p4_spec = list()
                p4_spec.append(line)
            else:
                if grammar_start.match(line):
                    is_grammar = True
                    continue
        print(''.join(map(''.join, filter(None, p4_issues))))

def compare_first_definition(specification, grammar):
    """ Compute diff for current rule. """
    try:
        rule_name = specification[0]
        rule_line = grammar.index(rule_name)
        rule_end = rule_line + len(specification)
        return list(difflib.context_diff(specification, grammar[rule_line:rule_end]))
    except ValueError:
        raise ValueError('Unable to find rule ' + rule_name + ' in grammar.')

if __name__ == '__main__':
    main()

Please note that the paths for the grammar files are hard coded.

Thank you very much for looking into this in such detail, I did not expect that.

jnfoster commented 7 years ago

The statement that p4c https://github.com/p4lang/p4c (That is the compiler you, @mbudiu-vmw https://github.com/mbudiu-vmw, are referring to, right?) defines the standard and not the specification unsettles me. Why publish something called language specification if what really defines the language is still considered 'alpha-quality'?

Don't fret: p4c aspires to be a reference implementation, but where there are a discrepancies, the spec is definitive.

Thanks for the script!

-N

SmartHoneybee commented 7 years ago

Hello,

please consider updating

returnStatement
    : RETURN ';'
    ;

to (the BNF equivalent of) the version used by p4c:

returnStatement
    : RETURN ';'
    | RETURN expression ';'
    ;

Thanks.

mihaibudiu commented 7 years ago

This is an experimental feature in the compiler. Maybe this will appear in a future version of the spec.

jnfoster commented 6 years ago

I just checked that all of these issues have been repaired, except for the last one which is an experimental feature. Closing this now.