0x6563 / grammar-well

Grammar Well is a cross-platform compiler, parser, and/or interpreter written in TypeScript.
GNU Lesser General Public License v3.0

Grammar railroad diagram #5

Open mingodad opened 3 weeks ago

mingodad commented 3 weeks ago

It would be nice if this tool could create an EBNF accepted by (IPV6) https://www.bottlecaps.de/rr/ui or (IPV4) https://rr.red-dove.com/ui to generate grammar railroad diagrams (see the example below).

You could also add more complex grammars to show the capabilities of this tool, like in https://mingodad.github.io/parsertl-playground/playground/.

//
// EBNF to be viewed at
//    (IPV6) https://www.bottlecaps.de/rr/ui
//    (IPV4) https://rr.red-dove.com/ui
//
// Copy and paste this at one of the urls shown above in the 'Edit Grammar' tab
// then click the 'View Diagram' tab.
//
    main ::=
        _ section_list _ 

    section_list ::=
        section 
        | section T_WS section_list 

    section ::=
        K_CONFIG _ L_COLON _ L_TEMPLATEL _ kv_list _ L_TEMPLATER 
        | K_IMPORT _ L_STAR _ K_FROM __ T_WORD _ L_SCOLON 
        | K_IMPORT _ L_STAR _ K_FROM __ T_STRING _ L_SCOLON 
        | K_LEXER _ L_COLON _ L_TEMPLATEL _ lexer _ L_TEMPLATER 
        | K_GRAMMAR _ L_COLON _ L_TEMPLATEL _ grammar _ L_TEMPLATER 
        | K_BODY _ L_COLON _ T_JS
        | K_BODY _ L_COLON _ T_STRING
        | K_HEAD _ L_COLON _ T_JS
        | K_HEAD _ L_COLON _ T_STRING

    lexer ::=
        kv_list _ state_list 
        | state_list 

    state_list ::=
        state 
        | state _ state_list 

    state ::=
        state_declare _ state_definition 

    state_declare ::=
        T_WORD _ L_ARROW 

    state_definition ::=
        kv_list _ token_list 
        | token_list 

    token_list ::=
        token 
        | token _ token_list 

    token ::=
        L_DASH _ K_IMPORT _ L_COLON _ word_list 
        | L_DASH _ token_definition_list 

    token_definition_list ::=
        token_definition 
        | token_definition _ token_definition_list 

    token_definition ::=
        K_TAG _ L_COLON _ string_list 
        | K_WHEN _ L_COLON _ T_STRING 
        | K_WHEN _ L_COLON _ T_REGEX 
        | K_POP 
        | K_POP _ L_COLON _ T_INTEGER 
        | K_POP _ L_COLON _ K_ALL 
        | K_HIGHLIGHT _ L_COLON _ T_STRING 
        | K_INSET 
        | K_INSET _ L_COLON _ T_INTEGER 
        | K_SET _ L_COLON _ T_WORD 
        | K_GOTO _ L_COLON _ T_WORD 
        | K_TYPE _ L_COLON _ T_STRING 

    grammar ::=
        kv_list _ grammar_rule_list 
        | grammar_rule_list 

    grammar_rule_list ::=
        grammar_rule 
        | grammar_rule _ grammar_rule_list 

    grammar_rule ::=
        T_WORD _ L_ARROW _ expression_list 
        | T_WORD __ L_COLON _ T_JS _ L_ARROW _ expression_list
        | T_WORD __ L_COLON _ T_GRAMMAR_TEMPLATE _ L_ARROW _ expression_list

    expression_list ::=
        expression
        | expression_list _ L_PIPE _ expression 

    expression ::=
        expression_symbol_list 
        | expression_symbol_list __ L_COLON _ T_JS
        | expression_symbol_list __ L_COLON _ T_GRAMMAR_TEMPLATE

    expression_symbol_list ::=
        expression_symbol
        | expression_symbol_list T_WS expression_symbol 

    expression_symbol ::=
        expression_symbol_match 
        | expression_symbol_match L_COLON T_WORD 
        | expression_symbol_match expression_repeater 
        | expression_symbol_match expression_repeater L_COLON T_WORD 

    expression_symbol_match ::=
        T_WORD 
        | T_STRING "i"? 
        | L_DSIGN T_WORD 
        | L_DSIGN T_STRING 
        | T_REGEX 
        | L_PARENL _ expression_list _ L_PARENR 
        | T_JS 

    expression_repeater ::=
        L_QMARK
        | L_PLUS
        | L_STAR

    kv_list ::=
        kv 
        | kv _ kv_list 

    kv ::=
        T_WORD _ L_COLON _ ( T_WORD| T_STRING| T_INTEGER | T_JS | T_GRAMMAR_TEMPLATE) 

    string_list ::=
        T_STRING 
        | T_STRING _ L_COMMA _ string_list 

    word_list ::=
        T_WORD 
        | T_WORD _ L_COMMA _ word_list 

    _ ::=
        ( T_WS | T_COMMENT )* 

    __ ::=
        ( T_WS | T_COMMENT )+ 

    L_COLON ::= ":"
    L_SCOLON ::= ";"
    L_QMARK ::= "?"
    L_PLUS ::= "+"
    L_STAR ::= "*"
    L_COMMA ::= ","
    L_PIPE ::= "|"
    L_PARENL ::= "("
    L_PARENR ::= ")"
    L_TEMPLATEL ::= "{{"
    L_TEMPLATER ::= "}}"
    L_ARROW ::= "->"
    L_DSIGN ::= "$"
    L_DASH ::= "-"

    K_ALL ::= "all"
    K_TAG ::= "tag"
    K_FROM ::= "from"
    K_TYPE ::= "type"
    K_WHEN ::= "when"
    K_POP ::= "pop"
    K_HIGHLIGHT ::= "highlight"
    K_INSET ::= "inset"
    K_SET ::= "set"
    K_GOTO ::= "goto"
    K_CONFIG ::= "config"
    K_LEXER ::= "lexer"
    K_GRAMMAR ::= "grammar"
    K_IMPORT ::= "import"
    K_BODY ::= "body"
    K_HEAD ::= "head"

0x6563 commented 2 weeks ago

Hi,

Generating different EBNFs is definitely a goal. Part of what I have been working on in V2 is being able to format the grammar output, but the current implementation relies on the AST, which does not include comments (see: https://github.com/0x6563/grammar-well/blob/v2/src/generator/stringify/grammar/v2.ts). Once I adjust the AST to be more akin to a CST, I can start writing exports for other EBNFs. I am not sure yet whether that will live in this repo or over at https://grammar-well.xyz/tools/migration
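
To make the AST versus CST point concrete, here is a minimal sketch of an exporter that keeps comments attached to rules and writes them out as W3C-style `/* ... */` comments. The `RuleNode` shape and the `toW3cEbnf` function are hypothetical names made up for illustration; they are not Grammar Well's actual node types or API, and the sketch assumes a comment never contains `*/`.

```typescript
// Hypothetical CST-like rule node; NOT Grammar Well's real data structure.
interface RuleNode {
  name: string;
  // Each alternative is a sequence of symbol names.
  alternatives: string[][];
  // Trivia that a plain AST would normally drop.
  leadingComments: string[];
}

// Emit one "name ::= alt | alt" block per rule, preceded by its comments.
function toW3cEbnf(rules: RuleNode[]): string {
  return rules
    .map((rule) => {
      const comments = rule.leadingComments.map((c) => `/* ${c} */`);
      const body = rule.alternatives
        .map((sequence) => sequence.join(" "))
        .join("\n    | ");
      return [...comments, `${rule.name} ::=\n    ${body}`].join("\n");
    })
    .join("\n\n");
}

// Usage: one rule with a comment that survives the export.
const sample: RuleNode[] = [
  {
    name: "kv_list",
    alternatives: [["kv"], ["kv", "_", "kv_list"]],
    leadingComments: ["key/value pairs"],
  },
];
console.log(toW3cEbnf(sample));
```

The only point of the sketch is that comments have to live somewhere on the node for a formatter or exporter to reproduce them.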

As for generating an EBNF for the suggested tool, I have some concerns about interoperability. Specifically, the suggested tool supports character classes, while Grammar-Well supports full regular expressions for non-terminals as well as token classes. While I can do a "best effort" translation and format unsupported expressions as literals, that defeats the purpose of railroad diagrams.

But I do plan on implementing railroad diagrams and other tools over at the documentation site. So if nothing else, I'll write the tooling myself.

As for the playground, I do have something similar at https://0x6563.github.io/grammar-well-editor and at https://grammar-well.xyz/tools/editor, which is where I have been migrating everything. Currently it's missing the AST diagram tool, but the parsing functionality is there. Is there some functionality you have in mind that you think could highlight the capabilities, or is it just a matter of adding a larger collection of samples?
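
A rough sketch of what such a "best effort" translation could look like: plain literals are quoted, regexes that are just a simple bracket class (optionally followed by `?`, `*` or `+`) are passed through, and everything else falls back to a quoted literal of the regex source. The `Terminal` type and both function names are assumptions made up for this example, not Grammar Well's API, and the class detection is deliberately naive.

```typescript
// Hypothetical terminal shape for illustration; not Grammar Well's actual AST.
type Terminal =
  | { kind: "literal"; value: string }
  | { kind: "regex"; source: string };

// W3C-style EBNF strings have no escape mechanism, so pick a quote
// character the text does not contain, or split around double quotes.
function quoteLiteral(text: string): string {
  if (!text.includes('"')) return `"${text}"`;
  if (!text.includes("'")) return `'${text}'`;
  const pieces: string[] = [];
  text.split('"').forEach((part, i) => {
    if (i > 0) pieces.push(`'"'`);
    if (part !== "") pieces.push(`"${part}"`);
  });
  return pieces.join(" ");
}

// A single bracket class without escapes or nested brackets,
// optionally followed by one repeater.
const SIMPLE_CLASS = /^\[\^?[^\[\]\\]+\][?*+]?$/;

function toEbnfTerminal(t: Terminal): string {
  if (t.kind === "literal") return quoteLiteral(t.value);
  if (SIMPLE_CLASS.test(t.source)) return t.source; // usable as-is
  return quoteLiteral(t.source); // fallback: shown as an opaque literal
}

console.log(toEbnfTerminal({ kind: "regex", source: "[a-z]+" }));
console.log(toEbnfTerminal({ kind: "regex", source: "\\d+(\\.\\d+)?" }));
```

The fallback branch is exactly the concern raised above: the diagram would show the regex text as one opaque box rather than its structure.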


mingodad commented 2 weeks ago

Thank you for the reply! Looking at your playground, I found that you do not show line/column on the JSON/AST. If they were there, it would be nice to click on an AST node and have the cursor jump to the corresponding place in the sample source (and why not in the grammar too).
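
For illustration, the two conversions such a click-to-jump feature would need can be sketched as below: from a character offset to a 1-based line/column (to display on the node) and back (to place the editor cursor). The `Position` type and function names are hypothetical and not taken from the playground's code; only `\n` line breaks are handled.

```typescript
// 1-based line and column, as most editors display them (assumed convention).
interface Position {
  line: number;
  column: number;
}

// Convert a 0-based character offset into a line/column pair.
function offsetToPosition(source: string, offset: number): Position {
  const clamped = Math.min(Math.max(offset, 0), source.length);
  let line = 1;
  let lineStart = 0;
  for (let i = 0; i < clamped; i++) {
    if (source[i] === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, column: clamped - lineStart + 1 };
}

// Convert a line/column pair back into a 0-based character offset.
function positionToOffset(source: string, pos: Position): number {
  const lines = source.split("\n");
  let offset = 0;
  for (let i = 0; i < pos.line - 1 && i < lines.length; i++) {
    offset += lines[i].length + 1; // +1 for the "\n" that split() removed
  }
  return offset + pos.column - 1;
}

// Usage: the "c" in the second line sits at offset 3.
const text = "ab\ncd";
console.log(offsetToPosition(text, 3));
console.log(positionToOffset(text, { line: 2, column: 1 }));
```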

mingodad commented 2 weeks ago

And for the samples, it would be nice to have grammars for the main parser generators (bison/yacc, ANTLR, PEG, Lezer, tree-sitter, ...).