katef / libfsm

DFA regular expression library & friends
BSD 2-Clause "Simplified" License

Grammar railroad diagram #460

Open mingodad opened 5 months ago

mingodad commented 5 months ago

I've just added the lx grammar to https://mingodad.github.io/parsertl-playground/playground/, a Yacc/Lex compatible editor/tester (select Libfsm lx parser from Examples, then click Parse to see the parse tree for the content in Input source).

I've also created an EBNF understood by https://rr.red-dove.com/ui to generate a nice navigable railroad diagram (copy and paste the EBNF shown below at https://rr.red-dove.com/ui in the Edit grammar tab, then switch to the View Diagram tab).

unit ::=
    command-list eof

command-list ::=
    /*empty*/
    | command command-list

command ::=
    semicolon
    | white-defn semicolon
    | group-defn semicolon
    | trigraph-defn semicolon
    | token-defn semicolon
    | keyword-defn semicolon
    | default-defn semicolon
    | type-defn semicolon
    | action-decl semicolon
    | open-brace command-list close-brace
    | zone-defn

white-defn ::=
    kw-group white equal string-plus

string-plus ::=
    (string
        | upper
        | lower
        | digit) (plus string-plus)?

group-defn ::=
    kw-group ident equal string-plus

trigraph-defn ::=
    kw-mapping string-plus arrow string-plus

token-defn ::=
    kw-token string-plus arrow cmd-list

cmd-list ::=
    cmd
    | cmd-list comma cmd

cmd ::=
    sid-ident
    | discard
    | action-call

action-call ::=
    (/*empty*/
        | lhs-tuple equal) begin-action ident end-action rhs-tuple

lhs-tuple ::=
    (arg-char-string
        | arg-char-num
        | arg-char-count
        | string
        | ident
        | ref ident
        | sid-ident
        | arg-return
        | arg-ignore)
    | open lhs-tuple1 close

lhs-tuple1 ::=
    (arg-char-string
        | arg-char-num
        | arg-char-count
        | string
        | ident
        | ref ident
        | sid-ident
        | arg-return
        | arg-ignore)
    | (arg-char-string
        | arg-char-num
        | arg-char-count
        | string
        | ident
        | ref ident
        | sid-ident
        | arg-return
        | arg-ignore) comma lhs-tuple1

rhs-tuple ::=
    open rhs-tuple1 close

rhs-tuple1 ::=
    rhs-arg
    | rhs-arg comma rhs-tuple1

rhs-arg ::=
    arg-char-string
    | arg-char-num
    | arg-char-count
    | sid-ident
    | arg-return
    | arg-ignore
    | string
    | ref ident
    | ident

keyword-defn ::=
    kw-keyword string arrow cmd

default-defn ::=
    kw-token default arrow cmd-list

type-defn ::=
    kw-type ident

action-decl ::=
    kw-action begin-action ident end-action (colon open param-list1 close arrow open param-list1 close)?

param-list1 ::=
    param
    | param comma param-list1

param ::=
    ident colon ident ref

zone-defn ::=
    kw-zone ident colon string-plus (arrow cmd-list)? (range
        | range-closed-closed
        | range-closed-open) string-plus (arrow cmd-list)? open-brace command-list close-brace

// Tokens

kw-type ::= "TYPE"
kw-group ::= "GROUP"
kw-action ::= "ACTION"
kw-keyword ::= "KEYWORD"
kw-zone ::= "ZONE"
kw-mapping ::= "MAPPING"
kw-token ::= "TOKEN"

default ::= "DEFAULT"
white ::= "white"

open ::= "("
close ::= ")"
begin-action ::= "<"
end-action ::= ">"
open-brace ::= "{"
close-brace ::= "}"
arrow ::= "->"
colon ::= ":"
ref ::= "&"
semicolon ::= ";"
equal ::= "="
plus ::= "+"
discard ::= "$$"
comma ::= ","
eof ::= "\e"

range ::= "..."
range-closed-closed ::= "[...]"
range-closed-open ::= "[...)"

arg-char-count ::= "#n"
arg-char-string ::= "#*"
arg-return ::= "$"
arg-ignore ::= "!"

upper ::= "{A-Z}"
lower ::= "{a-z}"
digit ::= "{0-9}"
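
Before pasting an EBNF like the one above into a diagram tool, it can help to check which names are referenced but never defined (in the grammar as posted, names such as `string` and `ident` appear only on right-hand sides). The following is a minimal Python sketch of such a check, not part of libfsm or the railroad tool. The function name `undefined_symbols` and the `SAMPLE` excerpt are hypothetical, and the sketch assumes rule names are made of letters, digits, `_` and `-`, literals are double-quoted, and comments are `/* ... */` or whole-line `//`.

```python
import re

# Hypothetical, shortened excerpt in the same notation as the grammar above.
SAMPLE = '''
command-list ::=
    /*empty*/
    | type-defn semicolon command-list

type-defn ::=
    kw-type ident

// Tokens

kw-type ::= "TYPE"
semicolon ::= ";"
'''

NAME = r"[A-Za-z][A-Za-z0-9_-]*"


def undefined_symbols(ebnf: str) -> set[str]:
    """Return names used in rule bodies that have no `name ::=` definition."""
    # Drop quoted literals first so their contents are not mistaken for names.
    text = re.sub(r'"[^"]*"', "", ebnf)
    # Drop /* ... */ comments and whole-line // comments.
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    text = re.sub(r"^\s*//.*$", "", text, flags=re.M)
    # A definition is a name at the start of a line followed by `::=`.
    defined = set(re.findall(rf"^\s*({NAME})\s*::=", text, flags=re.M))
    # Every remaining name-like token is either a definition or a reference.
    used = set(re.findall(NAME, text))
    return used - defined


if __name__ == "__main__":
    print(sorted(undefined_symbols(SAMPLE)))
```

An undefined name is not necessarily a mistake: lexical tokens such as identifiers and string literals are often left to the lexer. The check only makes those gaps visible.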

katef commented 4 months ago

Oh! Thank you!

This isn't actually lx's grammar (see /src/lx/lexer.lx in this repo); this is Lexi's grammar (see /lexi/src/lexer.lxi in the tendra repo). So this actually belongs to @tendra, not libfsm.

Do you know how these got confused? Maybe we can untangle that!