urweb / urweb

The Ur/Web programming language
http://www.impredicative.com/ur/

Grammar railroad diagram #248

Open mingodad opened 2 years ago

mingodad commented 2 years ago

With a few Lua-like string pattern replacements, plus the tokens from the lexer added manually, we can obtain an EBNF that https://www.bottlecaps.de/rr/ui understands and get a nice railroad diagram (https://en.wikipedia.org/wiki/Syntax_diagram).

Copy and paste the EBNF shown below into https://www.bottlecaps.de/rr/ui on the "Edit Grammar" tab, then switch to the "View Diagram" tab.


file   ::= decls
       | SIG sgis

decls  ::=
       | decl decls

decl   ::= CON SYMBOL cargl2 kopt EQ cexp
       | LTYPE SYMBOL cargl2 EQ cexp
       | DATATYPE dtypes
       | DATATYPE SYMBOL dargs EQ DATATYPE CSYMBOL DOT path
       | VAL pat eargl2 copt EQ eexp
       | VAL REC valis
       | FUN valis

       | SIGNATURE CSYMBOL EQ sgn
       | STRUCTURE CSYMBOL EQ str
       | STRUCTURE CSYMBOL COLON sgn EQ str
       | FUNCTOR CSYMBOL LPAREN CSYMBOL COLON sgn RPAREN EQ str
       | FUNCTOR CSYMBOL LPAREN CSYMBOL COLON sgn RPAREN COLON sgn EQ str
       | OPEN mpath
       | OPEN mpath LPAREN str RPAREN
       | OPEN CONSTRAINTS mpath
       | CONSTRAINT cterm TWIDDLE cterm
       | EXPORT spath
       | TABLE SYMBOL COLON cterm pkopt commaOpt cstopt
       | INDEX eterm COLON eterm
       | INDEX eterm COLON eterm IN cterm
       | SEQUENCE SYMBOL
       | VIEW SYMBOL EQ query
       | VIEW SYMBOL EQ LBRACE eexp RBRACE
       | COOKIE SYMBOL COLON cexp
       | STYLE SYMBOL
       | TASK eapps EQ eexp
       | POLICY eexp
       | FFI SYMBOL ffi_modes COLON cexp

dtype  ::= SYMBOL dargs EQ barOpt dcons

dtypes ::= dtype
       | dtype AND dtypes

kopt   ::=
       | DCOLON kind
       | DCOLONWILD

dargs  ::=
       | SYMBOL dargs

barOpt ::=
       | BAR

dcons  ::= dcon
       | dcon BAR dcons

dcon   ::= CSYMBOL
       | CSYMBOL OF cexp

vali   ::= SYMBOL eargl2 copt EQ eexp

copt   ::=
       | COLON cexp

cstopt ::=
       | csts

csts   ::= CCONSTRAINT tname cst
       | csts COMMA csts
       | LBRACE LBRACE eexp RBRACE RBRACE

cst    ::= UNIQUE tnames

       | CHECK sqlexp

       | FOREIGN KEY tnames REFERENCES texp LPAREN tnames2 RPAREN pmodes

       | LBRACE eexp RBRACE

tnameW ::= tname

tnames ::= tnameW
       | LPAREN tnames2 RPAREN

tnames2::= tnameW
       | tnameW COMMA tnames2

pmode  ::= ON pkind prule

pkind  ::= DELETE
       | UPDATE

prule  ::= NO ACTION
       | RESTRICT
       | CASCADE
       | SET NULL

pmodes ::=
       | pmode pmodes

commaOpt::=
        | COMMA

pk     ::= LBRACE LBRACE eexp RBRACE RBRACE
       | tnames

pkopt  ::=
       | PRIMARY KEY pk

valis  ::= vali
       | vali AND valis

sgn    ::= sgntm
       | FUNCTOR LPAREN CSYMBOL COLON sgn RPAREN COLON sgn

sgntm  ::= SIG sgis END
       | mpath
       | sgntm WHERE CON path EQ cexp
       | sgntm WHERE LTYPE path EQ cexp
       | LPAREN sgn RPAREN

cexpO  ::=
       | EQ cexp

sgi    ::= LTYPE SYMBOL
       | CON SYMBOL cargl2 kopt cexpO
       | LTYPE SYMBOL cargl2 cexpO
       | DATATYPE dtypes
       | DATATYPE SYMBOL dargs EQ DATATYPE CSYMBOL DOT path
       | VAL SYMBOL COLON cexp

       | STRUCTURE CSYMBOL COLON sgn
       | SIGNATURE CSYMBOL EQ sgn
       | FUNCTOR CSYMBOL LPAREN CSYMBOL COLON sgn RPAREN COLON sgn
       | INCLUDE sgn
       | CONSTRAINT cterm TWIDDLE cterm
       | TABLE SYMBOL COLON cterm pkopt commaOpt cstopt
       | SEQUENCE SYMBOL
       | VIEW SYMBOL COLON cexp
       | CLASS SYMBOL
       | CLASS SYMBOL DCOLON kind
       | CLASS SYMBOL EQ cexp
       | CLASS SYMBOL DCOLON kind EQ cexp
       | CLASS SYMBOL SYMBOL EQ cexp
       | CLASS SYMBOL LPAREN SYMBOL DCOLON kind RPAREN EQ cexp
       | COOKIE SYMBOL COLON cexp
       | STYLE SYMBOL

sgis   ::=
       | sgi sgis

str    ::= STRUCT decls END
       | spath
       | FUNCTOR LPAREN CSYMBOL COLON sgn RPAREN DARROW str
       | FUNCTOR LPAREN CSYMBOL COLON sgn RPAREN COLON sgn DARROW str
       | spath LPAREN str RPAREN

spath  ::= CSYMBOL
       | spath DOT CSYMBOL

kind   ::= LBRACE kind RBRACE
       | kind ARROW kind
       | LPAREN kind RPAREN
       | UNDERUNDER
       | LPAREN ktuple RPAREN
       | CSYMBOL
       | CSYMBOL KARROW kind

ktuple ::= kind STAR kind
       | kind STAR ktuple

capps  ::= cterm
       | capps cterm

cexp   ::= capps
       | cexp ARROW cexp
       | SYMBOL kcolon kind ARROW cexp
       | CSYMBOL KARROW cexp

       | cexp PLUSPLUS cexp

       | FN cargs DARROW cexp
       | LBRACK cexp TWIDDLE cexp RBRACK DARROW cexp
       | CSYMBOL DKARROW cexp

       | LPAREN cexp RPAREN DCOLON kind

       | UNDER DCOLON kind
       | ctuple

kcolon ::= DCOLON
       | TCOLON

cargs  ::= carg
       | cargl

cargl  ::= cargp cargp
       | cargp cargl

cargl2 ::=
       | cargp cargl2

carg   ::= SYMBOL DCOLON kind
       | UNDER DCOLON kind
       | SYMBOL DCOLONWILD
       | UNDER DCOLONWILD
       | cargp

cargp  ::= SYMBOL
       | UNDER
       | LPAREN SYMBOL kopt ckl RPAREN

ckl    ::=
       | COMMA SYMBOL kopt ckl

path   ::= SYMBOL
       | CSYMBOL DOT path

cpath  ::= CSYMBOL
       | CSYMBOL DOT cpath

mpath  ::= CSYMBOL
       | CSYMBOL DOT mpath

cterm  ::= LPAREN cexp RPAREN
       | LBRACK rcon RBRACK
       | LBRACK rconn RBRACK
       | LBRACE rcone RBRACE
       | DOLLAR cterm
       | HASH CSYMBOL
       | HASH INT

       | path
       | path DOT INT
       | UNDER
       | MAP
       | UNIT
       | LPAREN ctuplev RPAREN

ctuplev::= cexp COMMA cexp
       | cexp COMMA ctuplev

ctuple ::= capps STAR capps
       | capps STAR ctuple

rcon   ::=
       | rpath EQ cexp
       | rpath EQ cexp COMMA rcon

rconn  ::= rpath
       | rpath COMMA rconn

rcone  ::=
       | rpath COLON cexp
       | rpath COLON cexp COMMA rcone

ident  ::= CSYMBOL
       | INT
       | SYMBOL

eapps  ::= eterm
       | eapps eterm
       | eapps LBRACK cexp RBRACK
       | eapps BANG

eexp   ::= eapps
       | FN eargs DARROW eexp
       | CSYMBOL DKARROW eexp
       | eexp COLON cexp
       | eexp MINUSMINUS cexp
       | eexp MINUSMINUSMINUS cexp
       | CASE eexp OF barOpt branch branchs
       | IF eexp THEN eexp ELSE eexp
       | bind SEMI eexp
       | eexp EQ eexp
       | eexp NE eexp
       | MINUS eterm
       | eexp PLUS eexp
       | eexp MINUS eexp
       | eapps STAR eexp
       | eexp DIVIDE eexp
       | eexp MOD eexp

       | eexp LT eexp
       | eexp LE eexp
       | eexp GT eexp
       | eexp GE eexp

       | eexp FWDAPP eexp
       | eexp REVAPP eexp
       | eexp COMPOSE eexp
       | eexp ANDTHEN eexp
       | eexp BACKTICK_PATH eexp

       | eexp ANDALSO eexp
       | eexp ORELSE eexp

       | eexp PLUSPLUS eexp

       | eexp CARET eexp

       | eapps DCOLON eexp

bind   ::= eapps LARROW eapps
       | eapps

eargs  ::= earg
       | eargl

eargl  ::= eargp eargp
       | eargp eargl

eargl2 ::=
       | eargp eargl2

earg   ::= patS
       | earga

eargp  ::= pterm
       | earga

earga  ::= LBRACK SYMBOL RBRACK
       | LBRACK SYMBOL DCOLONWILD RBRACK
       | LBRACK SYMBOL kcolon kind RBRACK
       | LBRACK SYMBOL TCOLONWILD RBRACK
       | LBRACK cexp TWIDDLE cexp RBRACK
       | LBRACK CSYMBOL RBRACK

eterm  ::= LPAREN eexp RPAREN
       | LPAREN etuple RPAREN

       | path
       | cpath
       | AT path
       | AT AT path
       | AT cpath
       | AT AT cpath
       | LBRACE rexp RBRACE
       | LBRACE RBRACE
       | UNIT

       | INT
       | FLOAT
       | STRING
       | CHAR

       | path DOT idents
       | LPAREN eexp RPAREN DOT idents
       | AT path DOT idents
       | AT AT path DOT idents

       | XML_BEGIN xml XML_END
       | XML_BEGIN XML_END
       | XML_BEGIN_END

       | LPAREN query RPAREN
       | LPAREN CWHERE sqlexp RPAREN
       | LPAREN SQL sqlexp RPAREN
       | LPAREN FROM tables RPAREN
       | LPAREN SELECT1 query1 RPAREN

       | LPAREN INSERT INTO texp LPAREN fields RPAREN VALUES LPAREN sqlexps RPAREN RPAREN
       | LPAREN enterDml UPDATE texp SET fsets CWHERE sqlexp leaveDml RPAREN
       | LPAREN enterDml DELETE FROM texp CWHERE sqlexp leaveDml RPAREN

       | UNDER

       | LET edecls IN eexp END
       | LET eexp WHERE edecls END

       | LBRACK RBRACK

edecls ::=
       | edecl edecls

edecl  ::= VAL pat EQ eexp
       | VAL REC valis
       | FUN valis

enterDml ::=
leaveDml ::=

texp   ::= SYMBOL
       | LBRACE LBRACE eexp RBRACE RBRACE

fields ::= fident
       | fident COMMA fields

sqlexps::= sqlexp
       | sqlexp COMMA sqlexps

fsets  ::= fident EQ sqlexp
       | fident EQ sqlexp COMMA fsets

idents ::= ident
       | ident DOT idents

etuple ::= eexp COMMA eexp
       | eexp COMMA etuple

branch ::= pat DARROW eexp

branchs::=
       | BAR branch branchs

patS   ::= pterm
       | pterm DCOLON patS
       | patS COLON cexp

pat    ::= patS
       | cpath pterm

pterm  ::= SYMBOL
       | cpath
       | UNDER
       | INT
       | MINUS INT
       | STRING
       | CHAR
       | LPAREN pat RPAREN
       | LBRACE RBRACE
       | UNIT
       | LBRACE rpat RBRACE
       | LPAREN ptuple RPAREN
       | LBRACK RBRACK

rpat   ::= CSYMBOL EQ pat
       | INT EQ pat
       | DOTDOTDOT
       | CSYMBOL EQ pat COMMA rpat
       | INT EQ pat COMMA rpat

ptuple ::= pat COMMA pat
       | pat COMMA ptuple

rexp   ::= DOTDOTDOT
       | rpath EQ eexp
       | rpath EQ eexp COMMA rexp

rpath  ::= path
       | CSYMBOL

xml    ::= xmlOne xml
       | xmlOne

xmlOpt ::= xml
       |

xmlOne ::= NOTAGS
       | tag DIVIDE GT

       | tag GT xmlOpt END_TAG
       | LBRACE eexp RBRACE
       | LBRACE LBRACK eexp RBRACK RBRACE

tag    ::= tagHead attrs

tagHead::= BEGIN_TAG
       | tagHead LBRACE cexp RBRACE

attrs  ::=
       | attr attrs

attr   ::= SYMBOL EQ attrv
       | SYMBOL

attrv  ::= INT
       | FLOAT
       | STRING
       | LBRACE eexp RBRACE

query  ::= query1 obopt lopt ofopt

dopt   ::=
       | DISTINCT

query1 ::= SELECT dopt select FROM tables wopt gopt hopt
       | query1 UNION query1
       | query1 INTERSECT query1
       | query1 EXCEPT query1
       | query1 UNION ALL query1
       | query1 INTERSECT ALL query1
       | query1 EXCEPT ALL query1
       | LBRACE LBRACE LBRACE eexp RBRACE RBRACE RBRACE

tables ::= fitem
       | fitem COMMA tables

fitem  ::= table2
       | LBRACE LBRACE eexp RBRACE RBRACE
       | fitem JOIN fitem ON sqlexp
       | fitem INNER JOIN fitem ON sqlexp
       | fitem CROSS JOIN fitem
       | fitem LEFT JOIN fitem ON sqlexp
       | fitem LEFT OUTER JOIN fitem ON sqlexp
       | fitem RIGHT JOIN fitem ON sqlexp
       | fitem RIGHT OUTER JOIN fitem ON sqlexp
       | fitem FULL JOIN fitem ON sqlexp
       | fitem FULL OUTER JOIN fitem ON sqlexp
       | LPAREN query RPAREN AS tname
       | LPAREN LBRACE LBRACE eexp RBRACE RBRACE RPAREN AS tname
       | LPAREN fitem RPAREN

tname  ::= CSYMBOL
       | LBRACE cexp RBRACE

table  ::= SYMBOL
       | SYMBOL AS tname
       | LBRACE LBRACE eexp RBRACE RBRACE AS tname

table2 ::= table

tident ::= SYMBOL
       | CSYMBOL
       | LBRACE LBRACE cexp RBRACE RBRACE

fident ::= CSYMBOL
       | LBRACE cexp RBRACE

seli   ::= tident DOT fident
       | sqlexp
       | sqlexp AS fident
       | tident DOT LBRACE LBRACE cexp RBRACE RBRACE
       | tident DOT STAR

selis  ::= seli
       | seli COMMA selis

select ::= STAR
       | selis

sqlexp ::= TRUE
       | FALSE

       | INT
       | FLOAT
       | STRING
       | CURRENT_TIMESTAMP

       | tident DOT fident
       | CSYMBOL

       | LBRACE eexp RBRACE

       | sqlexp EQ sqlexp
       | sqlexp NE sqlexp
       | sqlexp LT sqlexp
       | sqlexp LE sqlexp
       | sqlexp GT sqlexp
       | sqlexp GE sqlexp

       | sqlexp PLUS sqlexp
       | sqlexp MINUS sqlexp
       | sqlexp STAR sqlexp
       | sqlexp DIVIDE sqlexp
       | sqlexp MOD sqlexp

       | sqlexp CAND sqlexp
       | sqlexp OR sqlexp

       | sqlexp LIKE sqlexp
       | sqlexp DISTANCE sqlexp

       | NOT sqlexp
       | MINUS sqlexp

       | sqlexp IS NULL

       | CIF sqlexp CTHEN sqlexp CELSE sqlexp

       | LBRACE LBRACK eexp RBRACK RBRACE
       | LPAREN sqlexp RPAREN

       | NULL

       | COUNT LPAREN STAR RPAREN window
       | COUNT LPAREN sqlexp RPAREN window
       | sqlagg LPAREN sqlexp RPAREN window
       | RANK UNIT window
       | COALESCE LPAREN sqlexp COMMA sqlexp RPAREN
       | fname LPAREN sqlexp RPAREN
       | fname LPAREN sqlexp COMMA sqlexp RPAREN
       | LPAREN query RPAREN

window ::=
       | OVER LPAREN pbopt obopt RPAREN

pbopt  ::=
       | PARTITION BY sqlexp

fname  ::= SYMBOL
       | LBRACE eexp RBRACE

wopt   ::=
       | CWHERE sqlexp

groupi ::= tident DOT fident
       | tident DOT LBRACE LBRACE cexp RBRACE RBRACE

groupis::= groupi
       | groupi COMMA groupis

gopt   ::=
       | GROUP BY groupis

hopt   ::=
       | HAVING sqlexp

obopt  ::=
       | ORDER BY obexps
       | ORDER BY LBRACE LBRACE LBRACE eexp RBRACE RBRACE RBRACE

obitem ::= sqlexp diropt

obexps ::= obitem
       | obitem COMMA obexps
       | RANDOM popt

popt   ::=
       | LPAREN RPAREN
       | UNIT

diropt ::=
       | ASC
       | DESC
       | LBRACE eexp RBRACE

lopt   ::=
       | LIMIT ALL
       | LIMIT sqlint

ofopt  ::=
       | OFFSET sqlint

sqlint ::= INT
       | LBRACE eexp RBRACE

sqlagg ::= AVG
       | SUM
       | MIN
       | MAX

ffi_mode ::= SYMBOL
         | SYMBOL STRING

ffi_modes ::=
          | ffi_mode ffi_modes

//Tokens
//<INITIAL> \("[^"]+"\)\s+=> (Tokens.\(\S+\).+

UNIT ::= "()"
LPAREN ::= "("
RPAREN ::= ")"
LBRACK ::= "["
RBRACK ::= "]"
LBRACE ::= "{"
RBRACE ::= "}"

KARROW ::= "-->"
ARROW ::= "->"
DKARROW ::= "==>"
DARROW ::= "=>"
PLUSPLUS ::= "++"
MINUSMINUS ::= "--"
MINUSMINUSMINUS ::= "---"
CARET ::= "^"

ANDALSO ::= "&&"
ORELSE ::= "||"

COMPOSE ::= "<<<"
ANDTHEN ::= ">>>"
FWDAPP ::= "<|"
REVAPP ::= "|>"

EQ ::= "="
NE ::= "<>"
LT ::= "<"
GT ::= ">"
LE ::= "<="
GE ::= ">="
COMMA ::= ","
TCOLONWILD ::= ":::_"
TCOLON ::= ":::"
DCOLONWILD ::= "::_"
DCOLON ::= "::"
COLON ::= ":"
DOTDOTDOT ::= "..."
DOT ::= "."
DOLLAR ::= "$"
HASH ::= "#"
UNDERUNDER ::= "__"
UNDER ::= "_"
TWIDDLE ::= "~"
BAR ::= "|"
STAR ::= "*"
LARROW ::= "<-"
SEMI ::= ";"
BANG ::= "!"

PLUS ::= "+"
MINUS ::= "-"
DIVIDE ::= "/"
MOD ::= "%"
AT ::= "@"

CON ::= "con"
LTYPE ::= "type"
DATATYPE ::= "datatype"
OF ::= "of"
VAL ::= "val"
REC ::= "rec"
AND ::= "and"
FUN ::= "fun"
FN ::= "fn"
MAP ::= "map"
CASE ::= "case"
IF ::= "if"
THEN ::= "then"
ELSE ::= "else"

STRUCTURE ::= "structure"
SIGNATURE ::= "signature"
STRUCT ::= "struct"
SIG ::= "sig"
LET ::= "let"
IN ::= "in"
END ::= "end"
FUNCTOR ::= "functor"
WHERE ::= "where"
INCLUDE ::= "include"
OPEN ::= "open"
CONSTRAINT ::= "constraint"
CONSTRAINTS ::= "constraints"
EXPORT ::= "export"
TABLE ::= "table"
SEQUENCE ::= "sequence"
VIEW ::= "view"
INDEX ::= "ensure_index"
CLASS ::= "class"
COOKIE ::= "cookie"
STYLE ::= "style"
TASK ::= "task"
POLICY ::= "policy"
FFI ::= "ffi"

SELECT ::= "SELECT"
DISTINCT ::= "DISTINCT"
FROM ::= "FROM"
AS ::= "AS"
CWHERE ::= "WHERE"
SQL ::= "SQL"
GROUP ::= "GROUP"
ORDER ::= "ORDER"
BY ::= "BY"
HAVING ::= "HAVING"
LIMIT ::= "LIMIT"
OFFSET ::= "OFFSET"
ALL ::= "ALL"
SELECT1 ::= "SELECT1"

JOIN ::= "JOIN"
INNER ::= "INNER"
CROSS ::= "CROSS"
OUTER ::= "OUTER"
LEFT ::= "LEFT"
RIGHT ::= "RIGHT"
FULL ::= "FULL"

UNION ::= "UNION"
INTERSECT ::= "INTERSECT"
EXCEPT ::= "EXCEPT"

TRUE ::= "TRUE"
FALSE ::= "FALSE"
CAND ::= "AND"
OR ::= "OR"
NOT ::= "NOT"

COUNT ::= "COUNT"
AVG ::= "AVG"
SUM ::= "SUM"
MIN ::= "MIN"
MAX ::= "MAX"
RANK ::= "RANK"
PARTITION ::= "PARTITION"
OVER ::= "OVER"

CIF ::= "IF"
CTHEN ::= "THEN"
CELSE ::= "ELSE"

ASC ::= "ASC"
DESC ::= "DESC"
RANDOM ::= "RANDOM"

INSERT ::= "INSERT"
INTO ::= "INTO"
VALUES ::= "VALUES"
UPDATE ::= "UPDATE"
SET ::= "SET"
DELETE ::= "DELETE"
NULL ::= "NULL"
IS ::= "IS"
COALESCE ::= "COALESCE"
LIKE ::= "LIKE"
DISTANCE ::= "<->"

CCONSTRAINT ::= "CONSTRAINT"
UNIQUE ::= "UNIQUE"
CHECK ::= "CHECK"
PRIMARY ::= "PRIMARY"
FOREIGN ::= "FOREIGN"
KEY ::= "KEY"
ON ::= "ON"
NO ::= "NO"
ACTION ::= "ACTION"
RESTRICT ::= "RESTRICT"
CASCADE ::= "CASCADE"
REFERENCES ::= "REFERENCES"

CURRENT_TIMESTAMP ::= "CURRENT_TIMESTAMP"
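
Since the tokens were added by hand, a quick check for names that are used but never given a `::=` line can help spot anything missed (in the grammar above, for example, SYMBOL and INT have no definition of their own). The following is only a rough Python sketch of such a check; the function name `undefined_symbols` and the sample fragment are made up for illustration and are not part of the original script.

```python
import re

def undefined_symbols(ebnf: str) -> set:
    """Return names used in the grammar text that have no 'name ::=' definition."""
    # Skip // comment lines and blank out quoted literals so that
    # their contents are not mistaken for grammar names.
    lines = [ln for ln in ebnf.splitlines() if not ln.lstrip().startswith("//")]
    text = re.sub(r'"[^"]*"', " ", "\n".join(lines))
    # A definition is a name at the start of a line followed by '::='.
    defined = set(re.findall(r"^\s*(\w+)\s*::=", text, flags=re.MULTILINE))
    used = set(re.findall(r"[A-Za-z_]\w*", text))
    return used - defined

# Hypothetical fragment in the same style as the grammar above.
sample = '''
texp   ::= SYMBOL
       | LBRACE LBRACE eexp RBRACE RBRACE
LBRACE ::= "{"
RBRACE ::= "}"
'''
print(sorted(undefined_symbols(sample)))  # ['SYMBOL', 'eexp']
```

Names reported this way are not necessarily errors: they may be lexer tokens with no fixed spelling, or rules defined elsewhere in the full grammar.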

Script that transformed src/urweb.grm:

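// Read the grammar source.
auto txt = readfile("urweb/src/urweb.grm");

// Skip past the two "%%" separators so that only the rules section remains.
txt = txt.match("%%%%(.+)");
txt = txt.match("%%%%(.+)");
// Drop the parenthesised semantic actions and the whitespace before them.
txt = txt.gsub("%s*%b()", "");
// Turn the rule separator ":" into the EBNF "::=".
txt = txt.gsub(":", "::=");
// Replace a trailing ' on a name with 2, e.g. x' becomes x2.
txt = txt.gsub("(%w)'", "%12");
print(txt);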
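
For readers without a Lua-pattern engine at hand, the same steps as the script above can be approximated in Python. This is a sketch under assumptions, not the script that produced the grammar: the helper names `strip_balanced_parens` and `grm_rules_to_ebnf` and the sample input are invented here, and because Python's `re` module has no counterpart to `%b()`, the balanced parentheses are removed with a small hand-written loop.

```python
import re

def strip_balanced_parens(text: str) -> str:
    """Remove each balanced (...) group and the whitespace just before it,
    approximating the Lua-style pattern "%s*%b()"."""
    out = []
    depth = 0
    for ch in text:
        if ch == "(":
            if depth == 0:
                # Drop whitespace that directly precedes the group.
                while out and out[-1].isspace():
                    out.pop()
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)

def grm_rules_to_ebnf(grm: str) -> str:
    # Keep only the text after the second "%%" separator (the rules section).
    rules = grm.split("%%", 2)[2]
    rules = strip_balanced_parens(rules)
    rules = rules.replace(":", "::=")
    # Rename primed names: x' becomes x2.
    return re.sub(r"([A-Za-z0-9])'", r"\g<1>2", rules)

# Made-up input with two "%%" separators and parenthesised actions.
sample = "header\n%%\ndecls\n%%\nx' : y (f (g))\n   | z x' ()\n"
print(grm_rules_to_ebnf(sample))
```

Unlike `%b()`, the loop silently drops everything after an unmatched "(", so it is only suitable for input whose parentheses are balanced.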

mingodad commented 1 year ago

While trying to add this project's grammar to https://mingodad.github.io/parsertl-playground/playground/ I found that it contains several conflicts:

bison-nb -v urweb.y
urweb.y: warning: 76 shift/reduce conflicts [-Wconflicts-sr]
urweb.y: warning: 23 reduce/reduce conflicts [-Wconflicts-rr]
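
One possible contributor to conflicts like these, which is an assumption and not something confirmed in this thread, is that binary-operator rules of the form `eexp ::= eexp PLUS eexp` are ambiguous by themselves and rely on precedence and associativity declarations to be resolved. The Python sketch below only illustrates that ambiguity by counting the parse trees such a rule allows; `parse_count` is a name invented for this example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def parse_count(operands: int) -> int:
    """Number of distinct parse trees for `operands` atoms joined by one
    binary operator under the rule  E ::= E OP E | atom."""
    if operands == 1:
        return 1
    # Choose which operator is the root: k atoms go left, the rest go right.
    return sum(parse_count(k) * parse_count(operands - k)
               for k in range(1, operands))

print([parse_count(n) for n in range(1, 6)])  # [1, 1, 2, 5, 14]
```

Anything above 1 means the parser has more than one way to group the same input, which an LALR generator reports as a shift/reduce conflict unless a declaration tells it which grouping to prefer.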