aardappel / lobster

The Lobster Programming Language
http://strlen.com/lobster

Grammar railroad diagram #221

Closed by mingodad 1 year ago

mingodad commented 1 year ago

Looking at https://aardappel.github.io/lobster/language_reference.html: if the grammar were described in an EBNF dialect understood by https://www.bottlecaps.de/rr/ui, we could have a nice navigable railroad diagram (and maybe even an alternative parser generated with https://www.bottlecaps.de/rex/).

Copy and paste the EBNF shown below (a manual, incomplete translation) into https://www.bottlecaps.de/rr/ui on the Edit Grammar tab, then click the View Diagram tab to see or download a navigable railroad diagram.

program ::= stats end_of_file

stats ::= topexp+ linefeed

topexp ::= "namespace" ident | "import" "from"? ( string_constant | ( ident+ '.' ) ) | "private"? ( functiondef | class | vardef | enumdef ) | expstat | attrdef

class ::= ( "class" | "struct" ) ident ( '=' ident specializers | generics? ':' ( ident specializers? )? indent class_member (',' class_member)* indlist_end )
class_member ::=  ident ( ':' type )? ( '=' exp )? | functiondef

specializers ::= '<' type  (',' type)* '>'

generics ::= '<' ident (',' ident)* '>'

vardef ::= ( "var" | "let" ) ident (',' ident)* '=' opexp

enumdef ::= ( "enum" | "enum_flags" ) ident ':' indent enum_elm (',' enum_elm)* indlist_end
enum_elm ::= ident ( '=' integer_constant )?

functiondef ::= "def" ident generics functionargsbody

functionargsbody ::= '(' args ')' ':' body

block ::= args? ':' body | functionargsbody

args ::= ( one_arg (',' one_arg)* )?

one_arg ::= ident ( ( ':' | '::' ) type )?

body ::= ( expstat | indent stats dedent )

type ::= "int" | "float" | "string" | '[' type ']'| "resource" '<' ident '>' | "void" | ident

call ::= specializers ( ( exp (',' exp)* )? ) ( block ( "fn" block "…" )? )?

expstat ::= ( exp  (';' exp)* ) | "return" ( ( opexp (',' opexp)* )? ) ( "from" ( program | ident ) )?

exp ::= opexp ( ( '=' | "+=" | "-=" | "*=" | "/=" | "%=" ) exp )?

opexp ::= unary ( ('*' | '/' | '%') | ('+' | '-') | ('<' | '>' | ">=" | "<=") | ("==" | "!=") | ('&' | '|' | "and" | "or" | '^' | "<<" | ">>")) unary

unary ::= ( '-' | "++" | "--" | '~' | "not" ) unary | deref

deref ::= factor ( exp? | '.' ident call? | "->" ident | "++" | "--" | "is" type )?

factor ::= constant | '(' exp ')' | constructor | "fn" functionargsbody | ident call?

constructor ::= ( '[' exp (',' exp)* ']' )? ( "::" type )? | ident '{' ( exp (',' exp)* )? '}'

constant ::= numeric_constant | string_constant | character_constant | "nil" ( "::" type )?

attrdef ::= attribute ident ( '=' ( string_constant | numeric_constant | ident ) )?

//indlist(e) ::= indent list(e) linefeed? dedent linefeed
indlist_end ::= linefeed? dedent linefeed

//list(e) ::= e (',' e)*

aardappel commented 1 year ago

Are you suggesting we include a railroad diagram as part of the documentation? That would certainly be neat. I am currently using pandoc to generate the docs, so the grammar would be a piece of included HTML that doesn't get updated automatically, which is less than ideal.

An automatically derived parser is likely more challenging given all the special parsing around indents.
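To make the indentation concern concrete, here is a minimal Python sketch of the usual workaround: a pre-pass that turns leading whitespace into explicit INDENT/DEDENT markers, which a generated parser could then treat as ordinary tokens. This is an assumption about how one might approach it, not a description of Lobster's actual lexer; the function name `indent_tokens` and the token spellings are hypothetical, and it only handles space-indented input.

```python
from typing import List


def indent_tokens(source: str) -> List[str]:
    """Turn leading whitespace into INDENT/DEDENT markers (hypothetical helper)."""
    tokens: List[str] = []
    stack = [0]  # indentation widths of the currently open blocks
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines do not affect block structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            # deeper than the current block: open a new one
            stack.append(width)
            tokens.append("INDENT")
        while width < stack[-1]:
            # shallower: close blocks until we are back at a known level
            stack.pop()
            tokens.append("DEDENT")
        if width != stack[-1]:
            raise ValueError(f"inconsistent dedent: {line!r}")
        tokens.append("LINE:" + line.strip())
    # close any blocks still open at end of input
    while len(stack) > 1:
        stack.pop()
        tokens.append("DEDENT")
    return tokens


if __name__ == "__main__":
    for token in indent_tokens("def f():\n    a\n    b\nc\n"):
        print(token)
```

Anything beyond this simple stack (for example, indentation that is ignored inside brackets or continuation lines) would need extra state in the pre-pass, which is where a purely grammar-derived parser tends to get awkward.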

First, the above EBNF is not a correct translation of my use of the ... notation. Something of the form r = a ... b would have to be written in EBNF as r ::= a (b r)?, which forces the use of an extra rule r, something I have always found annoying. Also, the ? is easy to miss and always comes at the end; I like [] much better.

[ a b ... ] would become (a b)*, whereas you made it into (a b "...")?
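As a rough check of the separated-list translation, the Python sketch below compares two hand-written recognizers over a toy token stream: one for the recursive form r ::= a (b r)? and one for the iterative form r ::= a (b a)*, which W3C-style EBNF also allows and which needs no extra rule. The function names and the space-separated "a"/"b" tokens are illustrative assumptions, not part of Lobster or of the railroad tool.

```python
from typing import List, Optional


def match_recursive(tokens: List[str], pos: int = 0) -> Optional[int]:
    """r ::= a (b r)?  Returns the position after the match, or None."""
    if pos >= len(tokens) or tokens[pos] != "a":
        return None
    pos += 1
    if pos < len(tokens) and tokens[pos] == "b":
        rest = match_recursive(tokens, pos + 1)
        if rest is not None:
            return rest
    # the optional group did not match, so the rule ends after the "a"
    return pos


def match_iterative(tokens: List[str], pos: int = 0) -> Optional[int]:
    """r ::= a (b a)*  Same language, written without recursion."""
    if pos >= len(tokens) or tokens[pos] != "a":
        return None
    pos += 1
    while pos + 1 < len(tokens) and tokens[pos] == "b" and tokens[pos + 1] == "a":
        pos += 2
    return pos


def accepts(matcher, text: str) -> bool:
    """True if the matcher consumes every token of the input."""
    tokens = text.split()
    return matcher(tokens) == len(tokens)


if __name__ == "__main__":
    for sample in ["a", "a b a", "a b a b a", "a b", "", "b a", "a a"]:
        print(repr(sample), accepts(match_recursive, sample), accepts(match_iterative, sample))
```

Both recognizers accept exactly one "a" followed by zero or more "b a" pairs, so either spelling should produce an equivalent diagram; the iterative form just keeps everything in one rule.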

The use of || would have to be factored out, it is incorrect in the EBNF.

You don't use ' or " consistently.

That said, there may be some value using a standard notation :)

aardappel commented 1 year ago

Seems like so far we are still maintaining the grammar in markdown.