pikelang / Pike

Pike is a dynamic programming language with a syntax similar to Java and C. It is simple to learn, does not require long compilation passes and has powerful built-in data types allowing simple and really fast data manipulation.
http://pike.lysator.liu.se/

Grammar railroad diagram #39

Closed mingodad closed 7 months ago

mingodad commented 2 years ago

Using the online tool https://www.bottlecaps.de/convert/ to convert the Pike grammar to a format understood by https://www.bottlecaps.de/rr/ui, and manually adding the lex tokens, we can get a navigable railroad diagram (https://en.wikipedia.org/wiki/Syntax_diagram).

Copy the EBNF shown below into the Edit Grammar tab at https://www.bottlecaps.de/rr/ui, then switch to the View Diagram tab.

/* converted on Tue Aug 23, 2022, 11:11 (UTC+02) by bison-to-w3c v0.59 which is Copyright (c) 2011-2022 by Gunther Rademacher <grd@gmx.net> */

all      ::= program TOK_LEX_EOF?
program  ::= ( def | ';' )*
real_string_or_identifier
         ::= TOK_IDENTIFIER
           | TOK_STRING ( '+'? TOK_STRING )*
optional_rename_inherit
         ::= ( ':' ( real_string_or_identifier | bad_identifier | error ) )?
low_program_ref
         ::= safe_expr0
inherit_ref
         ::= low_program_ref
inheritance
         ::= modifiers TOK_INHERIT inherit_ref ( optional_rename_inherit ';' | error ( ';' | TOK_LEX_EOF | '}' ) )
import   ::= TOK_IMPORT constant_expr ';'
constant_name
         ::= ( TOK_IDENTIFIER | bad_identifier | error ) '=' safe_expr0
constant ::= modifiers TOK_CONSTANT ( constant_name ( ',' constant_name )* ';' | error ( ';' | TOK_LEX_EOF | '}' ) )
block_or_semi
         ::= block
           | ';'
           | TOK_LEX_EOF
           | error
safe_apply_with_line_info
         ::= TOK_SAFE_APPLY
open_paren_or_safe_apply_with_line_info
         ::= '('
           | safe_apply_with_line_info
close_paren_or_missing
         ::= ')'?
close_brace_or_missing
         ::= '}'?
close_brace_or_eof
         ::= '}'
           | TOK_LEX_EOF
close_bracket_or_missing
         ::= ']'?
start_function
         ::=
optional_constant
         ::= TOK_CONSTANT?
def      ::= modifiers ( optional_attributes simple_type optional_constant ( TOK_IDENTIFIER start_function ( '(' arguments close_paren_or_missing block_or_semi | error ) | bad_identifier '(' arguments ')' block_or_semi | new_name ( ',' new_name )* ';' ) | named_class | enum | '{' program close_brace_or_eof )
           | inheritance
           | import
           | constant
           | ( annotation | '@' TOK_CONSTANT ) ';'
           | typedef
           | static_assertion expected_semicolon
           | error ( TOK_LEX_EOF | ';' | '}' )
static_assertion
         ::= TOK_STATIC_ASSERT '(' expr0 ',' expr0 ')'
optional_dot_dot_dot
         ::= ( TOK_DOT_DOT_DOT | TOK_DOT_DOT )?
optional_identifier
         ::= ( TOK_IDENTIFIER | bad_identifier )?
new_arg_name
         ::= full_type optional_dot_dot_dot optional_identifier
func_args
         ::= '(' arguments close_paren_or_missing
arguments
         ::= ( new_arg_name ( ( ',' | ':' ) new_arg_name )* )? optional_comma
modifier ::= TOK_FINAL_ID
           | TOK_STATIC
           | TOK_EXTERN
           | TOK_OPTIONAL
           | TOK_PRIVATE
           | TOK_LOCAL_ID
           | TOK_PUBLIC
           | TOK_PROTECTED
           | TOK_INLINE
           | TOK_VARIANT
           | TOK_WEAK
           | TOK_CONTINUE
           | TOK_UNUSED
magic_identifiers1
         ::= TOK_FINAL_ID
           | TOK_STATIC
           | TOK_EXTERN
           | TOK_PRIVATE
           | TOK_LOCAL_ID
           | TOK_PUBLIC
           | TOK_PROTECTED
           | TOK_INLINE
           | TOK_OPTIONAL
           | TOK_VARIANT
           | TOK_WEAK
           | TOK_UNUSED
           | TOK_STATIC_ASSERT
magic_identifiers2
         ::= TOK_VOID_ID
           | TOK_MIXED_ID
           | TOK_ARRAY_ID
           | TOK_ATTRIBUTE_ID
           | TOK_DEPRECATED_ID
           | TOK_MAPPING_ID
           | TOK_MULTISET_ID
           | TOK_OBJECT_ID
           | TOK_FUNCTION_ID
           | TOK_FUNCTION_NAME
           | TOK_PROGRAM_ID
           | TOK_STRING_ID
           | TOK_FLOAT_ID
           | TOK_INT_ID
           | TOK_ENUM
           | TOK_TYPEDEF
           | TOK_UNKNOWN
magic_identifiers3
         ::= TOK_IF
           | TOK_DO
           | TOK_FOR
           | TOK_WHILE
           | TOK_ELSE
           | TOK_FOREACH
           | TOK_CATCH
           | TOK_GAUGE
           | TOK_CLASS
           | TOK_BREAK
           | TOK_CASE
           | TOK_CONSTANT
           | TOK_CONTINUE
           | TOK_DEFAULT
           | TOK_IMPORT
           | TOK_INHERIT
           | TOK_LAMBDA
           | TOK_PREDEF
           | TOK_RETURN
           | TOK_SSCANF
           | TOK_SWITCH
           | TOK_TYPEOF
           | TOK_GLOBAL
magic_identifiers
         ::= magic_identifiers1
           | magic_identifiers2
           | magic_identifiers3
magic_identifier
         ::= TOK_IDENTIFIER
           | TOK_RESERVED
           | magic_identifiers
annotation
         ::= '@' constant_expr
modifiers
         ::= ( annotation ':' )* modifier*
attribute
         ::= TOK_ATTRIBUTE_ID '(' string_constant optional_comma ')'
           | TOK_DEPRECATED_ID ( '(' ')' )?
optional_attributes
         ::= attribute*
cast     ::= '(' type ')'
soft_cast
         ::= '[' type ']'
type2    ::= type
           | identifier_type
simple_type
         ::= full_type
simple_type2
         ::= type2
full_type
         ::= type3 ( '|' type3 )*
type     ::= basic_type ( '|' type3 )*
type3    ::= basic_type
           | identifier_type
basic_type
         ::= TOK_FLOAT_ID
           | TOK_VOID_ID
           | TOK_MIXED_ID
           | TOK_UNKNOWN
           | TOK_AUTO_ID
           | TOK_STRING_ID opt_string_width
           | TOK_INT_ID opt_int_range
           | TOK_MAPPING_ID opt_mapping_type
           | TOK_FUNCTION_ID opt_function_type
           | ( TOK_OBJECT_ID | TOK_PROGRAM_ID ) opt_program_type
           | TOK_ARRAY_ID opt_array_type
           | TOK_MULTISET_ID opt_multiset_type
           | TOK_ATTRIBUTE_ID ( '(' string_constant ( ',' full_type | error ) ')' | error )
           | TOK_DEPRECATED_ID '(' ( full_type | error ) ')'
identifier_type
         ::= idents
           | typeof
number   ::= '-'? TOK_NUMBER
number_or_maxint
         ::= number?
number_or_minint
         ::= number?
expected_dot_dot
         ::= TOK_DOT_DOT
           | TOK_DOT_DOT_DOT
safe_int_range_type_low
         ::= TOK_BITS
           | number_or_minint expected_dot_dot number_or_maxint
           | number
           | error
safe_int_range_type
         ::= safe_int_range_type_low ( '|' safe_int_range_type_low )*
opt_int_range
         ::= ( '(' safe_int_range_type ')' )?
opt_string_width
         ::= opt_int_range
           | '(' ( safe_int_range_type ':' safe_int_range_type? | ':' safe_int_range_type ) ')'
opt_program_type
         ::= ( '(' ( full_type | string_constant | error ) ')' )?
opt_function_type
         ::= ( '(' function_type_list optional_dot_dot_dot ':' full_type ')' )?
function_type_list
         ::= ( full_type ( ',' full_type )* )? optional_comma
opt_multiset_type
         ::= ( '(' full_type ')' )?
opt_array_type
         ::= ( '(' ( ':'? full_type | safe_int_range_type ':' full_type? ) ')' )?
opt_mapping_type
         ::= ( '(' full_type ':' full_type ')' )?
new_name ::= TOK_IDENTIFIER ( '=' ( expr0 | error | TOK_LEX_EOF ) )?
           | bad_identifier ( '=' expr0 )?
new_local_name
         ::= TOK_IDENTIFIER ( '=' ( expr0 | error | TOK_LEX_EOF ) )?
           | bad_identifier ( '=' expr0 )?
line_number_info
         ::=
block    ::= '{' line_number_info statements end_block
end_block
         ::= '}'
           | TOK_LEX_EOF
failsafe_block
         ::= block
           | error
           | TOK_LEX_EOF
constant_expr
         ::= safe_expr0
local_constant_name
         ::= ( TOK_IDENTIFIER | bad_identifier | error ) '=' safe_expr0
local_constant
         ::= TOK_CONSTANT ( local_constant_name ( ',' local_constant_name )* ';' | error ( ';' | TOK_LEX_EOF | '}' ) )
statements
         ::= statement*
statement_with_semicolon
         ::= unused2 expected_semicolon
normal_label_statement
         ::= statement_with_semicolon
           | import
           | cond
           | return
           | local_constant
           | block
           | ( break | continue ) expected_semicolon
           | error ( ';' | TOK_LEX_EOF | '}' )
           | ';'
statement
         ::= normal_label_statement
           | while
           | do
           | for
           | foreach
           | switch
           | case
           | default
           | labeled_statement
           | simple_type2 local_function
           | TOK_CONTINUE simple_type2 local_generator
           | implicit_modifiers named_class
labeled_statement
         ::= TOK_IDENTIFIER ':' statement
optional_label
         ::= TOK_IDENTIFIER?
break    ::= TOK_BREAK optional_label
default  ::= TOK_DEFAULT ':'?
continue ::= TOK_CONTINUE optional_label
start_lambda
         ::=
implicit_identifier
         ::=
lambda   ::= TOK_LAMBDA line_number_info implicit_identifier start_lambda ( func_args failsafe_block | error )
local_function
         ::= TOK_IDENTIFIER start_function ( func_args failsafe_block | error )
local_generator
         ::= TOK_IDENTIFIER start_function ( func_args failsafe_block | error )
create_arg
         ::= modifiers simple_type ( optional_dot_dot_dot TOK_IDENTIFIER | bad_identifier )
create_arguments
         ::= ( create_arg ( ( ',' | ':' ) create_arg )* )? optional_comma
optional_create_arguments
         ::= ( '(' create_arguments close_paren_or_missing )?
failsafe_program
         ::= '{' program end_block
           | error
           | TOK_LEX_EOF
anon_class
         ::= TOK_CLASS line_number_info optional_create_arguments failsafe_program
named_class
         ::= TOK_CLASS line_number_info simple_identifier optional_create_arguments failsafe_program
simple_identifier
         ::= TOK_IDENTIFIER
           | bad_identifier
enum_value
         ::= ( '=' safe_expr0 )?
enum_def ::= ( simple_identifier enum_value )?
propagated_enum_value
         ::=
enum     ::= TOK_ENUM optional_identifier '{' ( enum_def | error ) ( ',' propagated_enum_value enum_def )* end_block
typedef  ::= modifiers TOK_TYPEDEF full_type simple_identifier ';'
save_locals
         ::=
save_block_level
         ::=
cond     ::= TOK_IF save_block_level save_locals line_number_info '(' safe_comma_expr end_cond statement optional_else_part
end_cond ::= ')'
           | '}'
           | TOK_LEX_EOF
optional_else_part
         ::= ( TOK_ELSE statement )?
safe_lvalue
         ::= lvalue
           | error
safe_expr0
         ::= expr0
           | TOK_LEX_EOF
           | error
foreach_optional_lvalue
         ::= safe_lvalue?
foreach_lvalues
         ::= ',' safe_lvalue
           | ';' foreach_optional_lvalue ';' foreach_optional_lvalue
foreach  ::= TOK_FOREACH save_block_level save_locals line_number_info '(' expr0 foreach_lvalues end_cond statement
do       ::= TOK_DO line_number_info statement ( TOK_WHILE ( '(' safe_comma_expr end_cond expected_semicolon | TOK_LEX_EOF ) | TOK_LEX_EOF )
expected_semicolon
         ::= ';'
           | TOK_LEX_EOF
for      ::= TOK_FOR save_block_level save_locals line_number_info '(' unused expected_semicolon for_expr expected_semicolon unused end_cond statement
while    ::= TOK_WHILE save_block_level save_locals line_number_info '(' safe_comma_expr end_cond statement
for_expr ::= safe_comma_expr?
switch   ::= TOK_SWITCH save_block_level save_locals line_number_info '(' safe_comma_expr end_cond statement
case     ::= TOK_CASE ( safe_comma_expr ( expected_dot_dot optional_comma_expr )? | expected_dot_dot safe_comma_expr ) expected_colon
expected_colon
         ::= ':'
           | ';'
           | '}'
           | TOK_LEX_EOF
optional_continue
         ::= ( TOK_CONTINUE | TOK_BREAK )?
return   ::= optional_continue TOK_RETURN safe_comma_expr? expected_semicolon
unused   ::= safe_comma_expr?
unused2  ::= comma_expr
optional_comma_expr
         ::= safe_comma_expr?
safe_comma_expr
         ::= comma_expr
           | error
comma_expr
         ::= comma_expr2
           | simple_type2 new_local_name ( ',' new_local_name )*
comma_expr2
         ::= expr0 ( ',' expr0 )*
splice_expr
         ::= '@'? expr0
expr0    ::= ( ( expr4 | '[' low_lvalue_list ']' ) assign )* ( expr01 | ( expr4 assign | '[' low_lvalue_list ']' ) error )
expr01   ::= expr1 ( '?' expr01 ':' expr01 )?
assign   ::= '='
           | TOK_AND_EQ
           | TOK_OR_EQ
           | TOK_XOR_EQ
           | TOK_LSH_EQ
           | TOK_RSH_EQ
           | TOK_ADD_EQ
           | TOK_SUB_EQ
           | TOK_MULT_EQ
           | TOK_POW_EQ
           | TOK_MOD_EQ
           | TOK_DIV_EQ
           | TOK_ATOMIC_GET_SET
optional_comma
         ::= ','?
expr_list
         ::= ( splice_expr ( ',' splice_expr )* optional_comma )?
m_expr_list
         ::= ( assoc_pair ( ',' ( assoc_pair | error ) )* optional_comma )?
assoc_pair
         ::= expr0 expected_colon ( expr0 | error )
expr1    ::= ( cast | soft_cast | TOK_NOT | '~' | '-' )* ( expr3 | ( TOK_INC | TOK_DEC ) expr4 )
           | expr1 ( ( TOK_LOR | TOK_LAND | '|' | '^' | '&' | TOK_EQ | TOK_NE | '>' | TOK_GE | '<' | TOK_LE | TOK_LSH | TOK_RSH | '+' | '-' | '*' | '%' | '/' ) ( expr1 | error ) | TOK_POW expr1 )
expr3    ::= expr4 ( TOK_INC | TOK_DEC )?
optional_block
         ::= ( '{' line_number_info start_lambda statements end_block )?
apply    ::= expr4 ( ( '(' | safe_apply_with_line_info ) expr_list ')' optional_block | open_paren_or_safe_apply_with_line_info error ( ')' optional_block | TOK_LEX_EOF | ';' | '}' ) )
implicit_modifiers
         ::=
expr4    ::= idents
           | expr5 ( '.' line_number_info TOK_IDENTIFIER )?
           | bad_expr_ident
expr5    ::= literal_expr
           | catch
           | gauge
           | typeof
           | sscanf
           | static_assertion
           | lambda
           | implicit_modifiers ( anon_class | enum )
           | apply
           | expr4 ( '[' ( ( '*' | expr0 | range_bound expected_dot_dot range_bound ) ']' | error ( ']' | TOK_LEX_EOF | ';' | '}' | ')' ) ) | TOK_SAFE_START_INDEX line_number_info ( expr0 | range_bound expected_dot_dot range_bound ) ']' | TOK_ARROW line_number_info ( magic_identifier | error ) | TOK_SAFE_INDEX line_number_info TOK_IDENTIFIER )
           | '(' ( comma_expr2 ')' | error ( ')' | TOK_LEX_EOF | ';' | '}' ) )
literal_expr
         ::= string
           | TOK_NUMBER
           | TOK_FLOAT
           | '(' ( '{' expr_list close_brace_or_missing | '[' m_expr_list close_bracket_or_missing ) ')'
           | TOK_MULTISET_START line_number_info ( expr_list ( TOK_MULTISET_END | ')' ) | error ( TOK_MULTISET_END | ')' | TOK_LEX_EOF | ';' | '}' ) )
idents   ::= ( low_idents | qualified_ident ) ( '.' ( TOK_IDENTIFIER | bad_identifier ) )*
string_or_identifier
         ::= TOK_IDENTIFIER
           | string
inherit_specifier
         ::= ( string_or_identifier | TOK_LOCAL_ID | TOK_GLOBAL ) TOK_COLON_COLON ( ( TOK_LOCAL_ID | TOK_IDENTIFIER | bad_inherit ) TOK_COLON_COLON )*
low_idents
         ::= ( TOK_GLOBAL? '.' )? TOK_IDENTIFIER
           | TOK_RESERVED
qualified_ident
         ::= ( TOK_PREDEF | TOK_VERSION )? TOK_COLON_COLON ( TOK_IDENTIFIER | bad_identifier )
           | inherit_specifier ( TOK_IDENTIFIER | bad_identifier | error )
range_bound
         ::= ( '<'? ( comma_expr | TOK_LEX_EOF ) )?
gauge    ::= TOK_GAUGE catch_arg
typeof   ::= TOK_TYPEOF '(' ( expr0 ')' | error ( ')' | '}' | TOK_LEX_EOF | ';' ) )
catch_arg
         ::= '(' ( comma_expr ')' | error ( ')' | TOK_LEX_EOF | '}' | ';' ) )
           | block
           | error
catch    ::= TOK_CATCH catch_arg
sscanf   ::= TOK_SSCANF '(' ( expr0 ( ',' expr0 ( lvalue_list ')' | error ( ')' | TOK_LEX_EOF | '}' | ';' ) ) | error ( ')' | TOK_LEX_EOF | '}' | ';' ) ) | error ( ')' | TOK_LEX_EOF | '}' | ';' ) )
lvalue   ::= expr4
           | '[' low_lvalue_list ']'
           | type2 TOK_IDENTIFIER
low_lvalue_list
         ::= lvalue lvalue_list
lvalue_list
         ::= ( ',' lvalue )*
string_segment
         ::= TOK_STRING
           | TOK_FUNCTION_NAME
string   ::= string_segment+
string_constant
         ::= string ( '+' string )*
bad_identifier
         ::= bad_inherit
           | TOK_LOCAL_ID
bad_inherit
         ::= bad_expr_ident
           | TOK_ARRAY_ID
           | TOK_ATTRIBUTE_ID
           | TOK_BREAK
           | TOK_CASE
           | TOK_CATCH
           | TOK_CLASS
           | TOK_CONTINUE
           | TOK_DEFAULT
           | TOK_DEPRECATED_ID
           | TOK_DO
           | TOK_ENUM
           | TOK_FLOAT_ID
           | TOK_FOR
           | TOK_FOREACH
           | TOK_FUNCTION_ID
           | TOK_FUNCTION_NAME
           | TOK_GAUGE
           | TOK_IF
           | TOK_IMPORT
           | TOK_INT_ID
           | TOK_LAMBDA
           | TOK_MAPPING_ID
           | TOK_MIXED_ID
           | TOK_MULTISET_ID
           | TOK_OBJECT_ID
           | TOK_PROGRAM_ID
           | TOK_RETURN
           | TOK_SSCANF
           | TOK_STRING_ID
           | TOK_SWITCH
           | TOK_TYPEDEF
           | TOK_TYPEOF
           | TOK_UNKNOWN
           | TOK_VOID_ID
           | TOK_RESERVED
bad_expr_ident
         ::= TOK_INLINE
           | TOK_PREDEF
           | TOK_PRIVATE
           | TOK_PROTECTED
           | TOK_PUBLIC
           | TOK_OPTIONAL
           | TOK_VARIANT
           | TOK_WEAK
           | TOK_STATIC
           | TOK_EXTERN
           | TOK_FINAL_ID
           | TOK_ELSE
           | TOK_INHERIT

// Tokens
//if(ISWORD(\("[^"]+"\)))\s+return \(\S[^;]+\); --> \2 ::= \1

TOK_ARRAY_ID ::= "array"
TOK_AUTO_ID ::= "auto"
TOK_BREAK ::= "break"
TOK_CASE ::= "case"
TOK_CATCH ::= "catch"
TOK_CLASS ::= "class"
TOK_CONSTANT ::= "constant"
TOK_CONTINUE ::= "continue"
TOK_DEFAULT ::= "default"
TOK_DO ::= "do"
TOK_ELSE ::= "else"
TOK_ENUM ::= "enum"
TOK_EXTERN ::= "extern"
TOK_FINAL_ID ::= "final"
TOK_FLOAT_ID ::= "float"
TOK_FOR ::= "for"
TOK_FOREACH ::= "foreach"
TOK_FUNCTION_ID ::= "function"
TOK_GAUGE ::= "gauge"
TOK_GLOBAL ::= "global"
TOK_IF ::= "if"
TOK_IMPORT ::= "import"
TOK_INT_ID ::= "int"
TOK_INHERIT ::= "inherit"
TOK_INLINE ::= "inline"
TOK_LAMBDA ::= "lambda"
TOK_LOCAL_ID ::= "local"
TOK_MAPPING_ID ::= "mapping"
TOK_MIXED_ID ::= "mixed"
TOK_MULTISET_ID ::= "multiset"
TOK_OBJECT_ID ::= "object"
TOK_OPTIONAL ::= "optional"
TOK_PROGRAM_ID ::= "program"
TOK_PREDEF ::= "predef"
TOK_PRIVATE ::= "private"
TOK_PROTECTED ::= "protected"
TOK_PUBLIC ::= "public"
TOK_RETURN ::= "return"
TOK_SSCANF ::= "sscanf"
TOK_STRING_ID ::= "string"
TOK_STATIC ::= "static"
TOK_SWITCH ::= "switch"
TOK_TYPEDEF ::= "typedef"
TOK_TYPEOF ::= "typeof"
TOK_VARIANT ::= "variant"
TOK_VOID_ID ::= "void"
TOK_WHILE ::= "while"
TOK_STATIC_ASSERT ::= "_Static_assert"
TOK_ATTRIBUTE_ID ::= "__attribute__"
TOK_DEPRECATED_ID ::= "__deprecated__"
TOK_FUNCTION_NAME ::= "__func__"
TOK_WEAK ::= "__weak__"
TOK_UNUSED ::= "__unused__"
TOK_UNKNOWN ::= "__unknown__"
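
The keyword token rules above look like the result of applying the search-and-replace pattern in the comment under `// Tokens` to the lexer source. Here is a minimal Python sketch of that conversion. It assumes lexer lines of the form `if(ISWORD("array")) return TOK_ARRAY_ID;`. The sample input and the names `ISWORD_RULE` and `lexer_lines_to_ebnf` are hypothetical and not taken from the Pike sources.

```python
import re

# Python translation of the search pattern in the "// Tokens" comment.
# Group 1 is the quoted keyword, group 2 is the token name.
ISWORD_RULE = re.compile(r'if\(ISWORD\(("[^"]+")\)\)\s+return (\S[^;]+);')


def lexer_lines_to_ebnf(source):
    """Return one 'TOKEN ::= "keyword"' line per ISWORD match in source."""
    return [f"{m.group(2)} ::= {m.group(1)}" for m in ISWORD_RULE.finditer(source)]


# Hypothetical lexer excerpt written to match the pattern (an assumption,
# not copied from the Pike repository).
SAMPLE = '''
if(ISWORD("array")) return TOK_ARRAY_ID;
if(ISWORD("auto")) return TOK_AUTO_ID;
if(ISWORD("__weak__"))   return TOK_WEAK;
'''

for rule in lexer_lines_to_ebnf(SAMPLE):
    print(rule)
```

Tokens that are not plain keywords (for example TOK_IDENTIFIER, TOK_STRING or TOK_NUMBER) do not match this pattern, so they would still need to be added by hand.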

mingodad commented 11 months ago

I've just added src/language.yacc to https://mingodad.github.io/parsertl-playground/playground/, a Yacc/Lex compatible online editor/tester (select "Pike-lang parser" from Examples, then click Parse to see a parse tree for the content in Input source).

I hope it can help to develop, debug, test and document this project's grammar!

Any feedback is welcome!