zksecurity / noname

Noname: a programming language to write zkapps
https://zksecurity.github.io/noname/

Feat: Create grammar file in BNF format #127

Open vuvoth opened 1 month ago

vuvoth commented 1 month ago

Motivation

We currently use a handwritten parser in the noname project. We should write a spec of the grammar in BNF form. This would help us agree on the current grammar and make the syntax easier to follow, develop, and maintain.

More context

mimoo commented 1 month ago

I learned about https://pest.rs/ today

mimoo commented 1 month ago

I also asked Bard to produce a grammar file based on an example X) maybe that's a good way

WHITESPACE = _{ " " | "\t" | "\n" | "\r" }

// Constants and basic types
identifier = @{ ASCII_ALPHA ~ (ASCII_ALPHANUMERIC | "_")* }
number = @{ ASCII_DIGIT+ }

// Keywords
KW_CONST = _{ "const" }
KW_STRUCT = _{ "struct" }
KW_FN = _{ "fn" }
KW_FOR = _{ "for" }
KW_IN = _{ "in" }
KW_LET = _{ "let" }
KW_MUT = _{ "mut" }
KW_ASSERT = _{ "assert" }
KW_RETURN = _{ "return" }
KW_MAIN = _{ "main" }
KW_PUB = _{ "pub" }

// Punctuation and operators
COLON = _{ ":" }
SEMICOLON = _{ ";" }
EQUALS = _{ "=" }
DOT = _{ "." }
COMMA = _{ "," }
L_PAREN = _{ "(" }
R_PAREN = _{ ")" }
L_BRACE = _{ "{" }
R_BRACE = _{ "}" }
L_BRACKET = _{ "[" }
R_BRACKET = _{ "]" }
PLUS = _{ "+" }
MINUS = _{ "-" }
ASTERISK = _{ "*" }
DOUBLE_AMPERSAND = _{ "&&" }
DOUBLE_PIPE = _{ "||" }

// Literals and constants
empty = _{ "0" }
player1 = _{ "1" }
player2 = _{ "2" }
sudoku_size = _{ "81" }

// Structure definition
struct_def = {
    KW_STRUCT ~ identifier ~ L_BRACE ~
        inner_field ~ // Assuming only one field for now
    R_BRACE
}

inner_field = {
    identifier ~ COLON ~ L_BRACKET ~ identifier ~ SEMICOLON ~ number ~ R_BRACKET ~ COMMA
}

// Function definition
fn_def = {
    KW_FN ~ identifier ~ DOT ~ identifier ~ L_PAREN ~ self_arg ~ fn_args ~ R_PAREN ~ optional_return_type ~ block
}

self_arg = {
    identifier ~ COLON ~ identifier
}

fn_args = {
    (identifier ~ COLON ~ identifier ~ COMMA)* ~ identifier ~ COLON ~ identifier
}

optional_return_type = {
    (MINUS ~ ">" ~ identifier)?
}

// Function call
fn_call = {
    identifier ~ DOT ~ identifier ~ L_PAREN ~ fn_call_args ~ R_PAREN
}

fn_call_args = {
    (identifier ~ COMMA)* ~ identifier
}

// Block (function body)
block = { L_BRACE ~ statements ~ R_BRACE }

statements = { statement* }

statement = _{
    const_decl
  | struct_def
  | fn_def
  | for_loop
  | let_decl
  | assignment
  | assert_stmt
  | return_stmt
  | fn_call
  | main_fn
}

const_decl = { KW_CONST ~ identifier ~ EQUALS ~ number ~ SEMICOLON }

for_loop = {
    KW_FOR ~ identifier ~ KW_IN ~ number ~ DOT ~ DOT ~ number ~ block
}

let_decl = { KW_LET ~ (KW_MUT)? ~ identifier ~ EQUALS ~ expr ~ SEMICOLON }

assignment = { identifier ~ EQUALS ~ expr ~ SEMICOLON }

assert_stmt = { KW_ASSERT ~ L_PAREN ~ expr ~ R_PAREN ~ SEMICOLON }

return_stmt = { KW_RETURN ~ expr ~ SEMICOLON }

main_fn = { KW_MAIN ~ L_PAREN ~ KW_PUB ~ identifier ~ COLON ~ identifier ~ COMMA ~ identifier ~ COLON ~ identifier ~ R_PAREN ~ block }

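// pest rejects left-recursive rules, so binary operators are written as
// repetition over terms (flat, left to right, no operator precedence)
expr = _{ term ~ (bin_op ~ term)* }

bin_op = _{ PLUS | MINUS | ASTERISK | DOUBLE_AMPERSAND | DOUBLE_PIPE }

// fn_call comes first: in an ordered choice, identifier would match its prefix
term = _{
    fn_call
  | identifier
  | number
  | L_PAREN ~ expr ~ R_PAREN
}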
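
For comparison with a handwritten parser, binary-operator rules such as `expr` are usually written in PEG or recursive-descent style as a loop over operands rather than as `expr op expr`, because a rule that calls itself before consuming any input never makes progress. Below is a minimal Rust sketch of that shape, using only the standard library. The `Expr` enum, the `Parser` struct, and `parse_expr` are hypothetical names for illustration and are not taken from the noname codebase. Function calls are left out, and operators are applied left to right with no precedence.

```rust
// Hypothetical AST for this sketch; not noname's actual types.
#[derive(Debug, PartialEq)]
enum Expr {
    Ident(String),
    Number(u64),
    Binary(Box<Expr>, String, Box<Expr>),
}

struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { src: src.as_bytes(), pos: 0 }
    }

    fn skip_ws(&mut self) {
        while self.pos < self.src.len() && self.src[self.pos].is_ascii_whitespace() {
            self.pos += 1;
        }
    }

    // Consumes `tok` (after optional whitespace) if the input continues with it.
    fn eat(&mut self, tok: &str) -> bool {
        self.skip_ws();
        if self.src[self.pos..].starts_with(tok.as_bytes()) {
            self.pos += tok.len();
            true
        } else {
            false
        }
    }

    // expr = term (bin_op term)*   -- a loop instead of left recursion
    fn expr(&mut self) -> Option<Expr> {
        let mut lhs = self.term()?;
        loop {
            let mut found = None;
            for op in ["&&", "||", "+", "-", "*"] {
                if self.eat(op) {
                    found = Some(op);
                    break;
                }
            }
            match found {
                Some(op) => {
                    let rhs = self.term()?;
                    lhs = Expr::Binary(Box::new(lhs), op.to_string(), Box::new(rhs));
                }
                None => return Some(lhs),
            }
        }
    }

    // term = "(" expr ")" | number | identifier
    fn term(&mut self) -> Option<Expr> {
        if self.eat("(") {
            let inner = self.expr()?;
            return if self.eat(")") { Some(inner) } else { None };
        }
        let start = self.pos;
        while self.pos < self.src.len()
            && (self.src[self.pos].is_ascii_alphanumeric() || self.src[self.pos] == b'_')
        {
            self.pos += 1;
        }
        let word = std::str::from_utf8(&self.src[start..self.pos]).ok()?;
        match word.as_bytes().first() {
            Some(c) if c.is_ascii_digit() => word.parse::<u64>().ok().map(Expr::Number),
            Some(c) if c.is_ascii_alphabetic() => Some(Expr::Ident(word.to_string())),
            _ => None,
        }
    }
}

// Parses a whole string as one expression; returns None on trailing input.
fn parse_expr(src: &str) -> Option<Expr> {
    let mut p = Parser::new(src);
    let e = p.expr()?;
    p.skip_ws();
    if p.pos == p.src.len() { Some(e) } else { None }
}
```

Calling `parse_expr("a - b - c")` yields a tree nested to the left, i.e. `(a - b) - c`, which is the grouping a left-recursive BNF rule would describe. A real grammar would likely also want operator precedence, which this sketch does not attempt.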