ωBNF — super awesome parser engine

ωBNF is pronounced "omega BNF".

Grammar Syntax Guide

An ωBNF grammar file consists of an unordered list of rules (called productions) and comments.


A comment can be either C++-style (// This is a comment to the end of the line) or C-style (/* This is a comment which may span multiple lines */).
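As a rough illustration of the two comment styles, the following Python sketch finds both kinds of comment in a grammar source string. It is not the ωBNF engine's own lexer: the names COMMENT_RE and find_comments are hypothetical, and the pattern is a simplification that does not account for comment markers appearing inside string or regex terminals.

```python
import re

# Assumed pattern: "//" up to the end of the line, or "/* ... */" across lines.
# The non-greedy ".*?" stops at the first closing "*/".
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)


def find_comments(source: str) -> list[str]:
    """Return every comment found in source, in order of appearance."""
    return COMMENT_RE.findall(source)


sample = '''// line comment
a -> "x"; /* block
comment */ b -> "y";'''

comments = find_comments(sample)
```

Here comments holds the line comment followed by the two-line block comment; the rules between them are left untouched.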


A rule has the form NAME -> TERM+ ; and is defined in terms of other terms or terminals.



Terms can be grouped in various ways to build up rules.

Further Details

Delimited Repeater

The grammar defines the delimited repeater as:

op=/{<:|:>?} opt_leading=","? named opt_trailing=","?
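The macro discussion in this guide describes a:sep as equivalent to a (sep a)*. The Python sketch below builds an ordinary regular expression with that shape, purely to show which inputs such a repeater accepts. The helper name delimited is hypothetical, the terms are restricted to plain regexes, and the <: and :> operator variants and the optional leading or trailing comma are not modelled.

```python
import re


def delimited(term: str, sep: str) -> re.Pattern:
    """Build a regex for term (sep term)*, the stated expansion of term:sep."""
    return re.compile(f"(?:{term})(?:(?:{sep})(?:{term}))*")


# One or more integers separated by single commas.
numbers = delimited(r"\d+", r",")

accepts_list = numbers.fullmatch("1,2,3") is not None
accepts_single = numbers.fullmatch("7") is not None
rejects_trailing = numbers.fullmatch("1,2,") is None
rejects_empty = numbers.fullmatch("") is None
```

At least one term is always required, and a separator is only accepted when another term follows it.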

Parser Configuration Commands (pragmas)

Some special commands are defined in the grammar to control the way the parser executes.

.import relative_filename

Merges the grammar of the imported file into the current grammar (equivalent to #include in C).

.macro Name(args) { term }

Defines a macro, which can be used to minimise repetition in the grammar (see below).


Macros are useful when a common pattern recurs throughout the grammar and can't easily be converted to a rule.

Macros are conceptually the same as C-style #defines, except that rather than simply substituting text, a full expression can be used.

We will explain how to use macros by implementing the equivalent of the delimited repeater. First, a macro is defined:

.macro Delim(term, sep) { term (sep term)* }

It is then invoked with the %! prefix:

%!Delim(a, "<"? ":" ">"? )

This would expand to a (("<"? ":" ">"?) a)*, which is equivalent to a:("<"? ":" ">"?).
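The expansion above can be imitated with a small Python sketch. This is an approximation for illustration, not the engine's implementation: expand_macro is a hypothetical helper that works on text, and it stands in for "a full expression" by wrapping any multi-token argument in parentheses before substituting it. It would also rewrite parameter names that happen to appear inside string literals in the macro body.

```python
import re


def expand_macro(params: list[str], body: str, args: list[str]) -> str:
    """Substitute each parameter name in body with its (grouped) argument."""
    if len(params) != len(args):
        raise ValueError("argument count does not match parameter count")

    def group(arg: str) -> str:
        arg = arg.strip()
        # Parenthesise multi-token arguments so they stay one expression.
        return f"({arg})" if re.search(r"\s", arg) else arg

    bindings = {p: group(a) for p, a in zip(params, args)}
    # Replace whole identifiers only; unknown identifiers are left as-is.
    return re.sub(
        r"[A-Za-z_]\w*",
        lambda m: bindings.get(m.group(0), m.group(0)),
        body,
    )


expanded = expand_macro(
    ["term", "sep"],
    "term (sep term)*",
    ["a", '"<"? ":" ">"? '],
)
```

With the Delim definition and call from the text, expanded is a (("<"? ":" ">"?) a)*. Grouping the argument is what distinguishes this from a plain #define-style text replacement, which would splice the separator tokens in without parentheses.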

Magic rules

Rules prefixed by a . are special rules governing the parser's overall behaviour. The following rules are recognised:

.wrapRE -> /{some () regex}

This rule instructs the parser to wrap every regular expression with this one. The actual regex is inserted into the ().
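To make the wrapping concrete, here is a minimal Python sketch of the substitution as described: the regex of a terminal is placed inside the empty () of the wrapper. The function wrap_regex is hypothetical, and the sketch uses Python's re module rather than the regex dialect of the ωBNF engine, so treat it as an analogy only.

```python
import re


def wrap_regex(wrapper: str, regex: str) -> str:
    """Insert regex into the first empty group "()" of wrapper."""
    if "()" not in wrapper:
        raise ValueError("wrapper must contain an empty group '()'")
    return wrapper.replace("()", f"({regex})", 1)


# A wrapper that skips whitespace on both sides of every terminal.
wrapped = wrap_regex(r"\s*()\s*", r"\d+")

# The captured group holds the terminal without the surrounding whitespace.
match = re.fullmatch(wrapped, "  42  ")
token = match.group(1)
```

With a whitespace-skipping wrapper like this one, individual terminals in the grammar do not each need to handle surrounding whitespace themselves.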


Useful recipes

Below is a collection of helpful rules which can be dropped into your grammar.

The ultimate example: ωBNF is self-hosting!

The ωBNF syntax described above is itself implemented in ωBNF. The following grammar is auto-generated from the formal grammar used in the ωBNF parsing engine.

// Non-terminals
grammar -> stmt+;
stmt    -> COMMENT | prod | pragma;
prod    -> IDENT "->" term+ ";";
term    -> (@ ("{" grammar "}")? ):op=">"
         > @:op="|"
         > @+
         > named quant*;
named   -> (IDENT op="=")? atom;
quant   -> op=[?*+]
         | "{" min=INT? "," max=INT? "}"
         | op=/{<:|:>?} opt_leading=","? named opt_trailing=","?;
atom    -> IDENT
         | STR
         | RE
         | macrocall
         | ExtRef=("%%" IDENT)
         | REF
         | "(?=" lookahead=term ")"
         | "(" term ")"
         | "(" ")";

macrocall   -> "%!" name=IDENT "(" term:","? ")";
REF         -> "%" IDENT ("=" default=STR)?;

// Terminals
COMMENT -> /{ //.*$
            | (?s: /\* (?: [^*] | \*+[^*/] ) \*/ )
            };
IDENT   -> /{@|\.?[A-Za-z_]\w*};
INT     -> \d+;
STR     -> /{ " (?: \\. | [^\\"] )* "
            | ' (?: \\. | [^\\'] )* '
            | ` (?: ``  | [^`]   )* `
            };
RE      -> /{
             /{
               (?:
                 \\.
                 | { (?: (?: \d+(?:,\d*)? | ,\d+ ) \} )?
                 | \[ (?: \\. | \[:^?[a-z]+:\] | [^\]] )+ ]
                 | [^\\{\}]
               )*
             \}
           | (?:
                 \[ (?: \\. | \[:^?[a-z]+:\] | [^\]] )+ ]
               | \\[pP](?:[a-z]|\{[a-zA-Z_]+\})
               | \\[a-zA-Z]
               | [.^$]
               )(?: (?:[+*?]|\{\d+,?\d?\}) \?? )?
           };

// Special
pragma  -> import | macrodef {
                import   -> ".import" path=((".."|"."|[a-zA-Z0-9.:]+):,"/") ";"?;
                macrodef -> ".macro" name=IDENT "(" args=IDENT:","? ")" "{" term "}" ";"?;
            };

.wrapRE -> /{\s*()\s*};