marcelog / ex_abnf

Parser for ABNF Grammars
Apache License 2.0

Cannot parse toml grammar #9

Open niahoo opened 8 years ago

niahoo commented 8 years ago

Hello, I would like to parse this file: https://github.com/toml-lang/toml/blob/abnf/toml.abnf

It seems correct, but ex_abnf fails. Do you know why?

Thank you!

** (throw) {:incomplete_parsing, 'expression = (\r\n  ws /\r\n  ws comment /\r\n  ws keyval ws [ comment ] /\r\n  ws table ws [ comment ]\r\n)\r\n\r\n;; Newline\r\n\r\nnewline = (\r\n  %x0A /              ; LF\r\n  %x0D.0A             ; CRLF\r\n)\r\n\r\nnewlines = 1*newline\r\n\r\n;; Whitespace\r\n\r\nws = *(\r\n  %x20 /              ; Space\r\n  %x09                ; Horizontal tab\r\n)\r\n\r\n;; Comment\r\n\r\ncomment-start-symbol = %x23 ; #\r\nnon-eol = %x09 / %x20-10FFFF\r\ncomment = comment-start-symbol *non-eol\r\n\r\n;; Key-Value pairs\r\n\r\nkeyval-sep = ws %x3D ws ; =\r\nkeyval = key keyval-sep val\r\n\r\nkey = unquoted-key / quoted-key\r\nunquoted-key = 1*( ALPHA / DIGIT / %x2D / %x5F ) ; A-Z / a-z / 0-9 / - / _\r\nquoted-key = quotation-mark 1*basic-char quotation-mark ; See Basic Strings\r\n\r\nval = integer / float / string / boolean / date-time / array / inline-table\r\n\r\n;; Table\r\n\r\ntable = std-table / array-table\r\n\r\n;; Standard Table\r\n\r\nstd-table-open  = %x5B ws     ; [ Left square bracket\r\nstd-table-close = ws %x5D     ; ] Right square bracket\r\ntable-key-sep   = ws %x2E ws  ; . 
Period\r\n\r\nstd-table = std-table-open key *( table-key-sep key) std-table-close\r\n\r\n;; Array Table\r\n\r\narray-table-open  = %x5B.5B ws  ; [[ Double left square bracket\r\narray-table-close = ws %x5D.5D  ; ]] Double right quare bracket\r\n\r\narray-table = array-table-open key *( table-key-sep key) array-table-close\r\n\r\n;; Integer\r\n\r\ninteger = [ minus / plus ] int\r\nminus = %x2D                       ; -\r\nplus = %x2B                        ; +\r\ndigit1-9 = %x31-39                 ; 1-9\r\nunderscore = %x5F                  ; _\r\nint = DIGIT / digit1-9 1*( DIGIT / underscore DIGIT )\r\n\r\n;; Float\r\n\r\nfloat = integer ( frac / frac exp / exp )\r\nzero-prefixable-int = DIGIT *( DIGIT / underscore DIGIT )\r\nfrac = decimal-point zero-prefixable-int\r\ndecimal-point = %x2E               ; .\r\nexp = e integer\r\ne = %x65 / %x45                    ; e E\r\n\r\n;; String\r\n\r\nstring = basic-string / ml-basic-string / literal-string / ml-literal-string\r\n\r\n;; Basic String\r\n\r\nbasic-string = quotation-mark *basic-char quotation-mark\r\n\r\nquotation-mark = %x22            ; "\r\n\r\nbasic-char = basic-unescaped / escaped\r\nescaped = escape ( %x22 /          ; "    quotation mark  U+0022\r\n                   %x5C /          ; \\    reverse solidus U+005C\r\n                   %x2F /          ; /    solidus         U+002F\r\n                   %x62 /          ; b    backspace       U+0008\r\n                   %x66 /          ; f    form feed       U+000C\r\n                   %x6E /          ; n    line feed       U+000A\r\n                   %x72 /          ; r    carriage return U+000D\r\n                   %x74 /          ; t    tab             U+0009\r\n                   %x75 4HEXDIG /  ; uXXXX                U+XXXX\r\n                   %x55 8HEXDIG )  ; UXXXXXXXX            U+XXXXXXXX\r\n\r\nbasic-unescaped = %x20-21 / %x23-5B / %x5D-10FFFF\r\n\r\nescape = %x5C                    ; \\\r\n\r\n;; Multiline Basic 
String\r\n\r\nml-basic-string-delim = quotation-mark quotation-mark quotation-mark\r\nml-basic-string = ml-basic-string-delim ml-basic-body ml-basic-string-delim\r\nml-basic-body = *( ml-basic-char / newline / ( escape newline ))\r\n\r\nml-basic-char = ml-basic-unescaped / escaped\r\nml-basic-unescaped = %x20-5B / %x5D-10FFFF\r\n\r\n;; Literal String\r\n\r\nliteral-string = apostraphe *literal-char apostraphe\r\n\r\napostraphe = %x27 ; \' Apostraphe\r\n\r\nliteral-char = %x09 / %x20-26 / %x28-10FFFF\r\n\r\n;; Multiline Literal String\r\n\r\nml-literal-string-delim = apostraphe apostraphe apostraphe\r\nml-literal-string = ml-literal-string-delim ml-literal-body ml-literal-string-delim\r\n\r\nml-literal-body = *( ml-literal-char / newline )\r\nml-literal-char = %x09 / %x20-10FFFF\r\n\r\n;; Boolean\r\n\r\nboolean = true / false\r\ntrue    = %x74.72.75.65     ; true\r\nfalse   = %x66.61.6C.73.65  ; false\r\n\r\n;; Datetime (as defined in RFC 3339)\r\n\r\ndate-fullyear  = 4DIGIT\r\ndate-month     = 2DIGIT  ; 01-12\r\ndate-mday      = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on month/year\r\ntime-hour      = 2DIGIT  ; 00-23\r\ntime-minute    = 2DIGIT  ; 00-59\r\ntime-second    = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap second rules\r\ntime-secfrac   = "." 
1*DIGIT\r\ntime-numoffset = ( "+" / "-" ) time-hour ":" time-minute\r\ntime-offset    = "Z" / time-numoffset\r\n\r\npartial-time   = time-hour ":" time-minute ":" time-second [time-secfrac]\r\nfull-date      = date-fullyear "-" date-month "-" date-mday\r\nfull-time      = partial-time time-offset\r\n\r\ndate-time      = full-date "T" full-time\r\n\r\n;; Array\r\n\r\narray-open  = %x5B ws  ; [\r\narray-close = ws %x5D  ; ]\r\n\r\narray = array-open array-values array-close\r\n\r\narray-values = [ val [ array-sep ] [ ( comment newlines) / newlines ] /\r\n                 val array-sep [ ( comment newlines) / newlines ] array-values ]\r\n\r\narray-sep = ws %x2C ws  ; , Comma\r\n\r\n;; Inline Table\r\n\r\ninline-table-open  = %x7B ws     ; {\r\ninline-table-close = ws %x7D     ; }\r\ninline-table-sep   = ws %x2C ws  ; , Comma\r\n\r\ninline-table = inline-table-open inline-table-keyvals inline-table-close\r\n\r\ninline-table-keyvals = [ inline-table-keyvals-non-empty ]\r\ninline-table-keyvals-non-empty = key keyval-sep val /\r\n                                 key keyval-sep val inline-table-sep inline-table-keyvals-non-empty\r\n\r\n;; Built-in ABNF terms, reproduced here for clarity\r\n\r\n; ALPHA = %x41-5A / %x61-7A ; A-Z / a-z\r\n; DIGIT = %x30-39 ; 0-9\r\n; HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"\r\n'}
    lib/ex_abnf.ex:41: ABNF.load/1
    lib/toml.ex:9: Toml.parse/1
    lib/mix/toml.test.ex:20: Mix.Tasks.Toml.Test.run/1
    (mix) lib/mix/cli.ex:58: Mix.CLI.run_task/2
    (elixir) lib/code.ex:363: Code.require_file/2

marcelog commented 8 years ago

Hello,

It seems there's an issue somewhere with multiline grouping, for example:

expression = (
  ws /
  ws comment /
  ws keyval ws [ comment ] /
  ws table ws [ comment ]
)

vs

expression = (ws / ws comment / ws keyval ws [ comment ] / ws table ws [ comment ])

I'll check it out. In the meantime, put each multiline group on a single line, as in the second example above. I tried that with the other multiline groups as well, and the file parsed successfully.

Cheers!

niahoo commented 8 years ago

Thanks for your fast answer. It works as intended. I figured out that ';' is simply the comment symbol, so that was easy to do (I have no knowledge of ABNF).