patrickhuber / Pliant

MIT License

completeness of Ebnf support according to ISO 14977 #66

Open ArsenShnurkov opened 7 years ago

ArsenShnurkov commented 7 years ago

I need extensions to specify a Unicode character by its code (just "\\?" is not enough for me).

I mean §4.19 of ISO 14977 ( https://www.cl.cam.ac.uk/~mgk25/iso-14977.pdf ): the special-sequence-symbol rule and a corresponding extensibility mechanism in the parser.

I also want the grammar (the EBNF grammar from the standard) to be parsed from its text definition, not only defined in code.

patrickhuber commented 7 years ago

The EbnfParser class currently supports a loose EBNF syntax, though I agree it's not standards compliant. Can you give feedback on the EbnfGrammarGenerator and EbnfParser?

For the Unicode character, are you looking for the syntax described here: https://msdn.microsoft.com/en-us/library/aa664669(v=vs.71).aspx ?
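
To make the question concrete, here is a minimal sketch of what decoding C#-style escapes inside a terminal string could look like. It is written in Python only so that it is self-contained; it is not Pliant code, and the name `decode_escapes` is hypothetical. It assumes the three forms `\uXXXX`, `\UXXXXXXXX`, and `\x` followed by one to four hex digits, and it does not handle an escaped backslash.

```python
import re

# Assumed escape forms (modelled on C#, not taken from Pliant):
#   \uXXXX      exactly 4 hex digits
#   \UXXXXXXXX  exactly 8 hex digits
#   \xX..XXXX   1 to 4 hex digits
_ESCAPE = re.compile(
    r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})|\\x([0-9A-Fa-f]{1,4})"
)


def decode_escapes(terminal: str) -> str:
    """Replace each escape in a terminal string with the character it names."""

    def replace(match: re.Match) -> str:
        # Exactly one of the three groups is set for any given match.
        digits = match.group(1) or match.group(2) or match.group(3)
        return chr(int(digits, 16))

    return _ESCAPE.sub(replace, terminal)
```

With this, both `decode_escapes(r"\u0020")` and `decode_escapes(r"\x0020")` yield a single space character, so a terminal written either way would match a literal space in the input.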

bilsaboob commented 7 years ago

Hey, ArsenShnurkov - is it only Unicode characters you need supported, or are there other escape symbols missing that would make it usable for you? I'm gonna look into this one, but I'm not sure I want to dive deep into that ISO document ;)

ArsenShnurkov commented 7 years ago

1) Comments should be written as (* ... *), not /* ... */.

2) There are two variants of EBNF: with commas and without. Your parser supports only the second one. In the standard this is paragraph 4.5: "Single-definition: A single-definition consists of an ordered list of one or more syntactic-terms separated from each other by a concatenate-symbol." (The concatenate-symbol is the comma.)

3) I was unable to specify a symbol by its code. I tried:

space1 = "\u0020" ;
space2 = "\x0020" ;
space3 = /\u0020/ ;
space4 ~ /\u0020/ ;
space5 = \u0020 ;

The first four just give no match; the last one gives a parsing error for the grammar.

4) The syntax is processed strangely, see https://github.com/patrickhuber/Pliant/issues/70

Grammar:

file = "1" { "2" } "1" ;
File:
string input = "12221";
Result:
Recognized: False, Accepted: False
Error at position 1
({2}, 0, 0)
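
For reference, here is a minimal hand-written recognizer for the grammar above, showing the result I would expect. In ISO 14977, { ... } means zero or more repetitions, so "12221" should be accepted. This is only a Python sketch of the expected behaviour, not Pliant code; the function name `recognize_file` is hypothetical.

```python
def recognize_file(text: str) -> bool:
    """Recognize: file = "1" { "2" } "1" ;"""
    pos = 0

    # First terminal "1".
    if not text.startswith("1", pos):
        return False
    pos += 1

    # Repeated sequence { "2" }: zero or more occurrences.
    while text.startswith("2", pos):
        pos += 1

    # Closing terminal "1".
    if not text.startswith("1", pos):
        return False
    pos += 1

    # Accept only if the whole input was consumed.
    return pos == len(text)
```

Here `recognize_file("12221")` returns True, and so does `recognize_file("11")` (zero repetitions), while `recognize_file("122")` returns False because the closing "1" is missing.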

patrickhuber commented 3 years ago

I'm going to add tests for your cases above to make sure the new language grammar conforms to the specification before closing this issue.