picoe / Eto.Parse

Recursive descent LL(k) parser for .NET with Fluent API, BNF, EBNF and Gold Grammars
MIT License
148 stars, 30 forks

Letter parser accepts all Unicode letters as a letter #49

Closed by archfrog 3 years ago

archfrog commented 3 years ago

Not sure if this is a bug, but the code below accepts all Unicode letters as letters, e.g. Hebrew, Cyrillic, etc.:

Eto.Parse.Parsers.LetterTerminal.Test, Char.IsLetter() test

My guess is that it should have been an explicit check for the ranges 'a'..'z' and 'A'..'Z' (ASCII letters).

If not, please close this issue - I just stumbled across the above and wanted to make sure it was intentional.

cwensley commented 3 years ago

Yes, this is correct. Those are actually letters, even though they are not English letters.

If you only want a-z and A-Z, try using the CharRangeTerminal.
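To illustrate the difference being discussed, here is a minimal sketch that uses only the .NET base class library, not Eto.Parse itself. The class and method names (LetterChecks, IsUnicodeLetter, IsAsciiLetter) are made up for this example. The explicit range check only approximates what restricting the grammar to a-z and A-Z would accept; it does not show how CharRangeTerminal is constructed, since the thread does not cover that.

```csharp
using System;

public static class LetterChecks
{
    // Hypothetical helper: the Char.IsLetter() test named in the issue,
    // which is true for any character in a Unicode letter category.
    public static bool IsUnicodeLetter(char c) => char.IsLetter(c);

    // Hypothetical helper: the explicit ASCII-only check the reporter expected,
    // covering just the ranges 'a'..'z' and 'A'..'Z'.
    public static bool IsAsciiLetter(char c) =>
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');

    public static void Main()
    {
        // Latin lower/upper case, Cyrillic, Hebrew, and a digit for contrast.
        foreach (char c in new[] { 'a', 'Z', 'я', 'א', '5' })
        {
            Console.WriteLine(
                $"U+{(int)c:X4}: IsLetter={IsUnicodeLetter(c)}, ASCII letter={IsAsciiLetter(c)}");
        }
    }
}
```

The Cyrillic and Hebrew characters pass the first check but not the second, which is the behavior the issue describes. If the grammar should only accept ASCII letters, the second check is the one to mirror.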

archfrog commented 3 years ago

Thanks for the clarification and the new stuff :-)