neogeny / TatSu

竜 TatSu generates Python parsers from grammars in a variation of EBNF
https://tatsu.readthedocs.io/

Surprising whitespace handling #305

Closed dnicolodi closed 10 months ago

dnicolodi commented 1 year ago

Given this example

import tatsu

parser = tatsu.compile(r'''
statement = 'SELECT' 'FROM' table $ ;
table = name:id ;
id = /[a-z]+/ ;
''')

string = 'SELECT FROM   foo'
value = parser.parse(string, parseinfo=True)

table = value[2]
assert table['name'] == 'foo'

# Print the input with a caret under the position recorded for the table rule.
parseinfo = table['parseinfo']
print(parseinfo.tokenizer.text)
print(f'{parseinfo.pos * " "}^')

I find the whitespace handling surprising. The whitespace between FROM and the table name is not skipped before matching the table rule. The input still parses correctly, but the parseinfo for the table rule is not what I expected: its position appears to include the leading whitespace. I would have expected whitespace to be skipped before attempting to match the table rule.
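To make the two candidate positions concrete, here is a minimal sketch in plain Python that does not call TatSu at all. It assumes the reported position is the offset just after 'FROM' and that the expected one is the first non-whitespace offset after it. The helpers `skip_whitespace` and `caret_line` are hypothetical names introduced only for this illustration.

```python
def skip_whitespace(text: str, pos: int) -> int:
    """Return the first offset at or after pos that is not whitespace."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def caret_line(pos: int) -> str:
    """Build a line with a caret in column pos, as in the example above."""
    return f'{" " * pos}^'


string = 'SELECT FROM   foo'

# Assumption: the position recorded for the table rule is right after 'FROM'.
after_from = string.index('FROM') + len('FROM')

# Position that would result if whitespace were skipped before the rule.
expected = skip_whitespace(string, after_from)

print(string)
print(caret_line(after_from))  # caret under the first space after FROM
print(caret_line(expected))    # caret under the 'f' of foo
```

The first caret lands on whitespace and the second on the start of the table name, which is the difference this issue is about.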

apalala commented 1 year ago

Could you turn these checks into a pytest unit test, @dnicolodi?

dnicolodi commented 1 year ago

Sure, I just was not sure that the behavior I expect is the intended one.

apalala commented 1 year ago

A unit test will precisely state the intended outcome.

dnicolodi commented 1 year ago

See #306.