erikrose / parsimonious

The fastest pure-Python PEG parser I can muster
MIT License

INI parsing example does not work for files starting with empty lines #235

Open nullst opened 1 year ago

nullst commented 1 year ago

The INI grammar as given in the current version of README.rst has the following rule:

expr        = (entry / emptyline)*

However, the IniVisitor class slightly below does not handle empty lines. This is masked by the fact that the entry rule allows an arbitrary amount of whitespace at the end, so the example INI file is parsed as a sequence of entries, and the emptyline rule is never used (even if you add a bunch of empty lines between sections). But the code breaks if the INI file starts with an empty line.

This is quite minor, but the fact that the emptyline rule is unused really confused me when I tried to write my first grammar. I started with a similar expr rule, my entry rule didn't allow extra newlines at the end (why should it?), and then I couldn't work out which part of the README's example code was responsible for omitting the empty lines from the visited_children list.

A minimal example: just modify the data variable in the README so that it starts with a few newlines. Visiting the parsed tree leads to an exception:

Traceback (most recent call last):
  File "/home/me/.local/lib/python3.9/site-packages/parsimonious/nodes.py", line 213, in visit
    return method(node, [self.visit(n) for n in node])
  File "/home/me/fun/programming/test.py", line 199, in visit_expr
    output.update(child[0])
ValueError: dictionary update sequence element #0 has length 0; 2 is required

optixx commented 11 months ago

I stumbled upon the same problem. My fix is to change the grammar rule to `section = ws lpar word rpar ws` and to change `visit_section()` to consume the whitespace before `lpar`: `_, _, section, *_ = visited_children`