igordejanovic / parglare

A pure Python LR/GLR parser - http://www.igordejanovic.net/parglare/
MIT License
136 stars 32 forks source link

Iterating through Forest fails in NodeNonTerm solutions property #147

Closed LVrecar closed 1 year ago

LVrecar commented 1 year ago

Description

Trying to build a grammar, which I know is ambiguous. On a small example, 8 different parses exist (I'm planning on doing disambiguation later to reduce this number, for now I'm just trying to see how the parser works). When I try to print them by iterating through the Forest, it prints 3 and then crashes (see traceback below)

What I Did

My code is set up as follows:

Grammar file, called mini-grammar.pg

term: var | app;
app: term " "? term;

terminals
var: /[a-z]+?/;

Python file

from parglare import GLRParser, Grammar, trees

g = Grammar.from_file("mini-grammar.pg", ignore_case=True)
parser = GLRParser(g, actions={})

result = parser.parse("x y z w")
print(f"FOUND {result.solutions} SOLUTIONS")
for idx, tree in enumerate(result):
    print("######")
    print(tree.to_str())

Output from running the Python file

FOUND 8 SOLUTIONS
######
tree omitted for brevity
######
tree omitted for brevity
######
tree omitted for brevity
######
Traceback (most recent call last):
  File "/home/username/phd/glr-python-test/test.py", line 110, in <module>
    print(tree.to_str())
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 231, in to_str
    return to_str(self)
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 59, in to_str
    return visitor(root, tree_node_iterator, visit)
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 392, in visitor
    next_elem = next(it)
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 30, in _iter
    for i in n.children:
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 271, in __getattr__
    self._children = self._enumerate_children(self.counter)
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 227, in _enumerate_children
    children.append(self.__class__(c, new_counter))
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 262, in __init__
    super().__init__(root, counter)
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 203, in __init__
    solutions = root.possibilities[possibility].solutions
  File "/home/username/.local/lib/python3.10/site-packages/parglare/trees.py", line 141, in solutions
    return reduce(lambda x, y: x*y, (c.solutions for c in self.children))
TypeError: reduce() of empty iterable with no initial value

This seems to be an issue with the reduce() function as it has no default value supplied for the case when self.children is empty

Changing line 141 in trees.py from

return reduce(lambda x, y: x*y, (c.solutions for c in self.children))

to

return reduce(lambda x, y: x*y, (c.solutions for c in self.children), 1)

and thereby providing a default value, seems to solve the problem (I also tried to set the default to be 0 but got different errors)

LVrecar commented 1 year ago

On further thought/investigaion, I'm not sure if the fix I suggest is actually the way to go. For example, removing the optional whitespace from the grammar seems to find fewer parses of the string (and what I think is the correct number of parses), which is actually better. More detailed inspection when trying to find the source of the problem has shown me that the optional whitespace nonterminal (" _opt") has 0 children, but still has the reduce call, which is what triggers the error described above. Do you maybe know why that could be?

I've now learned (after reading the documentation more carefully) that whitespace in input strings gets ignored so my rules don't need to account for optional whitespace between the elements on the right hand side. I'll remove it from my grammar and that will fix my current issue with the code failing, but I'm wondering if there was something wrong with my initial approach that I should avoid in the future in order not to get such errors.

Thanks!

igordejanovic commented 1 year ago

Hi. Yes, whitespaces are ignored by default so you don't need to burden your grammar with manual handling if you don't need some special processing.

Nevertheless, the parser shouldn't fail in to_str so this is most likely a bug. Thanks for reporting. I'll investigate in the following days.

igordejanovic commented 1 year ago

@LVrecar It was indeed a bug with reduce and the fix you suggested with adding 1 as the default is the right fix. The problem was with optional expressions which has pass_none as a default action over EMPTY alternative. In case of reducing EMPTY the list of children for the optional match would be 0 and then the reduce fails as you have observed.

Thanks for the report and the fix!