vnmakarov / yaep

Yet Another Earley Parser
Other
135 stars 13 forks source link

Trying to output multiple parse trees with equal costs doesn't work #7

Open szakharchenko opened 7 years ago

szakharchenko commented 7 years ago

The following mode has problems (ignoring the glaring obvious issues with memory cleanup):

yaep_set_one_parse_flag(g,0); yaep_set_cost_flag(g,1);

The attached code yields:

Translation: 0: ALTERNATIVE: node=1, next=2 1: ABSTRACT: a2 ( 3 4 ) 3: TERMINAL: code=97, repr='a' 4: TERMINAL: code=98, repr='b' 2:

while it should yield something like

Translation: 0: ALTERNATIVE: node=1, next=2 1: ABSTRACT: a1 ( 3 4 ) 3: TERMINAL: code=97, repr='a' 4: TERMINAL: code=98, repr='b' 2: ALTERNATIVE: node=5, next=nil 5: ABSTRACT: a2 ( 3 4 )

ambig2_test.zip

szakharchenko commented 7 years ago

NB: The "2:" in the first quote above is not a copy/paste error: there's really no more output on that line. If one of the nodes is a clear winner (has lower cost), then only that node is output (correctly), eg.:

Translation: 0: ABSTRACT: a2 ( 1 2 ) 1: TERMINAL: code=97, repr='a' 2: TERMINAL: code=98, repr='b'

szakharchenko commented 7 years ago

Fixed by making prune_to_minimal saner.

TheCount commented 5 years ago

Fixed by making prune_to_minimal saner.

What was your fix? Did you ever commit it?

I found two problems with find_minimal_translation() (which calls prune_to_minimal()). One is an immediate reference to freed memory in the next statement (test 47 exposes this). This is a simple fix: the local variables entry and entry_1 can simply be removed as they are never read; incidentally this also kills the illegal reference. Let's not get confused by this in the following discussion.

The second problem is exposed by altering test 46 to allow a non-null parse_free function. In this case, prune_to_minimal() returns the tree passed to find_minimal_translation() unaltered (except for the cost entries, of course), which is the correct behaviour because both alternatives contained in the tree have the same cost. Yet, traverse_pruned_translation(), called right after prune_to_minimal(), somehow fails to add one of the alternatives to the list of nodes not to be freed. This results in a valid node to be freed and subsequent errors, e. g.:

==15987==    at 0x11510A: print_node (yaep.c:4971)
==15987==    by 0x1156A3: print_node (yaep.c:5067)
==15987==    by 0x11541D: print_node (yaep.c:5023)
==15987==    by 0x11574A: print_parse (yaep.c:5085)
==15987==    by 0x117784: make_parse (yaep.c:5828)
==15987==    by 0x117AFB: yaep_parse (yaep.c:5955)
==15987==    by 0x109C3B: main (test46.c:82)
==15987==  Address 0x592eb80 is 0 bytes inside a block of size 32 free'd
==15987==    at 0x4C30D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15987==    by 0x10997D: test_parse_free (common.h:18)
==15987==    by 0x116028: find_minimal_translation (yaep.c:5329)
==15987==    by 0x117732: make_parse (yaep.c:5823)
==15987==    by 0x117AFB: yaep_parse (yaep.c:5955)
==15987==    by 0x109C3B: main (test46.c:82)
==15987==  Block was alloc'd at
==15987==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15987==    by 0x109935: test_parse_alloc (common.h:11)
==15987==    by 0x1157B1: place_translation (yaep.c:5121)
==15987==    by 0x11746A: make_parse (yaep.c:5760)
==15987==    by 0x117AFB: yaep_parse (yaep.c:5955)
==15987==    by 0x109C3B: main (test46.c:82)`

My guess is that fixing these memory handling problems will also fix the output issue.

TheCount commented 5 years ago

The bug is simply that when traverse_pruned_translation() goes through a linked list of alt nodes, it adds all the alternatives, but not the alt nodes themselves, except for the first one. I'll make a fix and publish it as part of a pull request for #17.

vnmakarov commented 5 years ago

The following mode has problems (ignoring the glaring obvious issues with memory cleanup):

yaep_set_one_parse_flag(g,0); yaep_set_cost_flag(g,1);

The attached code yields:

Translation:
0: ALTERNATIVE: node=1, next=2
1: ABSTRACT: a2 ( 3 4 )
3: TERMINAL: code=97, repr='a'
4: TERMINAL: code=98, repr='b'
2:

while it should yield something like

Translation:
0: ALTERNATIVE: node=1, next=2
1: ABSTRACT: a1 ( 3 4 )
3: TERMINAL: code=97, repr='a'
4: TERMINAL: code=98, repr='b'
2: ALTERNATIVE: node=5, next=nil
5: ABSTRACT: a2 ( 3 4 )

ambig2_test.zip

Thank you for the test and opening the issue. I've fixed the bug. Now there are no vlagrind complaints on your test and YAEP generates the expected translation.

vnmakarov commented 5 years ago

The bug is simply that when traverse_pruned_translation() goes through a linked list of alt nodes, it adds all the alternatives, but not the alt nodes themselves, except for the first one. I'll make a fix and publish it as part of a pull request for #17.

Thanks for working on the issue. I've committed another version (w/o recursion because it might be very deep for some cases) of the patch into the repository.

TheCount commented 5 years ago

Thanks for working on the issue. I've committed another version (w/o recursion because it might be very deep for some cases) of the patch into the repository.

It's a tail call, so it shouldn't actually recurse, but your version adfd3edf7f6f21182ccd0c567f5054cd9f8c3e91 is of course also fine.