rifatarefin / treevada

Code for ICSE 2024 paper "Fast Deterministic Black-box Context-free Grammar Inference"
https://dl.acm.org/doi/10.1145/3597503.3639214
MIT License

Crash when non-closed brackets in string #1

Closed: maxeisele closed this issue 1 month ago

maxeisele commented 2 months ago

Hi,

I found that treevada crashes when a string contains an unclosed bracket, e.g. "{". This string is valid input for a JSON parser. Here is the traceback of the error, which indicates that the issue is in the braces_tree method.

 Traceback (most recent call last):
  File "search.py", line 141, in main
    start_grammar: Grammar = build_start_grammar(oracle, guide_examples, bbl_bounds)
  File "/treevada/start.py", line 88, in build_start_grammar
    trees, classes = build_trees(oracle, leaves)
  File "/treevada/start.py", line 338, in build_trees
    best_trees = build_naive_parse_trees(leaves, [], oracle)
  File "/treevada/start.py", line 162, in build_naive_parse_trees
    new_children, brackets = braces_tree(leaf_list, index = 0, root= True)
  File "/treevada/start.py", line 148, in braces_tree
    index+=1
TypeError: 'int' object is not iterable

rifatarefin commented 2 months ago

Thanks for finding this out. Could you provide the seed string causing this crash?

maxeisele commented 2 months ago

As written in the original post, the seed "{" (including the quotation marks) causes the crash.
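For reference, a quick way to confirm that this seed is valid JSON. This sketch uses Python's standard json module purely as an example parser (the oracle in the actual setup may be a different one), and the helper name is_valid_json is made up for illustration.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if the standard-library JSON parser accepts `text`."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# The seed is three characters: a quote, an opening brace, a quote.
seed = '"{"'
print(is_valid_json(seed))  # the quoted brace is a JSON string literal
print(is_valid_json("{"))   # a bare brace is an incomplete object
```

The seed parses as a JSON string containing a single brace, so the brace is never closed even though the input is valid.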

rifatarefin commented 1 month ago

The implementation should not build hierarchies from brackets that are enclosed in quotes. Unfortunately, I missed the corner case where a single bracket character appears within quotes. This is an interesting find. I added the check to the braces_tree method, which should fix the bug: https://github.com/rifatarefin/treevada/blob/dc9d29c106ec3246607d9c3bfb9ffd4abac16120/start.py#L140
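The general idea can be sketched as follows. This is not the code in start.py: the function name nest_brackets, the character-level input, and the handling of backslash escapes are all assumptions made for illustration. Brackets inside double quotes are kept as plain leaves, and a bracket that is never closed is flattened back into its parent instead of raising an error.

```python
from typing import List, Optional, Tuple, Union

Tree = Union[str, List["Tree"]]

# Hypothetical bracket table; the real tool may use a different set.
OPEN_TO_CLOSE = {"{": "}", "[": "]", "(": ")"}


def nest_brackets(text: str) -> List[Tree]:
    """Group characters into nested lists by bracket pairs, ignoring quoted brackets."""
    root: List[Tree] = []
    # Each stack entry is (children list, closing bracket that ends it).
    stack: List[Tuple[List[Tree], Optional[str]]] = [(root, None)]
    in_quotes = False
    escaped = False
    for ch in text:
        children, expected = stack[-1]
        if in_quotes:
            # Inside a string literal every character is a plain leaf.
            children.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_quotes = False
        elif ch == '"':
            in_quotes = True
            children.append(ch)
        elif ch in OPEN_TO_CLOSE:
            group: List[Tree] = [ch]
            children.append(group)
            stack.append((group, OPEN_TO_CLOSE[ch]))
        elif ch == expected:
            children.append(ch)
            stack.pop()
        else:
            # Ordinary characters and stray closing brackets stay flat.
            children.append(ch)
    # Any group still open is always the last child of its parent,
    # so splice its contents back in place of the group.
    while len(stack) > 1:
        group, _ = stack.pop()
        parent = stack[-1][0]
        parent.pop()
        parent.extend(group)
    return root


print(nest_brackets('"{"'))
print(nest_brackets('{"a":[1]}'))
```

With this approach the seed from the report yields three flat leaves, while a well-formed object still produces a nested group per bracket pair.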