rindPHI / isla

The ISLa (Input Specification Language) language & solver.
https://isla.readthedocs.org
GNU General Public License v3.0
56 stars 8 forks source link

[BUG] ISLa ignores constraints if solution length is not stated explicitly #73

Closed leonbett closed 1 year ago

leonbett commented 1 year ago

Describe the bug ISLa seems to require an explicit length constraint. Otherwise, it outputs an incorrect solution that disregards the constraint.

To Reproduce Steps to reproduce the behavior:

  1. Run the program below
  2. The output is as follows:
  3. forall_ok: 12340 forall_bug: 5

Expected behavior forall_bug should have the same value as forall_ok, as the length can be inferred from the XPaths.

System/Installation Specs:

Additional context

Bug reproducer.py


from isla.solver import ISLaSolver

IBAN_GRAMMAR = '''
<start> ::= <iban> 
<iban> ::= <digits>
<digits> ::= <digit> | <digit> <digits>
<digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
'''

constr_digit_1 = '(= <iban>.<digits>.<digit> "1")'
constr_digit_2 = '(= <iban>.<digits>.<digits>.<digit> "2")'
constr_digit_3 = '(= <iban>.<digits>.<digits>.<digits>.<digit> "3")'
constr_digit_4 = '(= <iban>.<digits>.<digits>.<digits>.<digits>.<digit> "4")'

length_constr = '(= str.len(iban) 5)'

forall_ok = f'''forall <iban> iban in start:
  (
    {length_constr} and {constr_digit_1} and {constr_digit_2} and {constr_digit_3} and {constr_digit_4}
  )'''

forall_bug = f'''forall <iban> iban in start:
  (
    {constr_digit_1} and {constr_digit_2} and {constr_digit_3} and {constr_digit_4}
  )'''

def main():
    print('forall_ok: ')
    solver = ISLaSolver(
        grammar=IBAN_GRAMMAR,
        formula=forall_ok,
        timeout_seconds=10,
    )
    print(solver.solve())

    print('forall_bug: ')
    solver = ISLaSolver(
        grammar=IBAN_GRAMMAR,
        formula=forall_bug,
        timeout_seconds=10,
    )
    print(solver.solve())

if __name__ == '__main__':
    main()```
rindPHI commented 1 year ago

Hi @leonbett!

Sorry, but this is no bug (the main bug, I didn't have a look at the "bonus" yet) ;)

Have a look at how ISLa "desugars" the second constraint:

from isla.language import unparse_isla, parse_isla

IBAN_GRAMMAR = """
<start> ::= <iban> 
<iban> ::= <digits>
<digits> ::= <digit> | <digit> <digits>
<digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
"""

constr_digit_1 = '(= <iban>.<digits>.<digit> "1")'
constr_digit_2 = '(= <iban>.<digits>.<digits>.<digit> "2")'
constr_digit_3 = '(= <iban>.<digits>.<digits>.<digits>.<digit> "3")'
constr_digit_4 = '(= <iban>.<digits>.<digits>.<digits>.<digits>.<digit> "4")'

print(unparse_isla(parse_isla(constr_digit_2, IBAN_GRAMMAR)))
# Yields:
# (forall <iban> iban="<digit>{<digit> digit}" in start:
#    (= digit "2") and
# forall <iban> iban_0="<digit>{<digit> digit_0}<digits>" in start:
#   (= digit_0 "2"))

"Free" X-path-style expression are universally quantified per default. That means if there's no second <digit> in your input, the formula is "vacuously satisfied." Adding a length constraint enforcing the required number of digits solves the "problem" as expected.

The semantics of X-Path-style expressions is explained in the ISLa language specification.

Closing this. Please file a separate report if you consider the "bonus bug" important.

Cheers, Dominic

leonbett commented 1 year ago

Thanks for your quick reply. This seems a bit counter-intuitive to me, but fair enough. As for the bonus bug, up to you; I stumbled upon it, and I think others might as well. But since there is a workaround, it is not critical.

rindPHI commented 1 year ago

I fully understand that this seems counter-intuitive to you; I fear that frequently happens in logic systems (e.g., implication...). Yet, if we wanted to enforce the existence of some path in an XPath-style expression, there would be more hidden magic in the underlying translation, and expressing something "modulo existence" (as it's the default now) would become impossible in the language. Effectively, we would thus restrict its expressiveness.

Suppose you want to have everything as transparent as possible. In that case, my advice is to use the ISLa core syntax (i.e., no XPath-style expressions, not "unbound nonterminals," explicit "in" expressions, etc.). This blows up your constraint, but there will be no potentially counter-intuitive translation under the hood.

rindPHI commented 1 year ago

Hmm strange things happen in the second bug. I think I'd rather not address it for now, these things should disappear after I successfully finished my plan to re-build the parser with "language levels." The bug might be related to your using of S-expressions and XPath-Style syntax together.