Z3Prover / z3

The Z3 Theorem Prover
Other
10.36k stars 1.48k forks source link

Regex performance cliff when using `InRe` #5648

Open ahelwer opened 3 years ago

ahelwer commented 3 years ago

With Z3 4.8.12. The following python code reproduces the issue:

from z3 import *

a = Re('a')
b = Re('b')
r1 = Concat(a, Star(Concat(b, a))) # a(ba)*
r2 = Concat(Star(Concat(a, b)), a) # (ab)*a

s = Solver()
s.add(r1 != r2)
print(s.check()) # This is nearly instantaneous

x = String('x')
xr1 = InRe(x, r1)
xr2 = InRe(x, r2)
s = Solver()
s.add(xr1 != xr2)
print(s.check()) # This times out

Use case is analyzing a set of ACL constraints, some of which have regexes. Not sure cases like this will actually crop up in the sort of constraints the system has to actually analyze, but thought I'd document it. Feel free to close if this is not a bug.

NikolajBjorner commented 3 years ago

It is a nice example of a limitation of derivatives. They fail to leverage equivalence. I am not sure what the general approach should be, but @veanes and @cdstanford might like to check it out. One strategy is to simply improve the simplifier preprocessor. But this might be a point fix. Another strategy is to infer a consequence of a state such as x in R1, x not in R2 to be that R1 != R2, which uses a different unfolding.

cdstanford commented 3 years ago

The workaround here is to specify all constraints on one string using the same regex rather than several regexes:

from z3 import *

x = String('x')
a = Re('a')
b = Re('b')
r1 = Concat(a, Star(Concat(b, a))) # a(ba)*
r2 = Concat(Star(Concat(a, b)), a) # (ab)*a
diff = Union(
    Intersect(r1, Complement(r2)),
    Intersect(r2, Complement(r1)),
)

s = Solver()
s.add(InRe(x, diff))
print(s.check())

I remember we had a preprocessing step at some point coded that combines all constraints on x into one, but it was turned off (due to performance issues?)

IMO this is not a matter of leveraging equivalence, equivalence between regexes is not needed here -- what we need is to notice that there are two constraints on x and simply combine them as a Boolean combination.

ahelwer commented 3 years ago

Don't know whether it matters, but I'm actually using an array of type string -> string to model a set of key/value constraints. So my code really looks more like:

from z3 import *

a = Re('a')
b = Re('b')
r1 = Concat(a, Star(Concat(b, a))) # a(ba)*
r2 = Concat(Star(Concat(a, b)), a) # (ab)*a

s = Solver()
s.add(r1 != r2)
print(s.check()) # This is nearly instantaneous

labels = Array('labels', StringSort(), StringSort())
lr1 = InRe(Select(labels, String('key')), r1)
lr2 = InRe(Select(labels, String('key')), r2)
s = Solver()
s.add(lr1 != lr2)
print(s.check()) # This times out

and there are multiple constraints over labels for different keys; the idea is I can then check whether two sets of constraints are equivalent, or query whether a given concrete set of values satisfies a set of constraints. So I guess ideally your simplifier would identify that multiple Select(labels, String('key')) with the same key refer to the same variable.

I also considered modeling this as a set of 2-tuples but the Array API is significantly easier to use. If you have alternative suggestions for how to model this I am also open to them. We have a set of constraints analogous to:

rule1:
  allow:
    'env' : ['staging', 'prod']
    'os' : ['windows', 'linux']
    'key' : 'a(ba)*'
  deny:
    'env' : ['prod']
    'os' : 'linux'
rule2:
  allow:
    'env' : 'staging'
    'os' : 'windows'
    'key' : '(ab)*a'

where deny constraints take precedence over allow constraints. These two rules should be equivalent.

NikolajBjorner commented 3 years ago

It matters to the extent that the pre-processing approach is going to be very partial. It can't handle InRe(x, R1) and InRe(y, R2) and (x = y or .something_equivalent_to_false). It can only deal with InRe(x, R1) and InRe(x, R2) or similar.

NikolajBjorner commented 3 years ago

Maybe you don't need arrays, but just uninterpreted functions or even constants. If you don't use "store' then arrays are not really of any use and you waste overhead of extensionality.

ahelwer commented 3 years ago

@NikolajBjorner good to know that rule of thumb about Z3 arrays; I've used them in a couple other hobby projects, never used store though. I will look into simplifying my constants. It's possible I could structure my expressions such that I compare regexes directly, bypassing InRe. It would reduce the generality of the expressions I create (so I have to create an entirely new expression for comparing two rules vs. checking whether a rule evaluates to true for some concrete value of a constant) but might be an acceptable workaround. Will experiment today.

ahelwer commented 3 years ago

Update: looks like uninterpreted functions are a nice drop-in replacement for arrays when you don't use Store:

from z3 import *

a = Re('a')
b = Re('b')
r1 = Concat(a, Star(Concat(b, a))) # a(ba)*
r2 = Concat(Star(Concat(a, b)), a) # (ab)*a

s = Solver()
s.add(Distinct(r1, r2))
print(s.check()) # This is nearly instantaneous

map = Function('map', StringSort(), StringSort())
lr1 = InRe(map(String('key')), r1)
lr2 = InRe(map(String('key')), r2)
s = Solver()
s.add(Distinct(lr1, lr2))
print(s.check()) # This times out

Of course it still times out solving it but I guess the simplification will be easier. Also kudos to this stackoverflow answer for introducing me to Distinct.

Swalkyn commented 1 year ago

I am working on extended regexes modeling and am taking an approach similar to Loring et al., 2018. It relies on equivalence constraints on string variables, and cannot use the workaround described above by @cdstanford. At the moment, z3 times out too frequently for this approach to be viable.

I was wondering if there was a plan to tackle this issue. I am unable to come up myself with a solution, but, if there is one, could give a try at an implementation.

I can also provide multiple examples of inputs with string equivalence constraints on which z3 times out, if that helps.

NikolajBjorner commented 1 year ago

The solver is only trained on what we have access to, and even then, it isn't trained on many of the benchmark sets that are available on SMTLIB. The reason for the second part is that these benchmarks are not all necessarily based on user needs. Note that the z3str3 solver is better trained on the standard benchmark sets but has still several stability bugs open. Driving features by user needs is how I would set priorities, so sharing such benchmarks would be a very good starting point.

cdstanford commented 1 year ago

@Swalkyn: I would also be interested, do you have a minimal example or benchmark?

Swalkyn commented 1 year ago

Apologies for the long time it took me to answer. Here are 2 minimal examples:

(declare-fun s2 () String)
(declare-fun s3 () String)
(assert
  (and (str.in_re s2 (re.++ (re.++ re.all (str.to_re "a")) re.all))
       (str.in_re s3 (re.* (str.to_re "b")))
       (= s3 s2)))
(check-sat)

This models the PCRE/Java regex (?=.*a)b*.

(declare-fun s6 () String)
(declare-fun s7 () String)
(declare-fun s3 () String)
(declare-fun s2 () String)
(assert 
  (and (str.in_re s2 (re.comp (re.++ (str.to_re "x") re.all)))
       (str.in_re s3 (re.* (str.to_re "x")))
       (str.in_re s7 re.allchar)
       (str.in_re s6 (re.++ re.all (re.++ re.allchar (str.to_re "x"))))
       (= s7 s2)
       (= (str.++ s3 s7) s6)))
(check-sat)

This corresponds to the PCRE/Java regex x*(?!x).(?<=.x).

Both models are unsat. On both examples, Z3 returns unkown with a 10s timeout. Clearly, example 1 can be modeled with regex constraints only. It is also possible for example 2, although it is not as easy. In more involved cases, modeling using regex constraints only would require a large number of constraints, or might be impossible.

These constraints were generated using java-smt.

cdstanford commented 1 year ago

@Swalkyn Thanks. I verified the behavior you mention. For reference, here's the workaround on the first example (which as you clearly stated doesn't apply to your use case), but which does return unsat quickly and is equivalent, which is a good sanity check.

(declare-fun s2 () String)
(assert
  (str.in_re s2 (re.inter
    (re.++ (re.++ re.all (str.to_re "a")) re.all)
    (re.* (str.to_re "b")))))
(check-sat)

I don't know how to deal with your second example or, in general, examples like it, but it sounds like a good open question. Would be curious to see if any other current solvers can handle these.

NikolajBjorner commented 1 year ago

See https://arxiv.org/pdf/2212.02317.pdf

Swalkyn commented 1 year ago

@cdstanford, in case you're interested. I investigated for a bit and tested two other solvers on the above examples:

cdstanford commented 1 year ago

Surprised that Noodler fails! This seems to match their intended use case.

NikolajBjorner commented 1 year ago

Similar reaction: it should be in scope, though Noodler is inherently incomplete for unsat. Sometimes digging deeper helps clarify guesswork: fx It could be something trivial, such as unsupported regex operations.

NikolajBjorner commented 1 year ago

but also fails on slightly more complex inputs.

What are these? Do they have characteristics that go beyond the basic examples? The basic examples are also in principle solvable using Ostrich.