Inconsistency in SAT results when converting predicate to SMT-LIB and parsing it back using .NET API

GennadyGS commented 3 years ago

Z3 4.8.9 works fantastically for me but unfortunately I have an issue migrating to 4.8.10. Some predicates solvable with 4.8.9 hang forever with 4.8.10. However I cannot properly report the issue because it is not reproducible in SMT-LIB representation.

Steps to reproduce:

I am trying to solve some complicated predicate z3Predicate created with .NET API, which was solvable with 4.8.9, but it stucks for infinite time with 4.8.10:

var context = new Context();
var solver = context.MkSolver();
solver.Assert(z3Predicate); // assert some complicated predicate 
var status = solver.Check();

In order to reproduce and investigate the issue I tried to get SMT-LIB representation this way:

...
solver.Assert(z3Predicate); 
var str = solver.ToString();

Result was like the following (irrelevant parts are skipped):

(declare-datatypes ((Special_String_NA_Null 0)) (((Value (value String)) (Null) (NA))))
(declare-datatypes ((Special_String_Exempt_NA_Null 0)) (((Value (value String)) (Null) (NA) (Exempt))))
(declare-fun Fields.ZipCode () Special_String_Exempt_NA_Null)
(declare-fun Fields.State () Special_String_NA_Null)
(declare-fun Fields.City () Special_String_NA_Null)
...
(assert (let ((a!1 (not (or (= Fields.City (Value "")))))
      (a!2 (not (or (= Fields.State (Value "")))))
      (a!3 (not (or (= Fields.ZipCode (Value "")))))
...

Then I tried to reproduce the issue using SMT-LIB representation above in following way:

var context = new Context();
var solver = context.MkSolver();
solver.Assert(context.ParseSMTLIB2String("(declare-datatypes ((Special_String_NA_Null 0))...")); // assert the same predicate in SMT-LIB form
var status = solver.Check();

However I have encountered two problems here:

Firstly, I got the following error:

error "line 11 column 48: Sorts Special_String_Exempt_NA_Null and Special_String_NA_Null are incompatible"

As I discovered, the reason was in conflicting names of constructors Value of types Special_String_NA_Null and Special_String_Exempt_NA_Null. I have fixed this by manually replacing string Value to e.g. Value1 wherever required.

Secondly, the most crucial, after manual fix, predicate was successfully (and quickly) solved. Thus it is satisfiable in SMT-LIB form (when predicate was created using context.ParseSMTLIB2String()), while it hangs forever in original version (when predicate was created with .NET API).

As result, I cannot provide clear steps to reproduce the issue with infinite looping because conversion of predicate to SMT-LIB and parsing it back changes the solution result.

Expected behavior

I expect that converting Z3 predicate to string (using solver.ToString()) should return correct SMT-LIB representation and following solver.Assert(context.ParseSMTLIB2String(...)) should convert it back to equivalent Z3 predicate with the same satisfiability result.

P.S. This is not bug report of issue with 4.8.10 but rather an question how to properly report such issues. Consider this rather as question, because I cannot rely on some documentation to confirm that behavior I expect is really expected by design :). If current behavior is according to design please suggest how else I can provide SMT-LIB representation equivalent to Z3 predicate and reproduce the original issue. If required I can try to come up with more concrete steps of creating the predicate using the .NET API, however it is not straightforward. I hope it does not sound too confusing :) Thanks in advance.

NikolajBjorner commented 3 years ago

The pretty printer does not support renaming conflicting names for algebraic datatype constructors/accessors. As you observed, the result cannot be parsed even though you were able to use the programmatic API to create ADTs with conflicting names. There isn't a plan to support renaming in the pretty printer.
The non-terminating behavior may be dependent on search order especially if you include quantifiers. The reproducible way to re-create this behavior is by using the low level logger. Call the function Log.Open("z3.log"); before any other Z3 interaction. Then the same non-termination can be observed by running on "z3.exe z3.log -v:10 -st". Verbose mode may give some clue how search proceeds. Statistics may indicate heavy hitters. If you use quantifiers, there are axiom profiling tools from ETH (use the configuration parameter "trace=true" and it creates a log file of a different format that can be consumed by the AxiomProfiler).

GennadyGS commented 3 years ago

z3Log_2021_2021-02-14_12-11-46.zip Hello Nikolaj, Thanks a lot for quick answer. Your information is very helpful. First of all, I do not use quantifiers at this moment at all.

I have gathered diagnostic Z3 log (in attach) and found out infinitely repeating fragment at the end:

ax (not (<= (str.len (value Fields.City)) 7)) (<= (str.len (seq.tail (value Fields.City) 0)) 6) ax (not (>= (str.len (value Fields.City)) 1)) (let ((a!1 (str.++ (seq.unit (seq.nth_i (value Fields.City) 0)) (seq.tail (value Fields.City) 0)))) (seq.eq (value Fields.City) a!1))

It is strange, because my original formula does not contain any string operations on variable "Fields.City" (see attached pretty print). I investigated the issue further and discovered that I reused single Context instance throughout the application instead of creating new one for each usage. I came to this some time ago because of sporadic issue with reference counting on disposing Context on one of previous versions (not sure if it is still reproducible) like that:

Microsoft.Z3.Z3Exception: invalid dec_ref command at Microsoft.Z3.Native.Z3_dec_ref(IntPtr a0, IntPtr a1) at Microsoft.Z3.IDecRefQueue.Clear(Context ctx) at Microsoft.Z3.Context.Dispose()

Now recreating the new instances of Context solves my current problem, however (if it is interesting for you) I have suspicion that something has changed in 4.8.10 in this regard, since the same worked fine on 4.8.9. Particularly I suspect that it may be connected with changes around regular expressions, because of such observations:

It looks like namely predicates containing regular expressions stopped working on 4.8.10 if context is reused.
Function Z3_mk_re_concat now fails for arrays with single element
Expression with Z3_mk_re_concat is probably splitted into binary operations and pretty printed in slightly different way in 4.8.10:

(let ((a!1 (re.++ (re.++ (re.++ (re.range "0" "9") (re.range "0" "9")) (re.range "0" "9")) (re.range "0" "9")))) (let ((a!2 (re.++ (re.++ (re.++ a!1 (re.range "0" "9")) (str.to_re "-")) (re.range "0" "9")))) (re.++ (re.++ (re.++ a!2 (re.range "0" "9")) (re.range "0" "9")) (re.range "0" "9"))))

instead of as it was in 4.8.9:

(re.++ (re.range "0" "9") (re.range "0" "9") (re.range "0" "9") (re.range "0" "9") (re.range "0" "9") (str.to_re "-") (re.range "0" "9") (re.range "0" "9") (re.range "0" "9") (re.range "0" "9"))

If you consider my observations as an issue I can create separate one or move on in current one, otherwise you can close it. I have some other issues with 4.8.10 reproducible even with fresh Context, but I will investigate them further and create new issue if required.

PrettyPrint.txt z3DiagnosicLog.zip z3Log_2021_2021-02-14_12-11-46.zip

NikolajBjorner commented 3 years ago

Thanks, the log happens to be replayable also with the latest tip of the master branch. This is useful as the pretty-printed version, even after I manually disambiguated names to make it parseable, is very easy. Also note that the log contains three queries. Note that there is a way to have Z3 produce SMTLIB scripts as well by setting solver.smtlib2_log=. This would not ensure reproducibility but make the logs more portable.

@veanes and @cdstanford - note that there are many regular expression operations being used and that the behavior of the new solver for regular expressions could very well be playing a role.

The preliminary diagnostics is that behavior of Z3 version 4.8.10 and 11 is brittle and can handle the query very quickly depending on the random seed. So setting smt.random_seed=NN to different values of number NN shows variance of solver time. Another area that I typically look for is change of behavior whether using smt.arith.solver=2 or smt.arith.solver=6. The default solver changed since 4.8.9 or 4.8.10 and while it is often much better there are many use cases where the legacy arithmetic solver still outshines the newer solver. However, I did not see any performance difference between arithmetic solvers.

NikolajBjorner commented 3 years ago

Here is the sample output with disambiguated fields

(declare-datatypes ((Special_String_NA_Null 0)) (((Value (value String)) (Null) (NA))))
(declare-datatypes ((Special_String_Exempt_NA_Null 0)) (((Value2 (value2 String)) (Null2) (NA2) (Exempt))))
(declare-fun Fields.ZipCode () Special_String_Exempt_NA_Null)
(declare-fun Fields.State () Special_String_NA_Null)
(declare-fun Functions.IsStateCode (String) Bool)
(declare-fun Fields.City () Special_String_NA_Null)
(declare-fun Fields.StreetAddress () Special_String_Exempt_NA_Null)
(assert (let ((a!1 (not (or (= Fields.City (Value "")))))
      (a!2 (not (or (= Fields.State (Value "")))))
      (a!3 (not (or (= Fields.ZipCode (Value2 "")))))
      (a!6 (not (or ((_ is (Null2 () Special_String_Exempt_NA_Null))
                      Fields.StreetAddress)
                    (= Fields.StreetAddress (Value2 "")))))
      (a!7 (or ((_ is (Null () Special_String_NA_Null)) Fields.City)
               (or (= Fields.City (Value "")))))
      (a!8 (or (and ((_ is (Value (String) Special_String_NA_Null))

                    Fields.State)
                    (Functions.IsStateCode (value Fields.State)))
               ((_ is (NA () Special_String_NA_Null)) Fields.State)))
      (a!9 (or ((_ is (Null () Special_String_NA_Null)) Fields.State)
               (or (= Fields.State (Value "")))))
      (a!10 (re.++ (re.++ ((_ re.loop 5 5) (re.range "0" "9")) (str.to_re "-"))
                   ((_ re.loop 4 4) (re.range "0" "9"))))
      (a!11 (and ((_ is (Value2 (String) Special_String_Exempt_NA_Null))
                   Fields.ZipCode)
                 (str.in_re (value2 Fields.ZipCode)
                            ((_ re.loop 5 5) (re.range "0" "9")))))
      (a!13 (or ((_ is (Null2 () Special_String_Exempt_NA_Null)) Fields.ZipCode)
                (or (= Fields.ZipCode (Value2 ""))))))
(let ((a!4 (not (and ((_ is (NA2 () Special_String_Exempt_NA_Null))
                       Fields.StreetAddress)
                     (and ((_ is (Value (String) Special_String_NA_Null))

           Fields.City)
                          a!1)
                     ((_ is (Value (String) Special_String_NA_Null))
                       Fields.State)
                     a!2
                     (and ((_ is (Value2 (String) Special_String_Exempt_NA_Null))
                            Fields.ZipCode)
                          a!3))))
      (a!5 (not (and ((_ is (NA2 () Special_String_Exempt_NA_Null))
                       Fields.StreetAddress)
                     (and ((_ is (Value (String) Special_String_NA_Null))
                            Fields.City)
                          a!1)
                     (and ((_ is (Value2 (String) Special_String_Exempt_NA_Null))
                            Fields.ZipCode)
                          a!3))))
      (a!12 (or (and ((_ is (Value2 (String) Special_String_Exempt_NA_Null))
                       Fields.ZipCode)
                     (str.in_re (value2 Fields.ZipCode) a!10))
                a!11
                ((_ is (NA2 () Special_String_Exempt_NA_Null)) Fields.ZipCode)
                ((_ is (Exempt () Special_String_Exempt_NA_Null))
                  Fields.ZipCode))))
  (or (and a!4
           (not a!5)
           a!6
           (not a!7)
           (and a!8 (not a!9))
           (and a!12 (not a!13)))
      (and (not a!4)
           a!5
           a!6
           (not a!7)
           (and a!8 (not a!9))
           (and a!12 (not a!13)))))))
(check-sat)

NikolajBjorner commented 3 years ago

I found one source of divergence. It is unrelated to the updates to regular expressions. The root cause remains obscure, but a patch is possible even though it likely only kicks the can down the road. I will be pushing this update soon and you can try out the nightly build or build your own. The other issues may remain and understanding these in detail could be good too.

GennadyGS commented 3 years ago

Thank you Nikolaj,

behavior of Z3 version 4.8.10 and 11 is brittle

Does it mean that it is temporary issue which is going to be resolved in future more stable versions? 4.8.9 works great (except of rare spontaneous fatal Z3 errors which I cure by rerun) and I am primarily interested to upgrade in order to be up to date and not to stuck with one version forever. So I am interested not in legacy but in future-proof (new and actively developed) logic to support my scenarios. I am ready to post issues for any regression and I much appreciate that you always pay attention to my reports.

Log really contains three queries, but problem is not reproducible if to run only third one. It is only reproducible if run three queries in this order and reuse the same instance of Context. Recreating Context for each query also eliminates non-terminating issue. This, by the way, explains why originally I could not reproduce issue using pretty-print SMTLIB output. Please notice that reusing the Context was safe in 4.8.9 and all the issues with Context reuse in 4.8.10 are suspiciously connected with regular expressions and there are a lot of other queries working fine. Could it be that new regular expression solver makes Context stateful and introduces side effects on subsequent queries? Again, recreating the Context solves my problem for now, in case I will not encounter other issues like previous issue with reference counting (described above).

NikolajBjorner commented 3 years ago

The context is that Z3 is brittle for this particular example. It is at worst an isolated regression. At best it is a fluctuation that just surfaces arbitrarily depending on random search order. You should work with the latest versions of Z3. Previous versions of Z3 are not worth working on top of: I cannot backport fixes and for the most part the aim is to avoid regressions.

I have pushed a patch that addresses the non-termination I observed in your example. For a disclaimer I have not determined why precisely it diverged, only fixed the symptom. The string theory solver requires some other major overhauls anyway, but this requires much longer term development.

GennadyGS commented 3 years ago

Ok, thanks, clear

GennadyGS commented 3 years ago

Hello, Nikolaj. Your fix in nightly version works for the example I have provided. Thanks a lot. However, I have observed other issues, which I am going to report separately, so you can close this issue.

The context is that Z3 is brittle for this particular example. It is at worst an isolated regression. At best it is a fluctuation that just surfaces arbitrarily depending on random search order.

It sounds too much self-critical :). Z3 is great! Our solution (btw it is intended for test cases generation) uses Z3 for solving tens of thousands of predicates usually significantly more complicated than example above. Everything (without exceptions!) works stable with version 4.8.9 (except for rare sporadic issues with reference counting mentioned above). As for random number choice it is absolutely arbitrary. I specially tried 3 different ones and everything still works. So if what you described was the case, it would be nightmare for us, because it would not be possible to maintain our solution any more. So I hope that our solution will ultimately work at least the same with future versions of Z3! Thanks again!

NikolajBjorner commented 3 years ago

This issue seems closed. Open separate issues for other specifics.

Z3Prover / z3

Inconsistency in SAT results when converting predicate to SMT-LIB and parsing it back using .NET API #5027

Steps to reproduce:

Expected behavior