Z3Prover / z3

The Z3 Theorem Prover
Other
10.29k stars 1.48k forks source link

Soundness bug on NRA formula (master + debug) without options #3484

Closed wintered closed 4 years ago

wintered commented 4 years ago

Consider the below trace which shows a soundness bug on a NRA formula. Z3 (master + debug) return sat, CVC4 gives unsat on bug.smt2. Feeding Z3's model to the formula yields unsat.

No related issues: I have scanned the other open "soundness bug issues". None of those has the following characteristics (1) NRA, (2) master + debug branch (3) no options.

% z3-8717c7d bug.smt2
sat
% z3debug-8d39694 bug.smt2
sat
% cvc4 bug.smt2
unsat
% cat bug.smt2
(declare-fun a () Real)  
(declare-fun b () Real)    
(declare-fun c () Real)  
(declare-fun d () Real)  
(declare-fun v () Real)   
(declare-fun e () Real)   
(declare-fun f () Real)   
(declare-fun w () Real)   
(declare-fun g () Real)   
(declare-fun h () Real)   
(declare-fun i () Real)   
(declare-fun j () Real)   
(declare-fun ep () Real)   
(declare-fun k () Real)    
(declare-fun l () Real)   
(declare-fun m () Real)   
(declare-fun n () Real)    
(declare-fun o () Real)   
(declare-fun p () Real)   
(declare-fun q () Real)   
(declare-fun r () Real)   
(declare-fun s () Real)   
(assert (not (exists ((t Real)) (= (and (and (and (= l 0.0)) 
(< (+ g (/ (/ (- a m) m) (* 2.0 (+ b r)))) h)) 
(<= a (- 6 d (- v ep )))) 
(or (= (= (<= 0.0 t) (and (<= l (* d (- v ep))) (<= 0 ep))) (= j 2.0)))))))   
(assert (not (exists ((u Real)) (=> (= 0.0 (* ( / 91 i ) q  e k)) 
(= (- c w) 2.0) (= o 2.0)))))         
(assert (= c (+ w p ) v (+ ep p)))     
(assert (= f (/ l s)))   
(check-sat)        

OS: Ubuntu 18.04 Commit: 8d39694 (master) + 8d39694 (debug)

NikolajBjorner commented 4 years ago

duplicate of similar nlsat soundness bugs

zhendongsu commented 4 years ago

Nikolaj, which similar open bugs were you referring to? We have checked very carefully before reporting to avoid unnecessary duplicates. Thanks. @NikolajBjorner

NikolajBjorner commented 4 years ago

The original post is #2650. It was somehow closed on Feb 2 by your group, but there is even a unit test that exercises the issues.

zhendongsu commented 4 years ago

Okay, thanks, Nikolaj; we didn't see this as it was closed.

wintered commented 4 years ago

Hi Nikolaj,

Thank you for your comments. We tried to dive deeper into this by inspecting the unit tests for nlsat in src/test/nlsat.cpp. Is the unit test that you mentioned tst11? If I run the nlsat unit tests on the Z3 master (41c68d6) and debug (c5a70ae) branches, all the nlsat unit tests appear to pass (please see the attached log). Furthermore, for the first 5 formulas in #2650, Z3 now returns the correct results. However, it still works incorrectly on the 6th, 7th, and the recently added formulas.

Thank you in advance

debug_c5a70ae.log master_41c68d6.log

NikolajBjorner commented 4 years ago

it is tst11, but it does not check any correctness objectives so it just succeeds. while these tests can be used to isolate root causes it is extra work to add a correctness objective to this code.

zhendongsu commented 4 years ago

Nikolaj, since z3 works correctly on some of these formulas and incorrectly on some of the others, how could one be sure that these are duplicates or not? Thanks for sharing any tips to help us better triage our tests. @NikolajBjorner

NikolajBjorner commented 4 years ago

The common trait is to use nlsat.check_lemmas=true and it ends up failing (a tautology test).

zhendongsu commented 4 years ago

Okay, thanks, Nikolaj; we'll try it. I guess, even so, some of the tests could still be pointing to different root causes, correct?

NikolajBjorner commented 4 years ago

At this point, basically there are two main bugs that involve nlsat. One is related to Sturm sequences and how they are used for comparing polynomials (the bug has some other number). The other is this one.

The other classes of bugs that I have mostly seen are around model construction for underspecified operators (division), and there have been some around parameter fuzzing (simplification in nlsat and reordering and their combinations). For the most part these bugs don't affect standard uses. I have taken them, thanks to your efforts, mostly, but I estimate that the impact of dealing with these bugs is not too great.

zhendongsu commented 4 years ago

Thanks for this additional information, Nikolaj. Things are a lot clearer now as to how you had commented on and triaged these tests. And, thanks again for all your efforts!

zhendongsu commented 4 years ago

On a separate note, Nikolaj, although not having been our focus, would the following interest you at all before we file more?

[550] % cat small.smt2
(simplify (cos (* (/ 1 4) pi)))
[551] % 
[551] % z3san small.smt2
=================================================================
==15537==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffd78f67fd0 at pc 0x564da9d5b9bf bp 0x7ffd78f67f10 sp 0x7ffd78f67f00
READ of size 8 at 0x7ffd78f67fd0 thread T0
    #0 0x564da9d5b9be in algebraic_numbers::anum::is_basic() const ../src/math/polynomial/algebraic_numbers.h:372
    #1 0x564da9d649d7 in algebraic_numbers::manager::imp::save_intervals::restore_if_too_small() (/home/suz/software/z3san/build-03252020030602-a749587/z3+0x2c719d7)
    #2 0x564da9d5ed87 in algebraic_numbers::manager::imp::is_rational(algebraic_numbers::anum&) (/home/suz/software/z3san/build-03252020030602-a749587/z3+0x2c6bd87)
    #3 0x564da9d583de in algebraic_numbers::manager::is_rational(algebraic_numbers::anum const&) ../src/math/polynomial/algebraic_numbers.cpp:2887
    #4 0x564da9acb84d in arith_decl_plugin::mk_numeral(algebraic_numbers::anum const&, bool) ../src/ast/arith_decl_plugin.cpp:79
    #5 0x564da7825bfa in arith_util::mk_numeral(algebraic_numbers::anum const&, bool) ../src/ast/arith_decl_plugin.h:406
    #6 0x564da9373f51 in arith_rewriter::mk_power_core(expr*, expr*, obj_ref<expr, ast_manager>&) ../src/ast/rewriter/arith_rewriter.cpp:1213
    #7 0x564da936241f in arith_rewriter::mk_app_core(func_decl*, unsigned int, expr* const*, obj_ref<expr, ast_manager>&) ../src/ast/rewriter/arith_rewriter.cpp:76
    #8 0x564da9222329 in reduce_app_core ../src/ast/rewriter/th_rewriter.cpp:207
    #9 0x564da922573f in reduce_app ../src/ast/rewriter/th_rewriter.cpp:618
    #10 0x564da9234740 in process_app<false> ../src/ast/rewriter/rewriter_def.h:291
    #11 0x564da922e1e8 in resume_core<false> ../src/ast/rewriter/rewriter_def.h:754
    #12 0x564da922cddd in main_loop<false> ../src/ast/rewriter/rewriter_def.h:713
    #13 0x564da922b34c in operator() ../src/ast/rewriter/rewriter_def.h:785
    #14 0x564da9228660 in th_rewriter::operator()(expr*, obj_ref<expr, ast_manager>&, obj_ref<app, ast_manager>&) ../src/ast/rewriter/th_rewriter.cpp:857
    #15 0x564da9055177 in simplify_cmd::execute(cmd_context&) ../src/cmd_context/simplify_cmd.cpp:89
    #16 0x564da8f97f96 in smt2::parser::parse_ext_cmd(int, int) ../src/parsers/smt2/smt2parser.cpp:2895
    #17 0x564da8f98887 in smt2::parser::parse_cmd() ../src/parsers/smt2/smt2parser.cpp:3001
    #18 0x564da8f99d0d in smt2::parser::operator()() ../src/parsers/smt2/smt2parser.cpp:3130
    #19 0x564da8f73c94 in parse_smt2_commands(cmd_context&, std::istream&, bool, params_ref const&, char const*) ../src/parsers/smt2/smt2parser.cpp:3179
    #20 0x564da77118d9 in read_smtlib2_commands(char const*) ../src/shell/smtlib_frontend.cpp:89
    #21 0x564da7733353 in main ../src/shell/main.cpp:352
    #22 0x7fc74e2d3b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #23 0x564da76f9499 in _start (/home/suz/software/z3san/build-03252020030602-a749587/z3+0x606499)

Address 0x7ffd78f67fd0 is located in stack of thread T0 at offset 32 in frame
    #0 0x564da9d5e147 in algebraic_numbers::manager::imp::is_rational(algebraic_numbers::anum&) (/home/suz/software/z3san/build-03252020030602-a749587/z3+0x2c6b147)

  This frame has 7 object(s):
    [32, 40) '<unknown>' <== Memory access at offset 32 is inside this variable
    [96, 120) 'zcandidate'
    [160, 192) 'a_n_lower'
    [224, 256) 'a_n_upper'
    [288, 328) 'candidate'
    [384, 456) 'saved_a'
    [512, 544) '<unknown>'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope ../src/math/polynomial/algebraic_numbers.h:372 in algebraic_numbers::anum::is_basic() const
Shadow bytes around the buggy address:
  0x10002f1e4fa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002f1e4fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002f1e4fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002f1e4fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002f1e4fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10002f1e4ff0: 00 00 00 00 00 00 f1 f1 f1 f1[f8]f2 f2 f2 f2 f2
  0x10002f1e5000: f2 f2 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 f2 f2
  0x10002f1e5010: f2 f2 00 00 00 00 f2 f2 f2 f2 00 00 00 00 00 f2
  0x10002f1e5020: f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 f2
  0x10002f1e5030: f2 f2 f2 f2 f2 f2 00 00 00 00 f3 f3 f3 f3 00 00
  0x10002f1e5040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==15537==ABORTING
[552] % 
NikolajBjorner commented 4 years ago

As far as I can see it accesses a tagged pointer and the tool is therefore complaining. A review of the particular code, when exercised with the example, did not expose any issue.