Closed jurajsic closed 1 month ago
The impact of changes (with models turned off):
# of formulae: 103405
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ------ ---- --------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 102380 1025 7811.82 0.08 0.01 0.69 62888 39492 528 469 28 0
z3-noodler-9dacadd-bdcb62d 102393 1012 11618.30 0.11 0.02 1.53 62911 39482 537 409 66 0
cvc5-1.1.2 99774 3631 76479.19 0.77 0.00 6.77 60870 38904 2 3620 9 0
z3-4.13.0 97694 5711 132044.66 1.35 0.01 8.11 58876 38818 210 4968 463 70
--------------------------------------------------
Benchmark sygus_qgen
# of formulae: 343
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 343 0 2.96 0.01 0.01 0.00 343 0 0 0 0 0
z3-noodler-9dacadd-bdcb62d 343 0 4.57 0.01 0.01 0.01 343 0 0 0 0 0
cvc5-1.1.2 343 0 79.25 0.23 0.04 0.94 343 0 0 0 0 0
z3-4.13.0 343 0 10.38 0.03 0.02 0.02 343 0 0 0 0 0
--------------------------------------------------
Benchmark denghang
# of formulae: 999
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 999 0 35.67 0.04 0.01 0.57 186 813 0 0 0 0
z3-noodler-9dacadd-bdcb62d 999 0 41.76 0.04 0.02 0.59 186 813 0 0 0 0
cvc5-1.1.2 981 18 292.35 0.30 0.00 3.71 169 812 0 18 0 0
z3-4.13.0 883 116 342.20 0.39 0.09 2.17 186 697 0 105 11 0
--------------------------------------------------
Benchmark automatark
# of formulae: 15995
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 15990 5 442.43 0.03 0.01 0.28 10477 5513 0 5 0 0
z3-noodler-9dacadd-bdcb62d 15990 5 537.15 0.03 0.01 0.30 10477 5513 0 5 0 0
cvc5-1.1.2 15902 93 2216.97 0.14 0.00 2.83 10460 5442 0 85 8 0
z3-4.13.0 15870 125 6425.06 0.40 0.02 3.90 10469 5401 0 124 1 0
--------------------------------------------------
Benchmark stringfuzz
# of formulae: 11618
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 11616 2 201.22 0.02 0.01 0.16 6354 5262 1 1 0 0
z3-noodler-9dacadd-bdcb62d 11616 2 265.10 0.02 0.01 0.17 6354 5262 1 1 0 0
cvc5-1.1.2 10915 703 30628.85 2.81 0.00 12.55 6158 4757 0 702 1 0
z3-4.13.0 11081 537 45873.06 4.14 0.01 15.39 5995 5086 0 412 125 0
--------------------------------------------------
Benchmark redos
# of formulae: 3287
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 3281 6 1396.81 0.43 0.02 1.54 2441 840 0 6 0 0
z3-noodler-9dacadd-bdcb62d 3285 2 2633.42 0.80 0.03 3.77 2445 840 0 2 0 0
cvc5-1.1.2 1070 2217 27894.06 26.07 0.03 38.26 574 496 0 2217 0 0
z3-4.13.0 888 2399 326.46 0.37 0.01 2.85 48 840 0 2099 300 0
--------------------------------------------------
Benchmark norn
# of formulae: 1027
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 1027 0 9.41 0.01 0.01 0.00 712 315 0 0 0 0
z3-noodler-9dacadd-bdcb62d 1027 0 14.97 0.01 0.01 0.01 712 315 0 0 0 0
cvc5-1.1.2 943 84 279.95 0.30 0.01 2.77 705 238 0 84 0 0
z3-4.13.0 903 124 174.62 0.19 0.02 3.88 712 191 0 124 0 0
--------------------------------------------------
Benchmark slog
# of formulae: 1976
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 1976 0 18.89 0.01 0.01 0.02 808 1168 0 0 0 0
z3-noodler-9dacadd-bdcb62d 1976 0 28.66 0.01 0.01 0.02 808 1168 0 0 0 0
cvc5-1.1.2 1976 0 1.10 0.00 0.00 0.00 808 1168 0 0 0 0
z3-4.13.0 1945 31 488.71 0.25 0.01 3.22 777 1168 0 31 0 0
--------------------------------------------------
Benchmark slent
# of formulae: 1128
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 1123 5 118.81 0.11 0.01 1.80 625 498 0 5 0 0
z3-noodler-9dacadd-bdcb62d 1124 4 230.90 0.21 0.01 3.09 626 498 0 4 0 0
cvc5-1.1.2 1104 24 1079.50 0.98 0.00 6.51 618 486 0 24 0 0
z3-4.13.0 1055 73 385.56 0.37 0.01 4.01 571 484 0 73 0 0
--------------------------------------------------
Benchmark omark
# of formulae: 17
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 9 8 0.13 0.01 0.01 0.02 2 7 0 7 1 0
z3-noodler-9dacadd-bdcb62d 9 8 0.17 0.02 0.02 0.01 2 7 0 6 2 0
cvc5-1.1.2 5 12 0.95 0.19 0.01 0.41 2 3 0 12 0 0
z3-4.13.0 7 10 7.67 1.10 0.01 2.86 2 5 0 8 2 0
--------------------------------------------------
Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 584 3 12.42 0.02 0.01 0.06 284 300 0 3 0 0
z3-noodler-9dacadd-bdcb62d 584 3 15.64 0.03 0.01 0.07 284 300 0 3 0 0
cvc5-1.1.2 347 240 0.54 0.00 0.00 0.00 50 297 0 240 0 0
z3-4.13.0 278 309 24.64 0.09 0.01 0.87 135 143 0 285 24 0
--------------------------------------------------
Benchmark woorpje
# of formulae: 809
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 773 36 155.90 0.20 0.01 1.93 611 162 0 34 2 0
z3-noodler-9dacadd-bdcb62d 774 35 265.04 0.34 0.01 4.03 612 162 0 18 17 0
cvc5-1.1.2 755 54 45.21 0.06 0.01 0.23 591 164 0 54 0 0
z3-4.13.0 782 27 300.36 0.38 0.02 4.80 618 164 0 27 0 0
--------------------------------------------------
Benchmark webapp
# of formulae: 681
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 363 318 10.57 0.03 0.01 0.05 135 228 315 1 2 0
z3-noodler-9dacadd-bdcb62d 363 318 15.77 0.04 0.02 0.08 135 228 316 0 2 0
cvc5-1.1.2 588 93 864.69 1.47 0.01 9.96 165 423 0 93 0 0
z3-4.13.0 450 231 312.63 0.69 0.02 6.78 36 414 120 111 0 0
--------------------------------------------------
Benchmark kaluza
# of formulae: 19432
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 19211 221 556.74 0.03 0.01 0.58 16112 3099 0 215 6 0
z3-noodler-9dacadd-bdcb62d 19210 222 949.14 0.05 0.01 1.30 16114 3096 0 209 13 0
cvc5-1.1.2 19431 1 1061.97 0.05 0.00 1.87 16302 3129 0 1 0 0
z3-4.13.0 19148 284 2019.12 0.11 0.01 2.02 16032 3116 0 284 0 0
--------------------------------------------------
Benchmark transducer_plus
# of formulae: 91
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 0 91 0.00 nan nan nan 0 0 91 0 0 0
z3-noodler-9dacadd-bdcb62d 0 91 0.00 nan nan nan 0 0 91 0 0 0
cvc5-1.1.2 42 49 490.70 11.68 3.24 22.98 41 1 0 49 0 0
z3-4.13.0 1 90 0.01 0.01 0.01 nan 0 1 90 0 0 0
--------------------------------------------------
Benchmark leetcode
# of formulae: 2652
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 2649 3 37.45 0.01 0.01 0.04 865 1784 0 0 3 0
z3-noodler-9dacadd-bdcb62d 2649 3 55.18 0.02 0.01 0.07 865 1784 0 0 3 0
cvc5-1.1.2 2652 0 32.91 0.01 0.00 0.03 867 1785 0 0 0 0
z3-4.13.0 2652 0 43.44 0.02 0.01 0.12 867 1785 0 0 0 0
--------------------------------------------------
Benchmark str_small_rw
# of formulae: 1880
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 1815 65 175.31 0.10 0.01 1.10 4 1811 4 48 13 0
z3-noodler-9dacadd-bdcb62d 1816 64 354.82 0.20 0.01 2.91 4 1812 4 33 27 0
cvc5-1.1.2 1861 19 31.09 0.02 0.00 0.56 20 1841 2 17 0 0
z3-4.13.0 1821 59 214.88 0.12 0.01 2.86 25 1796 0 59 0 0
--------------------------------------------------
Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 23746 99 3594.16 0.15 0.08 0.80 20375 3371 0 98 1 0
z3-noodler-9dacadd-bdcb62d 23753 92 4553.94 0.19 0.10 1.66 20382 3371 0 90 2 0
cvc5-1.1.2 23826 19 6384.00 0.27 0.07 2.48 20406 3420 0 19 0 0
z3-4.13.0 22858 987 54817.37 2.40 0.12 10.23 19587 3271 0 987 0 0
--------------------------------------------------
Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 16805 163 1042.35 0.06 0.01 0.67 2484 14321 117 46 0 0
z3-noodler-9dacadd-bdcb62d 16805 163 1651.36 0.10 0.01 1.78 2492 14313 125 38 0 0
cvc5-1.1.2 16963 5 5095.10 0.30 0.01 1.42 2521 14442 0 5 0 0
z3-4.13.0 16729 239 20278.49 1.21 0.01 6.90 2473 14256 0 239 0 0
--------------------------------------------------
Benchmark snia
# of formulae: 70
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-41c6e48-7c3f5d0 70 0 0.59 0.01 0.01 0.00 70 0 0 0 0 0
z3-noodler-9dacadd-bdcb62d 70 0 0.71 0.01 0.01 0.00 70 0 0 0 0 0
cvc5-1.1.2 70 0 0.00 0.00 0.00 0.00 70 0 0 0 0 0
z3-4.13.0 0 70 0.00 nan nan nan 0 0 0 0 0 70
--------------------------------------------------
I think that by turning model generation off for LIA, we became faster. I will need to run Z3 without model generation to see how it behaves too.
Note that there are still some incorrect models and some bugs I need to handle, but I want to first merge all these changes.
This PR fixes a lot of stuff for model generation for the main decision, mainly:
to_code
/from_code
also can replace dummy symbol)skip_len_sat
preprocessing rule, the length vars needs to be propagated if we want to compute model (and the equation that the two sides have to have equal length was missing)sat
, we propagate the lengths of all lenght vars from the arith model, so that z3 does not do some stupid stuffWe also use new functions from mata (verifit/mata#420 and verifit/mata#424) in multiple-memberships heuristic to get models from language difference and complement.
I also added new trace output for preprocessing (
str-prep
) which will print the current rule that is gonna be used and after it was used, the current state of preprocessor (if you want to print NFAs in assignment, use alsostr-nfa
). This can be very useful in debugging preprocessing.I also changed the parameter
str.produce_model
to the parametermodel
, which is already in Z3 and I made it by defaultfalse
(it also turns off models for LIA solver). Finally, I madenoodler
the default parameter for string solver selection.