Closed jurajsic closed 6 months ago
Benchmark sygus_qgen/to120.csv
# of formulae: 343
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 343 0 0.01 0.01 4.78 343 0 0 0 0 0
cvc5-1.1.2 343 0 0.23 0.04 79.25 343 0 0 0 0 0
z3-4.12.6 343 0 0.03 0.02 10.74 343 0 0 0 0 0
z3-4.13.0 343 0 0.03 0.02 10.38 343 0 0 0 0 0
z3-noodler-0751e1e-2cddb2f 343 0 0.02 0.02 5.28 343 0 0 0 0 0
--------------------------------------------------
Benchmark denghang/to120.csv
# of formulae: 999
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 999 0 0.04 0.02 42.76 186 813 0 0 0 0
cvc5-1.1.2 981 18 0.30 0.00 292.35 169 812 0 18 0 0
z3-4.12.6 883 116 0.38 0.09 338.60 186 697 0 105 11 0
z3-4.13.0 883 116 0.39 0.09 342.20 186 697 0 105 11 0
z3-noodler-0751e1e-2cddb2f 999 0 0.04 0.02 44.34 186 813 0 0 0 0
--------------------------------------------------
Benchmark automatark/to120.csv
# of formulae: 15995
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ----- ----- ------- ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 15935 60 0.12 0.01 1986.14 10452 5483 0 60 0 0
cvc5-1.1.2 15902 93 0.14 0.00 2216.97 10460 5442 0 85 8 0
z3-4.12.6 15869 126 0.39 0.02 6250.34 10469 5400 0 125 1 0
z3-4.13.0 15870 125 0.40 0.02 6425.06 10469 5401 0 124 1 0
z3-noodler-0751e1e-2cddb2f 15935 60 0.12 0.02 1975.62 10452 5483 0 60 0 0
--------------------------------------------------
Benchmark stringfuzz/to120.csv
# of formulae: 11618
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ----- ----- -------- ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 11614 4 0.04 0.01 434.97 6353 5261 2 2 0 0
cvc5-1.1.2 10915 703 2.81 0.00 30628.85 6158 4757 0 702 1 0
z3-4.12.6 11078 540 4.09 0.01 45270.19 5992 5086 0 415 125 0
z3-4.13.0 11081 537 4.14 0.01 45873.06 5995 5086 0 412 125 0
z3-noodler-0751e1e-2cddb2f 11616 2 0.03 0.01 350.71 6355 5261 2 0 0 0
--------------------------------------------------
Benchmark norn/to120.csv
# of formulae: 1027
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 1027 0 0.01 0.01 14.30 712 315 0 0 0 0
cvc5-1.1.2 943 84 0.30 0.01 279.95 705 238 0 84 0 0
z3-4.12.6 904 123 0.33 0.02 294.15 712 192 0 123 0 0
z3-4.13.0 903 124 0.19 0.02 174.62 712 191 0 124 0 0
z3-noodler-0751e1e-2cddb2f 1027 0 0.02 0.01 15.69 712 315 0 0 0 0
--------------------------------------------------
Benchmark slog/to120.csv
# of formulae: 1976
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 1976 0 0.01 0.01 27.63 808 1168 0 0 0 0
cvc5-1.1.2 1976 0 0.00 0.00 1.10 808 1168 0 0 0 0
z3-4.12.6 1947 29 0.36 0.01 694.80 779 1168 0 29 0 0
z3-4.13.0 1945 31 0.25 0.01 488.71 777 1168 0 31 0 0
z3-noodler-0751e1e-2cddb2f 1976 0 0.02 0.01 30.00 808 1168 0 0 0 0
--------------------------------------------------
Benchmark slent/to120.csv
# of formulae: 1128
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------- ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 1126 2 0.03 0.01 30.81 629 497 1 1 0 0
cvc5-1.1.2 1104 24 0.98 0.00 1079.50 618 486 0 24 0 0
z3-4.12.6 1055 73 0.36 0.01 382.24 571 484 0 73 0 0
z3-4.13.0 1055 73 0.37 0.01 385.56 571 484 0 73 0 0
z3-noodler-0751e1e-2cddb2f 1127 1 0.05 0.01 54.87 629 498 1 0 0 0
--------------------------------------------------
Benchmark transducer_plus/to120.csv
# of formulae: 91
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 0 91 nan nan 0.00 0 0 91 0 0 0
cvc5-1.1.2 42 49 11.68 3.24 490.70 41 1 0 49 0 0
z3-4.12.6 1 90 0.02 0.02 0.02 0 1 90 0 0 0
z3-4.13.0 1 90 0.01 0.01 0.01 0 1 90 0 0 0
z3-noodler-0751e1e-2cddb2f 0 91 nan nan 0.00 0 0 91 0 0 0
--------------------------------------------------
Benchmark kepler/to120.csv
# of formulae: 587
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 584 3 0.03 0.01 16.25 284 300 0 3 0 0
cvc5-1.1.2 347 240 0.00 0.00 0.54 50 297 0 240 0 0
z3-4.12.6 278 309 0.09 0.02 25.14 135 143 0 285 24 0
z3-4.13.0 278 309 0.09 0.01 24.64 135 143 0 285 24 0
z3-noodler-0751e1e-2cddb2f 584 3 0.03 0.01 17.44 284 300 0 3 0 0
--------------------------------------------------
Benchmark woorpje/to120.csv
# of formulae: 809
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 748 61 0.43 0.02 322.01 587 161 0 20 41 0
cvc5-1.1.2 755 54 0.06 0.01 45.21 591 164 0 54 0 0
z3-4.12.6 782 27 0.38 0.02 296.29 618 164 0 27 0 0
z3-4.13.0 782 27 0.38 0.02 300.36 618 164 0 27 0 0
z3-noodler-0751e1e-2cddb2f 750 59 0.58 0.02 438.50 589 161 0 20 39 0
--------------------------------------------------
Benchmark webapp/to120.csv
# of formulae: 681
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 355 326 0.19 0.02 68.26 130 225 316 6 4 0
cvc5-1.1.2 588 93 1.47 0.01 864.69 165 423 0 93 0 0
z3-4.12.6 449 232 0.46 0.02 207.29 35 414 120 112 0 0
z3-4.13.0 450 231 0.69 0.02 312.63 36 414 120 111 0 0
z3-noodler-0751e1e-2cddb2f 357 324 0.20 0.02 69.90 132 225 316 4 4 0
--------------------------------------------------
Benchmark kaluza/to120.csv
# of formulae: 19432
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ----- ----- ------- ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 19103 329 0.05 0.01 1041.16 15999 3104 0 323 6 0
cvc5-1.1.2 19431 1 0.05 0.00 1061.97 16302 3129 0 1 0 0
z3-4.12.6 19148 284 0.11 0.01 2014.35 16032 3116 0 284 0 0
z3-4.13.0 19148 284 0.11 0.01 2019.12 16032 3116 0 284 0 0
z3-noodler-0751e1e-2cddb2f 19162 270 0.06 0.01 1138.96 16065 3097 0 262 8 0
--------------------------------------------------
Benchmark leetcode/to120.csv
# of formulae: 2652
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 2648 4 0.02 0.01 53.36 864 1784 0 0 4 0
cvc5-1.1.2 2652 0 0.01 0.00 32.91 867 1785 0 0 0 0
z3-4.12.6 2652 0 0.02 0.01 42.90 867 1785 0 0 0 0
z3-4.13.0 2652 0 0.02 0.01 43.44 867 1785 0 0 0 0
z3-noodler-0751e1e-2cddb2f 2648 4 0.02 0.01 57.49 864 1784 0 0 4 0
--------------------------------------------------
Benchmark str_small_rw/to120.csv
# of formulae: 1880
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ----- ----- ------ ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 1746 134 0.17 0.01 291.85 4 1742 102 22 10 0
cvc5-1.1.2 1861 19 0.02 0.00 31.09 20 1841 2 17 0 0
z3-4.12.6 1821 59 0.11 0.01 203.86 25 1796 0 59 0 0
z3-4.13.0 1821 59 0.12 0.01 214.88 25 1796 0 59 0 0
z3-noodler-0751e1e-2cddb2f 1743 137 0.14 0.01 252.41 4 1739 100 23 14 0
--------------------------------------------------
Benchmark pyex/to120.csv
# of formulae: 23845
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ----- ----- -------- ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 23705 140 0.20 0.11 4696.42 20331 3374 0 117 23 0
cvc5-1.1.2 23826 19 0.27 0.07 6384.00 20406 3420 0 19 0 0
z3-4.12.6 22859 986 2.37 0.12 54241.37 19588 3271 0 986 0 0
z3-4.13.0 22858 987 2.40 0.12 54817.37 19587 3271 0 987 0 0
z3-noodler-0751e1e-2cddb2f 23751 94 0.24 0.11 5617.14 20380 3371 0 93 1 0
--------------------------------------------------
Benchmark full_str_int/to120.csv
# of formulae: 16968
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ----- ----- -------- ----- ------- --------- ---- -------- -------
z3-noodler-3524a85-2cddb2f 16704 264 0.16 0.01 2626.29 2438 14266 126 137 1 0
cvc5-1.1.2 16963 5 0.30 0.01 5095.10 2521 14442 0 5 0 0
z3-4.12.6 16730 238 1.20 0.01 19997.34 2474 14256 0 238 0 0
z3-4.13.0 16729 239 1.21 0.01 20278.49 2473 14256 0 239 0 0
z3-noodler-0751e1e-2cddb2f 16704 264 0.19 0.01 3215.44 2436 14268 126 138 0 0
--------------------------------------------------
There is degradation in pyex
and kaluza
. Possible culprit is the new rewriter rules, or just pure randomness.
Can you post the scatter plots as well?
I commented one of the new rewriter rules, it was adding some length constraints which probably made some vars length-aware. Pyex now:
Benchmark pyex/to120.csv
# of formulae: 23845
--------------------------------------------------
tool ✅ ❌ avg med time sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ----- ----- -------- ----- ------- --------- ---- -------- -------
z3-noodler-3233733-2cddb2f 23752 93 0.20 0.10 4805.20 20381 3371 0 91 2 0
cvc5-1.1.2 23826 19 0.27 0.07 6384.00 20406 3420 0 19 0 0
z3-4.12.6 22859 986 2.37 0.12 54241.37 19588 3271 0 986 0 0
z3-4.13.0 22858 987 2.40 0.12 54817.37 19587 3271 0 987 0 0
z3-noodler-0751e1e-2cddb2f 23751 94 0.24 0.11 5617.14 20380 3371 0 93 1 0
--------------------------------------------------
There is one incorrect result in full_str_int
now, on QF_SLIA/2019-full_str_int/py-conbyte_trauc/leetcode_int-numDecodings/130.smt2
we give sat, while cvc5 an Z3 give unsat.
On the left is v1.1.
I updated mata, we do not give incorrect result anymore (instead we give unknown
). It is impossible to debug, so I would merge it and hope that if the bug shows up again, it will be more debuggable.
Updates base Z3 to v4.13.0. I also updated
README.md
andREADME-Z3.md
.