Closed jurajsic closed 1 month ago
Results:
# of formulae: 103405
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ------ ---- --------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 102389 1016 11413.13 0.11 0.02 1.26 62904 39485 548 412 0 56
z3-noodler-ab88926-2756940 102443 962 12031.47 0.12 0.02 1.63 62955 39488 530 364 68 0
z3-noodler-7525ba0-2756940 102442 963 11912.81 0.12 0.02 1.61 62954 39488 530 364 69 0
cvc5-1.1.2 99774 3631 76479.19 0.77 0.00 6.77 60870 38904 2 3620 9 0
z3-4.13.0 97745 5660 128712.47 1.32 0.01 7.95 58927 38818 211 4914 465 70
--------------------------------------------------
Benchmark sygus_qgen
# of formulae: 343
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 343 0 3.95 0.01 0.01 0.00 343 0 0 0 0 0
z3-noodler-ab88926-2756940 343 0 4.41 0.01 0.01 0.01 343 0 0 0 0 0
z3-noodler-7525ba0-2756940 343 0 4.65 0.01 0.01 0.01 343 0 0 0 0 0
cvc5-1.1.2 343 0 79.25 0.23 0.04 0.94 343 0 0 0 0 0
z3-4.13.0 343 0 10.35 0.03 0.02 0.02 343 0 0 0 0 0
--------------------------------------------------
Benchmark denghang
# of formulae: 999
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 999 0 39.99 0.04 0.02 0.57 186 813 0 0 0 0
z3-noodler-ab88926-2756940 999 0 42.44 0.04 0.02 0.60 186 813 0 0 0 0
z3-noodler-7525ba0-2756940 999 0 42.64 0.04 0.02 0.60 186 813 0 0 0 0
cvc5-1.1.2 981 18 292.35 0.30 0.00 3.71 169 812 0 18 0 0
z3-4.13.0 883 116 326.28 0.37 0.08 2.10 186 697 0 105 11 0
--------------------------------------------------
Benchmark automatark
# of formulae: 15995
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 15990 5 732.88 0.05 0.01 0.49 10477 5513 0 5 0 0
z3-noodler-ab88926-2756940 15990 5 744.40 0.05 0.01 0.52 10477 5513 0 5 0 0
z3-noodler-7525ba0-2756940 15990 5 742.44 0.05 0.01 0.51 10477 5513 0 5 0 0
cvc5-1.1.2 15902 93 2216.97 0.14 0.00 2.83 10460 5442 0 85 8 0
z3-4.13.0 15871 124 6107.03 0.38 0.02 3.72 10469 5402 0 123 1 0
--------------------------------------------------
Benchmark stringfuzz
# of formulae: 11618
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 11614 4 306.06 0.03 0.01 0.17 6352 5262 1 2 0 1
z3-noodler-ab88926-2756940 11616 2 322.88 0.03 0.01 0.18 6354 5262 1 1 0 0
z3-noodler-7525ba0-2756940 11616 2 322.54 0.03 0.01 0.18 6354 5262 1 1 0 0
cvc5-1.1.2 10915 703 30628.85 2.81 0.00 12.55 6158 4757 0 702 1 0
z3-4.13.0 11090 528 44638.05 4.03 0.01 14.89 6004 5086 0 403 125 0
--------------------------------------------------
Benchmark redos
# of formulae: 3287
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 3278 9 2189.90 0.67 0.03 1.52 2438 840 0 9 0 0
z3-noodler-ab88926-2756940 3285 2 2626.34 0.80 0.03 3.74 2445 840 0 2 0 0
z3-noodler-7525ba0-2756940 3285 2 2618.90 0.80 0.03 3.75 2445 840 0 2 0 0
cvc5-1.1.2 1070 2217 27894.06 26.07 0.03 38.26 574 496 0 2217 0 0
z3-4.13.0 888 2399 297.55 0.34 0.01 2.60 48 840 0 2099 300 0
--------------------------------------------------
Benchmark norn
# of formulae: 1027
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 1027 0 13.80 0.01 0.01 0.01 712 315 0 0 0 0
z3-noodler-ab88926-2756940 1027 0 14.28 0.01 0.01 0.01 712 315 0 0 0 0
z3-noodler-7525ba0-2756940 1027 0 14.16 0.01 0.01 0.01 712 315 0 0 0 0
cvc5-1.1.2 943 84 279.95 0.30 0.01 2.77 705 238 0 84 0 0
z3-4.13.0 904 123 278.96 0.31 0.02 5.21 712 192 0 123 0 0
--------------------------------------------------
Benchmark slog
# of formulae: 1976
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 1976 0 25.30 0.01 0.01 0.02 808 1168 0 0 0 0
z3-noodler-ab88926-2756940 1976 0 28.06 0.01 0.01 0.03 808 1168 0 0 0 0
z3-noodler-7525ba0-2756940 1976 0 27.98 0.01 0.01 0.02 808 1168 0 0 0 0
cvc5-1.1.2 1976 0 1.10 0.00 0.00 0.00 808 1168 0 0 0 0
z3-4.13.0 1956 20 539.58 0.28 0.01 3.81 788 1168 0 20 0 0
--------------------------------------------------
Benchmark slent
# of formulae: 1128
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 1112 16 111.21 0.10 0.02 1.50 614 498 11 5 0 0
z3-noodler-ab88926-2756940 1125 3 194.57 0.17 0.01 3.01 627 498 0 3 0 0
z3-noodler-7525ba0-2756940 1125 3 193.95 0.17 0.01 3.01 627 498 0 3 0 0
cvc5-1.1.2 1104 24 1079.50 0.98 0.00 6.51 618 486 0 24 0 0
z3-4.13.0 1052 76 345.30 0.33 0.01 4.17 571 481 0 76 0 0
--------------------------------------------------
Benchmark omark
# of formulae: 17
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 8 9 0.26 0.03 0.02 0.05 1 7 1 7 0 1
z3-noodler-ab88926-2756940 9 8 0.13 0.01 0.01 0.01 2 7 0 6 2 0
z3-noodler-7525ba0-2756940 9 8 0.13 0.01 0.01 0.01 2 7 0 6 2 0
cvc5-1.1.2 5 12 0.95 0.19 0.01 0.41 2 3 0 12 0 0
z3-4.13.0 7 10 8.06 1.15 0.01 3.01 2 5 0 8 2 0
--------------------------------------------------
Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 584 3 49.75 0.09 0.01 0.17 284 300 0 3 0 0
z3-noodler-ab88926-2756940 584 3 16.03 0.03 0.01 0.07 284 300 0 3 0 0
z3-noodler-7525ba0-2756940 584 3 15.70 0.03 0.01 0.07 284 300 0 3 0 0
cvc5-1.1.2 347 240 0.54 0.00 0.00 0.00 50 297 0 240 0 0
z3-4.13.0 278 309 25.95 0.09 0.02 0.90 135 143 0 283 26 0
--------------------------------------------------
Benchmark woorpje
# of formulae: 809
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 769 40 103.44 0.13 0.02 1.42 607 162 2 30 0 8
z3-noodler-ab88926-2756940 776 33 411.92 0.53 0.01 6.15 614 162 0 14 19 0
z3-noodler-7525ba0-2756940 775 34 299.84 0.39 0.01 4.71 613 162 0 14 20 0
cvc5-1.1.2 755 54 45.21 0.06 0.01 0.23 591 164 0 54 0 0
z3-4.13.0 782 27 295.18 0.38 0.02 4.74 618 164 0 27 0 0
--------------------------------------------------
Benchmark webapp
# of formulae: 681
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 357 324 13.62 0.04 0.02 0.05 129 228 318 1 0 5
z3-noodler-ab88926-2756940 361 320 17.69 0.05 0.02 0.24 133 228 315 0 5 0
z3-noodler-7525ba0-2756940 361 320 17.41 0.05 0.02 0.23 133 228 315 0 5 0
cvc5-1.1.2 588 93 864.69 1.47 0.01 9.96 165 423 0 93 0 0
z3-4.13.0 451 230 317.75 0.70 0.02 6.71 37 414 121 109 0 0
--------------------------------------------------
Benchmark kaluza
# of formulae: 19432
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 19236 196 505.40 0.03 0.01 0.41 16143 3093 0 192 0 4
z3-noodler-ab88926-2756940 19247 185 1295.24 0.07 0.01 1.98 16151 3096 0 172 13 0
z3-noodler-7525ba0-2756940 19247 185 1307.88 0.07 0.01 2.01 16151 3096 0 172 13 0
cvc5-1.1.2 19431 1 1061.97 0.05 0.00 1.87 16302 3129 0 1 0 0
z3-4.13.0 19139 293 1428.86 0.07 0.01 1.71 16022 3117 0 293 0 0
--------------------------------------------------
Benchmark transducer_plus
# of formulae: 91
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 0 91 0.00 nan nan nan 0 0 91 0 0 0
z3-noodler-ab88926-2756940 0 91 0.00 nan nan nan 0 0 91 0 0 0
z3-noodler-7525ba0-2756940 0 91 0.00 nan nan nan 0 0 91 0 0 0
cvc5-1.1.2 42 49 490.70 11.68 3.24 22.98 41 1 0 49 0 0
z3-4.13.0 1 90 0.01 0.01 0.01 nan 0 1 90 0 0 0
--------------------------------------------------
Benchmark leetcode
# of formulae: 2652
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 2641 11 68.47 0.03 0.02 0.05 857 1784 0 0 0 11
z3-noodler-ab88926-2756940 2648 4 51.95 0.02 0.01 0.05 864 1784 0 0 4 0
z3-noodler-7525ba0-2756940 2648 4 51.82 0.02 0.01 0.05 864 1784 0 0 4 0
cvc5-1.1.2 2652 0 32.91 0.01 0.00 0.03 867 1785 0 0 0 0
z3-4.13.0 2652 0 41.89 0.02 0.01 0.11 867 1785 0 0 0 0
--------------------------------------------------
Benchmark str_small_rw
# of formulae: 1880
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 1813 67 422.36 0.23 0.02 3.22 3 1810 7 35 0 25
z3-noodler-ab88926-2756940 1814 66 400.57 0.22 0.01 3.10 4 1810 6 36 24 0
z3-noodler-7525ba0-2756940 1814 66 395.42 0.22 0.01 3.08 4 1810 6 36 24 0
cvc5-1.1.2 1861 19 31.09 0.02 0.00 0.56 20 1841 2 17 0 0
z3-4.13.0 1821 59 202.53 0.11 0.01 2.66 25 1796 0 59 0 0
--------------------------------------------------
Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 23750 95 5191.12 0.22 0.12 1.73 20379 3371 0 95 0 0
z3-noodler-ab88926-2756940 23751 94 4553.64 0.19 0.10 1.58 20380 3371 0 94 0 0
z3-noodler-7525ba0-2756940 23751 94 4551.27 0.19 0.10 1.60 20380 3371 0 94 0 0
cvc5-1.1.2 23826 19 6384.00 0.27 0.07 2.48 20406 3420 0 19 0 0
z3-4.13.0 22895 950 55228.54 2.41 0.12 10.38 19624 3271 0 950 0 0
--------------------------------------------------
Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 16822 146 1634.84 0.10 0.02 1.73 2501 14321 117 28 0 1
z3-noodler-ab88926-2756940 16822 146 1306.10 0.08 0.01 1.30 2501 14321 117 28 1 0
z3-noodler-7525ba0-2756940 16822 146 1305.33 0.08 0.01 1.30 2501 14321 117 28 1 0
cvc5-1.1.2 16963 5 5095.10 0.30 0.01 1.42 2521 14442 0 5 0 0
z3-4.13.0 16732 236 18620.60 1.11 0.01 6.38 2476 14256 0 236 0 0
--------------------------------------------------
Benchmark snia
# of formulae: 70
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-model-ab88926-2756940 70 0 0.78 0.01 0.01 0.00 70 0 0 0 0 0
z3-noodler-ab88926-2756940 70 0 0.82 0.01 0.01 0.01 70 0 0 0 0 0
z3-noodler-7525ba0-2756940 70 0 0.75 0.01 0.01 0.01 70 0 0 0 0 0
cvc5-1.1.2 70 0 0.00 0.00 0.00 0.00 70 0 0 0 0 0
z3-4.13.0 0 70 0.00 nan nan nan 0 0 0 0 0 70
--------------------------------------------------
There is some bug, but I do not think it is related to model generation. For QF_SLIA/2019-full_str_int/py-conbyte_z3seq/leetcode_int-validWordAbbreviation/132.smt2
, if we enable model generation we give sat
, if we do not enable it, we have TO
, and Z3
and cvc5
both give unsat
. Also for QF_SLIA/2019-full_str_int/py-conbyte_z3seq/leetcode_int-validWordAbbreviation/162.smt2
, we give (correctly) sat
, but the model is invalid. I think these two issues are related, and they are probably also related to #135
But I would first merge this, and then try sorting out the bug.
Nice, ready to merge from my side.
This PR adds model generation from regex directly in membership heuristic (and if not possible, try to create NFA and get model from that) + multiple optimization:
sat
, check if we can limit arith model of lengths by some number (currently 100), so that we do not get some large models for which generating model is hardzstring
concatenations in a loop by a vector from which a resultingzstring
is computed (for large loop bounds,zstring
concatenation is very slow, it constantly creates new vectors)