Closed jurajsic closed 3 months ago
The experiments are running, but looks like there are no more problems (except TOs and MEMOUTs) in model generation.
Results (with models turned off):
# of formulae: 103405
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ------ ---- --------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 102303 1102 19077.34 0.19 0.02 2.32 62822 39481 530 412 160 0
z3-noodler-7801978-2756940 102424 981 14358.39 0.14 0.02 1.99 62951 39473 530 378 73 0
cvc5-1.1.2 99774 3631 76479.19 0.77 0.00 6.77 60870 38904 2 3620 9 0
z3-4.13.0 97745 5660 128712.47 1.32 0.01 7.95 58927 38818 211 4914 465 70
--------------------------------------------------
Benchmark sygus_qgen
# of formulae: 343
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 343 0 4.57 0.01 0.01 0.01 343 0 0 0 0 0
z3-noodler-7801978-2756940 343 0 4.47 0.01 0.01 0.01 343 0 0 0 0 0
cvc5-1.1.2 343 0 79.25 0.23 0.04 0.94 343 0 0 0 0 0
z3-4.13.0 343 0 10.35 0.03 0.02 0.02 343 0 0 0 0 0
--------------------------------------------------
Benchmark denghang
# of formulae: 999
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 999 0 42.01 0.04 0.02 0.59 186 813 0 0 0 0
z3-noodler-7801978-2756940 999 0 41.79 0.04 0.02 0.58 186 813 0 0 0 0
cvc5-1.1.2 981 18 292.35 0.30 0.00 3.71 169 812 0 18 0 0
z3-4.13.0 883 116 326.28 0.37 0.08 2.10 186 697 0 105 11 0
--------------------------------------------------
Benchmark automatark
# of formulae: 15995
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 15990 5 747.19 0.05 0.01 0.52 10477 5513 0 5 0 0
z3-noodler-7801978-2756940 15990 5 745.94 0.05 0.01 0.53 10477 5513 0 5 0 0
cvc5-1.1.2 15902 93 2216.97 0.14 0.00 2.83 10460 5442 0 85 8 0
z3-4.13.0 15871 124 6107.03 0.38 0.02 3.72 10469 5402 0 123 1 0
--------------------------------------------------
Benchmark stringfuzz
# of formulae: 11618
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 11616 2 321.39 0.03 0.01 0.18 6354 5262 1 1 0 0
z3-noodler-7801978-2756940 11616 2 317.81 0.03 0.01 0.18 6354 5262 1 1 0 0
cvc5-1.1.2 10915 703 30628.85 2.81 0.00 12.55 6158 4757 0 702 1 0
z3-4.13.0 11090 528 44638.05 4.03 0.01 14.89 6004 5086 0 403 125 0
--------------------------------------------------
Benchmark redos
# of formulae: 3287
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 3285 2 2632.00 0.80 0.03 3.79 2445 840 0 2 0 0
z3-noodler-7801978-2756940 3285 2 2629.82 0.80 0.03 3.81 2445 840 0 2 0 0
cvc5-1.1.2 1070 2217 27894.06 26.07 0.03 38.26 574 496 0 2217 0 0
z3-4.13.0 888 2399 297.55 0.34 0.01 2.60 48 840 0 2099 300 0
--------------------------------------------------
Benchmark norn
# of formulae: 1027
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 1027 0 14.53 0.01 0.01 0.01 712 315 0 0 0 0
z3-noodler-7801978-2756940 1027 0 14.28 0.01 0.01 0.01 712 315 0 0 0 0
cvc5-1.1.2 943 84 279.95 0.30 0.01 2.77 705 238 0 84 0 0
z3-4.13.0 904 123 278.96 0.31 0.02 5.21 712 192 0 123 0 0
--------------------------------------------------
Benchmark slog
# of formulae: 1976
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 1976 0 27.88 0.01 0.01 0.02 808 1168 0 0 0 0
z3-noodler-7801978-2756940 1976 0 27.37 0.01 0.01 0.02 808 1168 0 0 0 0
cvc5-1.1.2 1976 0 1.10 0.00 0.00 0.00 808 1168 0 0 0 0
z3-4.13.0 1956 20 539.58 0.28 0.01 3.81 788 1168 0 20 0 0
--------------------------------------------------
Benchmark slent
# of formulae: 1128
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 1124 4 58.87 0.05 0.01 0.68 626 498 0 4 0 0
z3-noodler-7801978-2756940 1124 4 121.83 0.11 0.01 1.48 626 498 0 4 0 0
cvc5-1.1.2 1104 24 1079.50 0.98 0.00 6.51 618 486 0 24 0 0
z3-4.13.0 1052 76 345.30 0.33 0.01 4.17 571 481 0 76 0 0
--------------------------------------------------
Benchmark omark
# of formulae: 17
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 9 8 0.14 0.02 0.01 0.01 2 7 0 6 2 0
z3-noodler-7801978-2756940 9 8 0.15 0.02 0.01 0.02 2 7 0 6 2 0
cvc5-1.1.2 5 12 0.95 0.19 0.01 0.41 2 3 0 12 0 0
z3-4.13.0 7 10 8.06 1.15 0.01 3.01 2 5 0 8 2 0
--------------------------------------------------
Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 584 3 15.75 0.03 0.01 0.07 284 300 0 3 0 0
z3-noodler-7801978-2756940 584 3 15.95 0.03 0.01 0.08 284 300 0 3 0 0
cvc5-1.1.2 347 240 0.54 0.00 0.00 0.00 50 297 0 240 0 0
z3-4.13.0 278 309 25.95 0.09 0.02 0.90 135 143 0 283 26 0
--------------------------------------------------
Benchmark woorpje
# of formulae: 809
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 774 35 205.60 0.27 0.01 2.47 612 162 0 15 20 0
z3-noodler-7801978-2756940 774 35 260.08 0.34 0.01 4.35 612 162 0 15 20 0
cvc5-1.1.2 755 54 45.21 0.06 0.01 0.23 591 164 0 54 0 0
z3-4.13.0 782 27 295.18 0.38 0.02 4.74 618 164 0 27 0 0
--------------------------------------------------
Benchmark webapp
# of formulae: 681
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 361 320 18.58 0.05 0.02 0.24 133 228 315 0 5 0
z3-noodler-7801978-2756940 361 320 17.78 0.05 0.02 0.24 133 228 315 0 5 0
cvc5-1.1.2 588 93 864.69 1.47 0.01 9.96 165 423 0 93 0 0
z3-4.13.0 451 230 317.75 0.70 0.02 6.71 37 414 121 109 0 0
--------------------------------------------------
Benchmark kaluza
# of formulae: 19432
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 19141 291 1098.84 0.06 0.01 1.77 16039 3102 0 185 106 0
z3-noodler-7801978-2756940 19252 180 1626.19 0.08 0.01 2.37 16158 3094 0 169 11 0
cvc5-1.1.2 19431 1 1061.97 0.05 0.00 1.87 16302 3129 0 1 0 0
z3-4.13.0 19139 293 1428.86 0.07 0.01 1.71 16022 3117 0 293 0 0
--------------------------------------------------
Benchmark transducer_plus
# of formulae: 91
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 0 91 0.00 nan nan nan 0 0 91 0 0 0
z3-noodler-7801978-2756940 0 91 0.00 nan nan nan 0 0 91 0 0 0
cvc5-1.1.2 42 49 490.70 11.68 3.24 22.98 41 1 0 49 0 0
z3-4.13.0 1 90 0.01 0.01 0.01 nan 0 1 90 0 0 0
--------------------------------------------------
Benchmark leetcode
# of formulae: 2652
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 2651 1 56.66 0.02 0.01 0.08 867 1784 0 0 1 0
z3-noodler-7801978-2756940 2648 4 48.80 0.02 0.01 0.04 864 1784 0 0 4 0
cvc5-1.1.2 2652 0 32.91 0.01 0.00 0.03 867 1785 0 0 0 0
z3-4.13.0 2652 0 41.89 0.02 0.01 0.11 867 1785 0 0 0 0
--------------------------------------------------
Benchmark str_small_rw
# of formulae: 1880
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 1814 66 408.87 0.23 0.01 3.13 4 1810 6 36 24 0
z3-noodler-7801978-2756940 1814 66 370.39 0.20 0.01 2.74 4 1810 6 29 31 0
cvc5-1.1.2 1861 19 31.09 0.02 0.00 0.56 20 1841 2 17 0 0
z3-4.13.0 1821 59 202.53 0.11 0.01 2.66 25 1796 0 59 0 0
--------------------------------------------------
Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 23740 105 9588.22 0.40 0.10 3.21 20369 3371 0 105 0 0
z3-noodler-7801978-2756940 23751 94 4313.22 0.18 0.10 1.40 20380 3371 0 94 0 0
cvc5-1.1.2 23826 19 6384.00 0.27 0.07 2.48 20406 3420 0 19 0 0
z3-4.13.0 22895 950 55228.54 2.41 0.12 10.38 19624 3271 0 950 0 0
--------------------------------------------------
Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 16799 169 3835.43 0.23 0.01 3.17 2491 14308 117 50 2 0
z3-noodler-7801978-2756940 16801 167 3801.76 0.23 0.01 3.12 2493 14308 117 50 0 0
cvc5-1.1.2 16963 5 5095.10 0.30 0.01 1.42 2521 14442 0 5 0 0
z3-4.13.0 16732 236 18620.60 1.11 0.01 6.38 2476 14256 0 236 0 0
--------------------------------------------------
Benchmark snia
# of formulae: 70
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ---- ---- ------ ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-86a65bd-2756940 70 0 0.81 0.01 0.01 0.01 70 0 0 0 0 0
z3-noodler-7801978-2756940 70 0 0.76 0.01 0.01 0.01 70 0 0 0 0 0
cvc5-1.1.2 70 0 0.00 0.00 0.00 0.00 70 0 0 0 0 0
z3-4.13.0 0 70 0.00 nan nan nan 0 0 0 0 0 70
--------------------------------------------------
The current results are in #164, I accidentally posted them there :D
It is ready to merge from my side.
Further model generation fixes:
model_value_proc
)util
totheory_str_noodler