Closed jurajsic closed 1 week ago
Results:
# of formulae: 103405
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ------ ---- --------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 102432 973 12422.91 0.12 0.02 1.74 62943 39489 530 379 64 0
z3-noodler-f44faf4-17ffaf6 102439 966 12031.70 0.12 0.02 1.60 62952 39487 530 371 65 0
z3-noodler-model-1cae537-17ffaf6 102387 1018 8669.92 0.08 0.02 0.65 62906 39481 534 444 0 40
z3-noodler-model-f44faf4-17ffaf6 102372 1033 8512.32 0.08 0.02 0.64 62891 39481 548 445 0 40
cvc5-1.2.0 99841 3564 73861.19 0.74 0.01 6.42 60931 38910 2 3553 9 0
z3-4.13.0 97745 5660 128712.47 1.32 0.01 7.95 58927 38818 211 4914 465 70
--------------------------------------------------
Benchmark sygus_qgen
# of formulae: 343
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 343 0 4.62 0.01 0.01 0.01 343 0 0 0 0 0
z3-noodler-f44faf4-17ffaf6 343 0 4.57 0.01 0.01 0.01 343 0 0 0 0 0
z3-noodler-model-1cae537-17ffaf6 343 0 3.83 0.01 0.01 0.00 343 0 0 0 0 0
z3-noodler-model-f44faf4-17ffaf6 343 0 4.05 0.01 0.01 0.00 343 0 0 0 0 0
cvc5-1.2.0 343 0 80.99 0.24 0.04 0.93 343 0 0 0 0 0
z3-4.13.0 343 0 10.35 0.03 0.02 0.02 343 0 0 0 0 0
--------------------------------------------------
Benchmark denghang
# of formulae: 999
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 999 0 43.22 0.04 0.02 0.64 186 813 0 0 0 0
z3-noodler-f44faf4-17ffaf6 999 0 42.63 0.04 0.02 0.60 186 813 0 0 0 0
z3-noodler-model-1cae537-17ffaf6 999 0 40.03 0.04 0.02 0.57 186 813 0 0 0 0
z3-noodler-model-f44faf4-17ffaf6 999 0 39.68 0.04 0.02 0.55 186 813 0 0 0 0
cvc5-1.2.0 982 17 371.55 0.38 0.00 4.64 170 812 0 17 0 0
z3-4.13.0 883 116 326.28 0.37 0.08 2.10 186 697 0 105 11 0
--------------------------------------------------
Benchmark automatark
# of formulae: 15995
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 15990 5 747.84 0.05 0.01 0.53 10477 5513 0 5 0 0
z3-noodler-f44faf4-17ffaf6 15990 5 741.49 0.05 0.01 0.51 10477 5513 0 5 0 0
z3-noodler-model-1cae537-17ffaf6 15990 5 727.48 0.05 0.01 0.48 10477 5513 0 5 0 0
z3-noodler-model-f44faf4-17ffaf6 15990 5 724.77 0.05 0.01 0.48 10477 5513 0 5 0 0
cvc5-1.2.0 15903 92 2196.84 0.14 0.00 2.77 10460 5443 0 84 8 0
z3-4.13.0 15871 124 6107.03 0.38 0.02 3.72 10469 5402 0 123 1 0
--------------------------------------------------
Benchmark stringfuzz
# of formulae: 11618
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 11616 2 322.55 0.03 0.01 0.20 6354 5262 1 1 0 0
z3-noodler-f44faf4-17ffaf6 11616 2 325.89 0.03 0.01 0.18 6354 5262 1 1 0 0
z3-noodler-model-1cae537-17ffaf6 11615 3 300.65 0.03 0.01 0.17 6353 5262 1 2 0 0
z3-noodler-model-f44faf4-17ffaf6 11616 2 364.37 0.03 0.01 0.58 6354 5262 1 1 0 0
cvc5-1.2.0 10946 672 30049.07 2.75 0.00 12.34 6184 4762 0 671 1 0
z3-4.13.0 11090 528 44638.05 4.03 0.01 14.89 6004 5086 0 403 125 0
--------------------------------------------------
Benchmark redos
# of formulae: 3287
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 3285 2 2633.81 0.80 0.03 3.77 2445 840 0 2 0 0
z3-noodler-f44faf4-17ffaf6 3285 2 2620.30 0.80 0.03 3.77 2445 840 0 2 0 0
z3-noodler-model-1cae537-17ffaf6 3281 6 1462.40 0.45 0.03 1.61 2441 840 0 6 0 0
z3-noodler-model-f44faf4-17ffaf6 3281 6 1448.69 0.44 0.03 1.57 2441 840 0 6 0 0
cvc5-1.2.0 1116 2171 27007.04 24.20 0.14 34.28 620 496 0 2171 0 0
z3-4.13.0 888 2399 297.55 0.34 0.01 2.60 48 840 0 2099 300 0
--------------------------------------------------
Benchmark norn
# of formulae: 1027
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 1027 0 14.38 0.01 0.01 0.01 712 315 0 0 0 0
z3-noodler-f44faf4-17ffaf6 1027 0 14.50 0.01 0.01 0.01 712 315 0 0 0 0
z3-noodler-model-1cae537-17ffaf6 1027 0 13.15 0.01 0.01 0.01 712 315 0 0 0 0
z3-noodler-model-f44faf4-17ffaf6 1027 0 13.21 0.01 0.01 0.01 712 315 0 0 0 0
cvc5-1.2.0 943 84 247.71 0.26 0.01 2.41 705 238 0 84 0 0
z3-4.13.0 904 123 278.96 0.31 0.02 5.21 712 192 0 123 0 0
--------------------------------------------------
Benchmark slog
# of formulae: 1976
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 1976 0 27.56 0.01 0.01 0.02 808 1168 0 0 0 0
z3-noodler-f44faf4-17ffaf6 1976 0 28.37 0.01 0.01 0.02 808 1168 0 0 0 0
z3-noodler-model-1cae537-17ffaf6 1976 0 24.40 0.01 0.01 0.02 808 1168 0 0 0 0
z3-noodler-model-f44faf4-17ffaf6 1976 0 24.88 0.01 0.01 0.02 808 1168 0 0 0 0
cvc5-1.2.0 1976 0 1.85 0.00 0.00 0.00 808 1168 0 0 0 0
z3-4.13.0 1956 20 539.58 0.28 0.01 3.81 788 1168 0 20 0 0
--------------------------------------------------
Benchmark slent
# of formulae: 1128
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 1125 3 244.15 0.22 0.01 2.99 627 498 0 3 0 0
z3-noodler-f44faf4-17ffaf6 1125 3 245.89 0.22 0.01 3.02 627 498 0 3 0 0
z3-noodler-model-1cae537-17ffaf6 1123 5 96.84 0.09 0.01 1.15 625 498 0 5 0 0
z3-noodler-model-f44faf4-17ffaf6 1112 16 96.48 0.09 0.01 1.14 614 498 11 5 0 0
cvc5-1.2.0 1104 24 1002.26 0.91 0.00 6.12 618 486 0 24 0 0
z3-4.13.0 1052 76 345.30 0.33 0.01 4.17 571 481 0 76 0 0
--------------------------------------------------
Benchmark omark
# of formulae: 17
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 9 8 0.13 0.01 0.01 0.02 2 7 0 6 2 0
z3-noodler-f44faf4-17ffaf6 9 8 0.15 0.02 0.01 0.01 2 7 0 6 2 0
z3-noodler-model-1cae537-17ffaf6 8 9 0.11 0.01 0.01 0.01 1 7 1 7 0 1
z3-noodler-model-f44faf4-17ffaf6 8 9 0.15 0.02 0.01 0.01 1 7 1 7 0 1
cvc5-1.2.0 5 12 1.03 0.21 0.00 0.46 2 3 0 12 0 0
z3-4.13.0 7 10 8.06 1.15 0.01 3.01 2 5 0 8 2 0
--------------------------------------------------
Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 584 3 15.91 0.03 0.01 0.07 284 300 0 3 0 0
z3-noodler-f44faf4-17ffaf6 584 3 16.04 0.03 0.01 0.07 284 300 0 3 0 0
z3-noodler-model-1cae537-17ffaf6 584 3 49.08 0.08 0.01 0.17 284 300 0 3 0 0
z3-noodler-model-f44faf4-17ffaf6 584 3 49.63 0.08 0.02 0.17 284 300 0 3 0 0
cvc5-1.2.0 347 240 0.86 0.00 0.00 0.00 50 297 0 240 0 0
z3-4.13.0 278 309 25.95 0.09 0.02 0.90 135 143 0 283 26 0
--------------------------------------------------
Benchmark woorpje
# of formulae: 809
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 777 32 411.26 0.53 0.01 6.15 615 162 0 13 19 0
z3-noodler-f44faf4-17ffaf6 776 33 302.00 0.39 0.01 4.74 614 162 0 14 19 0
z3-noodler-model-1cae537-17ffaf6 770 39 127.66 0.17 0.02 1.59 608 162 2 28 0 9
z3-noodler-model-f44faf4-17ffaf6 770 39 126.60 0.16 0.02 1.58 608 162 2 29 0 8
cvc5-1.2.0 755 54 47.07 0.06 0.01 0.25 591 164 0 54 0 0
z3-4.13.0 782 27 295.18 0.38 0.02 4.74 618 164 0 27 0 0
--------------------------------------------------
Benchmark webapp
# of formulae: 681
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 365 316 221.00 0.61 0.02 5.33 137 228 315 0 1 0
z3-noodler-f44faf4-17ffaf6 361 320 17.68 0.05 0.02 0.24 133 228 315 0 5 0
z3-noodler-model-1cae537-17ffaf6 362 319 141.27 0.39 0.02 3.38 135 227 316 2 0 1
z3-noodler-model-f44faf4-17ffaf6 357 324 35.04 0.10 0.02 1.16 130 227 318 1 0 5
cvc5-1.2.0 588 93 767.89 1.31 0.02 8.73 165 423 0 93 0 0
z3-4.13.0 451 230 317.75 0.70 0.02 6.71 37 414 121 109 0 0
--------------------------------------------------
Benchmark kaluza
# of formulae: 19432
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- ------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 19233 199 1590.31 0.08 0.01 2.39 16136 3097 0 185 14 0
z3-noodler-f44faf4-17ffaf6 19245 187 1324.71 0.07 0.01 1.95 16151 3094 0 173 14 0
z3-noodler-model-1cae537-17ffaf6 19224 208 531.98 0.03 0.01 0.52 16131 3093 0 204 0 4
z3-noodler-model-f44faf4-17ffaf6 19230 202 379.59 0.02 0.01 0.29 16138 3092 0 199 0 3
cvc5-1.2.0 19426 6 459.94 0.02 0.00 0.42 16297 3129 0 6 0 0
z3-4.13.0 19139 293 1428.86 0.07 0.01 1.71 16022 3117 0 293 0 0
--------------------------------------------------
Benchmark transducer_plus
# of formulae: 91
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 0 91 0.00 nan nan nan 0 0 91 0 0 0
z3-noodler-f44faf4-17ffaf6 0 91 0.00 nan nan nan 0 0 91 0 0 0
z3-noodler-model-1cae537-17ffaf6 0 91 0.00 nan nan nan 0 0 91 0 0 0
z3-noodler-model-f44faf4-17ffaf6 0 91 0.00 nan nan nan 0 0 91 0 0 0
cvc5-1.2.0 42 49 482.14 11.48 3.11 22.06 41 1 0 49 0 0
z3-4.13.0 1 90 0.01 0.01 0.01 nan 0 1 90 0 0 0
--------------------------------------------------
Benchmark leetcode
# of formulae: 2652
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 2649 3 52.31 0.02 0.01 0.06 865 1784 0 0 3 0
z3-noodler-f44faf4-17ffaf6 2648 4 52.81 0.02 0.01 0.05 864 1784 0 0 4 0
z3-noodler-model-1cae537-17ffaf6 2642 10 48.03 0.02 0.01 0.02 858 1784 0 0 0 10
z3-noodler-model-f44faf4-17ffaf6 2641 11 47.91 0.02 0.01 0.02 857 1784 0 0 0 11
cvc5-1.2.0 2652 0 36.00 0.01 0.00 0.03 867 1785 0 0 0 0
z3-4.13.0 2652 0 41.89 0.02 0.01 0.11 867 1785 0 0 0 0
--------------------------------------------------
Benchmark str_small_rw
# of formulae: 1880
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 1814 66 400.52 0.22 0.01 3.15 4 1810 6 35 25 0
z3-noodler-f44faf4-17ffaf6 1815 65 443.10 0.24 0.01 3.27 4 1811 6 38 21 0
z3-noodler-model-1cae537-17ffaf6 1811 69 161.17 0.09 0.01 0.92 4 1807 6 48 0 15
z3-noodler-model-f44faf4-17ffaf6 1811 69 191.59 0.11 0.02 1.13 3 1808 7 50 0 12
cvc5-1.2.0 1860 20 34.43 0.02 0.00 0.62 19 1841 2 18 0 0
z3-4.13.0 1821 59 202.53 0.11 0.01 2.66 25 1796 0 59 0 0
--------------------------------------------------
Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 23754 91 4317.89 0.18 0.10 1.26 20383 3371 0 91 0 0
z3-noodler-f44faf4-17ffaf6 23752 93 4278.47 0.18 0.10 1.14 20381 3371 0 93 0 0
z3-noodler-model-1cae537-17ffaf6 23750 95 4019.24 0.17 0.10 0.71 20379 3371 0 95 0 0
z3-noodler-model-f44faf4-17ffaf6 23744 101 3997.50 0.17 0.10 0.72 20373 3371 0 101 0 0
cvc5-1.2.0 23820 25 6002.12 0.25 0.07 2.51 20400 3420 0 25 0 0
z3-4.13.0 22895 950 55228.54 2.41 0.12 10.38 19624 3271 0 950 0 0
--------------------------------------------------
Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ----- ---- -------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 16816 152 1374.70 0.08 0.01 1.56 2495 14321 117 35 0 0
z3-noodler-f44faf4-17ffaf6 16818 150 1572.33 0.09 0.01 1.86 2497 14321 117 33 0 0
z3-noodler-model-1cae537-17ffaf6 16812 156 921.87 0.05 0.01 0.46 2491 14321 117 39 0 0
z3-noodler-model-f44faf4-17ffaf6 16813 155 967.34 0.06 0.01 0.54 2492 14321 117 38 0 0
cvc5-1.2.0 16963 5 5072.37 0.30 0.01 1.39 2521 14442 0 5 0 0
z3-4.13.0 16732 236 18620.60 1.11 0.01 6.38 2476 14256 0 236 0 0
--------------------------------------------------
Benchmark snia
# of formulae: 70
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ------ ------ ------ ----- ------- --------- ---- -------- -------
z3-noodler-1cae537-17ffaf6 70 0 0.75 0.01 0.01 0.00 70 0 0 0 0 0
z3-noodler-f44faf4-17ffaf6 70 0 0.77 0.01 0.01 0.00 70 0 0 0 0 0
z3-noodler-model-1cae537-17ffaf6 70 0 0.73 0.01 0.01 0.00 70 0 0 0 0 0
z3-noodler-model-f44faf4-17ffaf6 70 0 0.84 0.01 0.01 0.00 70 0 0 0 0 0
cvc5-1.2.0 70 0 0.03 0.00 0.00 0.00 70 0 0 0 0 0
z3-4.13.0 0 70 0.00 nan nan nan 0 0 0 0 0 70
--------------------------------------------------
Note the reduction of unknowns between z3-noodler-model-f44faf4-17ffaf6
and z3-noodler-model-1cae537-17ffaf6
, those were the formulae where we thought that inclusion graph contains cycle but it was just caused by preprocessing. There are only 3-4 formulae where we fail with model generation because of cycle now (and they are real non-chain-free formulae):
QF_S/20240318-omark/noodles-unsat-10.smt2
QF_SLIA/20230329-woorpje-lu/track05/05_track_169.smt2
QF_SLIA/20230329-woorpje-lu/track05/05_track_25.smt2
QF_SLIA/20230403-webapp/lan-rep-all/lan_replace_all100.smt2
The last one might fail because of replace_all
operator and not because it is not chain-free.
The scatter plot: The old version is down.
Good catch with the if-else
, there was a degradation, newest results:
# of formulae: 103405
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ------ ---- --------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-b7e4646-17ffaf6 102442 963 12295.38 0.12 0.02 1.70 62953 39489 530 372 61 0
z3-noodler-1cae537-17ffaf6 102432 973 12422.91 0.12 0.02 1.74 62943 39489 530 379 64 0
z3-noodler-f44faf4-17ffaf6 102439 966 12031.70 0.12 0.02 1.60 62952 39487 530 371 65 0
cvc5-1.2.0 99841 3564 73861.19 0.74 0.01 6.42 60931 38910 2 3553 9 0
z3-4.13.0 97745 5660 128712.47 1.32 0.01 7.95 58927 38818 211 4914 465 70
--------------------------------------------------
Comparison with if-else
version:
Comparison with previous version:
Nice, ready to merge for me.
The same equation but with different sides could be removed by
remove_regular
(for equation with one var on both sides) or byskip_len_sat
preprocessing rules, which would cause a cycle in the inclusion graph for model generation.