VeriFIT / z3-noodler

The Z3-Noodler String Solver
Other
9 stars 5 forks source link

Further model generation fixes #160

Closed jurajsic closed 3 months ago

jurajsic commented 3 months ago

Further model generation fixes:

jurajsic commented 3 months ago

The experiments are running, but looks like there are no more problems (except TOs and MEMOUTs) in model generation.

jurajsic commented 3 months ago

Results (with models turned off):

# of formulae: 103405
--------------------------------------------------
tool                            ✅    ❌       time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ------  ----  ---------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  102303  1102   19077.34   0.19   0.02   2.32  62822    39481        530   412       160        0
z3-noodler-7801978-2756940  102424   981   14358.39   0.14   0.02   1.99  62951    39473        530   378        73        0
cvc5-1.1.2                   99774  3631   76479.19   0.77   0.00   6.77  60870    38904          2  3620         9        0
z3-4.13.0                    97745  5660  128712.47   1.32   0.01   7.95  58927    38818        211  4914       465       70
--------------------------------------------------

Benchmark sygus_qgen
# of formulae: 343
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940   343     0    4.57   0.01   0.01   0.01    343        0          0     0         0        0
z3-noodler-7801978-2756940   343     0    4.47   0.01   0.01   0.01    343        0          0     0         0        0
cvc5-1.1.2                   343     0   79.25   0.23   0.04   0.94    343        0          0     0         0        0
z3-4.13.0                    343     0   10.35   0.03   0.02   0.02    343        0          0     0         0        0
--------------------------------------------------

Benchmark denghang
# of formulae: 999
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940   999     0   42.01   0.04   0.02   0.59    186      813          0     0         0        0
z3-noodler-7801978-2756940   999     0   41.79   0.04   0.02   0.58    186      813          0     0         0        0
cvc5-1.1.2                   981    18  292.35   0.30   0.00   3.71    169      812          0    18         0        0
z3-4.13.0                    883   116  326.28   0.37   0.08   2.10    186      697          0   105        11        0
--------------------------------------------------

Benchmark automatark
# of formulae: 15995
--------------------------------------------------
tool                           ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  15990     5   747.19   0.05   0.01   0.52  10477     5513          0     5         0        0
z3-noodler-7801978-2756940  15990     5   745.94   0.05   0.01   0.53  10477     5513          0     5         0        0
cvc5-1.1.2                  15902    93  2216.97   0.14   0.00   2.83  10460     5442          0    85         8        0
z3-4.13.0                   15871   124  6107.03   0.38   0.02   3.72  10469     5402          0   123         1        0
--------------------------------------------------

Benchmark stringfuzz
# of formulae: 11618
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  11616     2    321.39   0.03   0.01   0.18   6354     5262          1     1         0        0
z3-noodler-7801978-2756940  11616     2    317.81   0.03   0.01   0.18   6354     5262          1     1         0        0
cvc5-1.1.2                  10915   703  30628.85   2.81   0.00  12.55   6158     4757          0   702         1        0
z3-4.13.0                   11090   528  44638.05   4.03   0.01  14.89   6004     5086          0   403       125        0
--------------------------------------------------

Benchmark redos
# of formulae: 3287
--------------------------------------------------
tool                          ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  3285     2   2632.00   0.80   0.03   3.79   2445      840          0     2         0        0
z3-noodler-7801978-2756940  3285     2   2629.82   0.80   0.03   3.81   2445      840          0     2         0        0
cvc5-1.1.2                  1070  2217  27894.06  26.07   0.03  38.26    574      496          0  2217         0        0
z3-4.13.0                    888  2399    297.55   0.34   0.01   2.60     48      840          0  2099       300        0
--------------------------------------------------

Benchmark norn
# of formulae: 1027
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  1027     0   14.53   0.01   0.01   0.01    712      315          0     0         0        0
z3-noodler-7801978-2756940  1027     0   14.28   0.01   0.01   0.01    712      315          0     0         0        0
cvc5-1.1.2                   943    84  279.95   0.30   0.01   2.77    705      238          0    84         0        0
z3-4.13.0                    904   123  278.96   0.31   0.02   5.21    712      192          0   123         0        0
--------------------------------------------------

Benchmark slog
# of formulae: 1976
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  1976     0   27.88   0.01   0.01   0.02    808     1168          0     0         0        0
z3-noodler-7801978-2756940  1976     0   27.37   0.01   0.01   0.02    808     1168          0     0         0        0
cvc5-1.1.2                  1976     0    1.10   0.00   0.00   0.00    808     1168          0     0         0        0
z3-4.13.0                   1956    20  539.58   0.28   0.01   3.81    788     1168          0    20         0        0
--------------------------------------------------

Benchmark slent
# of formulae: 1128
--------------------------------------------------
tool                          ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  1124     4    58.87   0.05   0.01   0.68    626      498          0     4         0        0
z3-noodler-7801978-2756940  1124     4   121.83   0.11   0.01   1.48    626      498          0     4         0        0
cvc5-1.1.2                  1104    24  1079.50   0.98   0.00   6.51    618      486          0    24         0        0
z3-4.13.0                   1052    76   345.30   0.33   0.01   4.17    571      481          0    76         0        0
--------------------------------------------------

Benchmark omark
# of formulae: 17
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940     9     8    0.14   0.02   0.01   0.01      2        7          0     6         2        0
z3-noodler-7801978-2756940     9     8    0.15   0.02   0.01   0.02      2        7          0     6         2        0
cvc5-1.1.2                     5    12    0.95   0.19   0.01   0.41      2        3          0    12         0        0
z3-4.13.0                      7    10    8.06   1.15   0.01   3.01      2        5          0     8         2        0
--------------------------------------------------

Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940   584     3   15.75   0.03   0.01   0.07    284      300          0     3         0        0
z3-noodler-7801978-2756940   584     3   15.95   0.03   0.01   0.08    284      300          0     3         0        0
cvc5-1.1.2                   347   240    0.54   0.00   0.00   0.00     50      297          0   240         0        0
z3-4.13.0                    278   309   25.95   0.09   0.02   0.90    135      143          0   283        26        0
--------------------------------------------------

Benchmark woorpje
# of formulae: 809
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940   774    35  205.60   0.27   0.01   2.47    612      162          0    15        20        0
z3-noodler-7801978-2756940   774    35  260.08   0.34   0.01   4.35    612      162          0    15        20        0
cvc5-1.1.2                   755    54   45.21   0.06   0.01   0.23    591      164          0    54         0        0
z3-4.13.0                    782    27  295.18   0.38   0.02   4.74    618      164          0    27         0        0
--------------------------------------------------

Benchmark webapp
# of formulae: 681
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940   361   320   18.58   0.05   0.02   0.24    133      228        315     0         5        0
z3-noodler-7801978-2756940   361   320   17.78   0.05   0.02   0.24    133      228        315     0         5        0
cvc5-1.1.2                   588    93  864.69   1.47   0.01   9.96    165      423          0    93         0        0
z3-4.13.0                    451   230  317.75   0.70   0.02   6.71     37      414        121   109         0        0
--------------------------------------------------

Benchmark kaluza
# of formulae: 19432
--------------------------------------------------
tool                           ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  19141   291  1098.84   0.06   0.01   1.77  16039     3102          0   185       106        0
z3-noodler-7801978-2756940  19252   180  1626.19   0.08   0.01   2.37  16158     3094          0   169        11        0
cvc5-1.1.2                  19431     1  1061.97   0.05   0.00   1.87  16302     3129          0     1         0        0
z3-4.13.0                   19139   293  1428.86   0.07   0.01   1.71  16022     3117          0   293         0        0
--------------------------------------------------

Benchmark transducer_plus
# of formulae: 91
--------------------------------------------------
tool                          ✅    ❌    time     avg     med     std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940     0    91    0.00  nan     nan     nan         0        0         91     0         0        0
z3-noodler-7801978-2756940     0    91    0.00  nan     nan     nan         0        0         91     0         0        0
cvc5-1.1.2                    42    49  490.70   11.68    3.24   22.98     41        1          0    49         0        0
z3-4.13.0                      1    90    0.01    0.01    0.01  nan         0        1         90     0         0        0
--------------------------------------------------

Benchmark leetcode
# of formulae: 2652
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  2651     1   56.66   0.02   0.01   0.08    867     1784          0     0         1        0
z3-noodler-7801978-2756940  2648     4   48.80   0.02   0.01   0.04    864     1784          0     0         4        0
cvc5-1.1.2                  2652     0   32.91   0.01   0.00   0.03    867     1785          0     0         0        0
z3-4.13.0                   2652     0   41.89   0.02   0.01   0.11    867     1785          0     0         0        0
--------------------------------------------------

Benchmark str_small_rw
# of formulae: 1880
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  1814    66  408.87   0.23   0.01   3.13      4     1810          6    36        24        0
z3-noodler-7801978-2756940  1814    66  370.39   0.20   0.01   2.74      4     1810          6    29        31        0
cvc5-1.1.2                  1861    19   31.09   0.02   0.00   0.56     20     1841          2    17         0        0
z3-4.13.0                   1821    59  202.53   0.11   0.01   2.66     25     1796          0    59         0        0
--------------------------------------------------

Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  23740   105   9588.22   0.40   0.10   3.21  20369     3371          0   105         0        0
z3-noodler-7801978-2756940  23751    94   4313.22   0.18   0.10   1.40  20380     3371          0    94         0        0
cvc5-1.1.2                  23826    19   6384.00   0.27   0.07   2.48  20406     3420          0    19         0        0
z3-4.13.0                   22895   950  55228.54   2.41   0.12  10.38  19624     3271          0   950         0        0
--------------------------------------------------

Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940  16799   169   3835.43   0.23   0.01   3.17   2491    14308        117    50         2        0
z3-noodler-7801978-2756940  16801   167   3801.76   0.23   0.01   3.12   2493    14308        117    50         0        0
cvc5-1.1.2                  16963     5   5095.10   0.30   0.01   1.42   2521    14442          0     5         0        0
z3-4.13.0                   16732   236  18620.60   1.11   0.01   6.38   2476    14256          0   236         0        0
--------------------------------------------------

Benchmark snia
# of formulae: 70
--------------------------------------------------
tool                          ✅    ❌    time     avg     med     std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-86a65bd-2756940    70     0    0.81    0.01    0.01    0.01     70        0          0     0         0        0
z3-noodler-7801978-2756940    70     0    0.76    0.01    0.01    0.01     70        0          0     0         0        0
cvc5-1.1.2                    70     0    0.00    0.00    0.00    0.00     70        0          0     0         0        0
z3-4.13.0                      0    70    0.00  nan     nan     nan         0        0          0     0         0       70
--------------------------------------------------

image

jurajsic commented 3 months ago

The current results are in #164, I accidentally posted them there :D

vhavlena commented 3 months ago

It is ready to merge from my side.