VeriFIT / z3-noodler

The Z3-Noodler String Solver
Other
6 stars 5 forks source link

Model generation fixing #156

Closed jurajsic closed 1 month ago

jurajsic commented 2 months ago

This PR fixes a lot of stuff for model generation for the main decision, mainly:

We also use new functions from mata (verifit/mata#420 and verifit/mata#424) in multiple-memberships heuristic to get models from language difference and complement.

I also added new trace output for preprocessing (str-prep) which will print the current rule that is gonna be used and after it was used, the current state of preprocessor (if you want to print NFAs in assignment, use also str-nfa). This can be very useful in debugging preprocessing.

I also changed the parameter str.produce_model to the parameter model, which is already in Z3 and I made it by default false (it also turns off models for LIA solver). Finally, I made noodler the default parameter for string solver selection.

jurajsic commented 2 months ago

The impact of changes (with models turned off):

# of formulae: 103405
--------------------------------------------------
tool                            ✅    ❌       time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ------  ----  ---------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  102380  1025    7811.82   0.08   0.01   0.69  62888    39492        528   469        28        0
z3-noodler-9dacadd-bdcb62d  102393  1012   11618.30   0.11   0.02   1.53  62911    39482        537   409        66        0
cvc5-1.1.2                   99774  3631   76479.19   0.77   0.00   6.77  60870    38904          2  3620         9        0
z3-4.13.0                    97694  5711  132044.66   1.35   0.01   8.11  58876    38818        210  4968       463       70
--------------------------------------------------

Benchmark sygus_qgen
# of formulae: 343
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0   343     0    2.96   0.01   0.01   0.00    343        0          0     0         0        0
z3-noodler-9dacadd-bdcb62d   343     0    4.57   0.01   0.01   0.01    343        0          0     0         0        0
cvc5-1.1.2                   343     0   79.25   0.23   0.04   0.94    343        0          0     0         0        0
z3-4.13.0                    343     0   10.38   0.03   0.02   0.02    343        0          0     0         0        0
--------------------------------------------------

Benchmark denghang
# of formulae: 999
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0   999     0   35.67   0.04   0.01   0.57    186      813          0     0         0        0
z3-noodler-9dacadd-bdcb62d   999     0   41.76   0.04   0.02   0.59    186      813          0     0         0        0
cvc5-1.1.2                   981    18  292.35   0.30   0.00   3.71    169      812          0    18         0        0
z3-4.13.0                    883   116  342.20   0.39   0.09   2.17    186      697          0   105        11        0
--------------------------------------------------

Benchmark automatark
# of formulae: 15995
--------------------------------------------------
tool                           ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  15990     5   442.43   0.03   0.01   0.28  10477     5513          0     5         0        0
z3-noodler-9dacadd-bdcb62d  15990     5   537.15   0.03   0.01   0.30  10477     5513          0     5         0        0
cvc5-1.1.2                  15902    93  2216.97   0.14   0.00   2.83  10460     5442          0    85         8        0
z3-4.13.0                   15870   125  6425.06   0.40   0.02   3.90  10469     5401          0   124         1        0
--------------------------------------------------

Benchmark stringfuzz
# of formulae: 11618
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  11616     2    201.22   0.02   0.01   0.16   6354     5262          1     1         0        0
z3-noodler-9dacadd-bdcb62d  11616     2    265.10   0.02   0.01   0.17   6354     5262          1     1         0        0
cvc5-1.1.2                  10915   703  30628.85   2.81   0.00  12.55   6158     4757          0   702         1        0
z3-4.13.0                   11081   537  45873.06   4.14   0.01  15.39   5995     5086          0   412       125        0
--------------------------------------------------

Benchmark redos
# of formulae: 3287
--------------------------------------------------
tool                          ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  3281     6   1396.81   0.43   0.02   1.54   2441      840          0     6         0        0
z3-noodler-9dacadd-bdcb62d  3285     2   2633.42   0.80   0.03   3.77   2445      840          0     2         0        0
cvc5-1.1.2                  1070  2217  27894.06  26.07   0.03  38.26    574      496          0  2217         0        0
z3-4.13.0                    888  2399    326.46   0.37   0.01   2.85     48      840          0  2099       300        0
--------------------------------------------------

Benchmark norn
# of formulae: 1027
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  1027     0    9.41   0.01   0.01   0.00    712      315          0     0         0        0
z3-noodler-9dacadd-bdcb62d  1027     0   14.97   0.01   0.01   0.01    712      315          0     0         0        0
cvc5-1.1.2                   943    84  279.95   0.30   0.01   2.77    705      238          0    84         0        0
z3-4.13.0                    903   124  174.62   0.19   0.02   3.88    712      191          0   124         0        0
--------------------------------------------------

Benchmark slog
# of formulae: 1976
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  1976     0   18.89   0.01   0.01   0.02    808     1168          0     0         0        0
z3-noodler-9dacadd-bdcb62d  1976     0   28.66   0.01   0.01   0.02    808     1168          0     0         0        0
cvc5-1.1.2                  1976     0    1.10   0.00   0.00   0.00    808     1168          0     0         0        0
z3-4.13.0                   1945    31  488.71   0.25   0.01   3.22    777     1168          0    31         0        0
--------------------------------------------------

Benchmark slent
# of formulae: 1128
--------------------------------------------------
tool                          ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  1123     5   118.81   0.11   0.01   1.80    625      498          0     5         0        0
z3-noodler-9dacadd-bdcb62d  1124     4   230.90   0.21   0.01   3.09    626      498          0     4         0        0
cvc5-1.1.2                  1104    24  1079.50   0.98   0.00   6.51    618      486          0    24         0        0
z3-4.13.0                   1055    73   385.56   0.37   0.01   4.01    571      484          0    73         0        0
--------------------------------------------------

Benchmark omark
# of formulae: 17
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0     9     8    0.13   0.01   0.01   0.02      2        7          0     7         1        0
z3-noodler-9dacadd-bdcb62d     9     8    0.17   0.02   0.02   0.01      2        7          0     6         2        0
cvc5-1.1.2                     5    12    0.95   0.19   0.01   0.41      2        3          0    12         0        0
z3-4.13.0                      7    10    7.67   1.10   0.01   2.86      2        5          0     8         2        0
--------------------------------------------------

Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0   584     3   12.42   0.02   0.01   0.06    284      300          0     3         0        0
z3-noodler-9dacadd-bdcb62d   584     3   15.64   0.03   0.01   0.07    284      300          0     3         0        0
cvc5-1.1.2                   347   240    0.54   0.00   0.00   0.00     50      297          0   240         0        0
z3-4.13.0                    278   309   24.64   0.09   0.01   0.87    135      143          0   285        24        0
--------------------------------------------------

Benchmark woorpje
# of formulae: 809
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0   773    36  155.90   0.20   0.01   1.93    611      162          0    34         2        0
z3-noodler-9dacadd-bdcb62d   774    35  265.04   0.34   0.01   4.03    612      162          0    18        17        0
cvc5-1.1.2                   755    54   45.21   0.06   0.01   0.23    591      164          0    54         0        0
z3-4.13.0                    782    27  300.36   0.38   0.02   4.80    618      164          0    27         0        0
--------------------------------------------------

Benchmark webapp
# of formulae: 681
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0   363   318   10.57   0.03   0.01   0.05    135      228        315     1         2        0
z3-noodler-9dacadd-bdcb62d   363   318   15.77   0.04   0.02   0.08    135      228        316     0         2        0
cvc5-1.1.2                   588    93  864.69   1.47   0.01   9.96    165      423          0    93         0        0
z3-4.13.0                    450   231  312.63   0.69   0.02   6.78     36      414        120   111         0        0
--------------------------------------------------

Benchmark kaluza
# of formulae: 19432
--------------------------------------------------
tool                           ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  19211   221   556.74   0.03   0.01   0.58  16112     3099          0   215         6        0
z3-noodler-9dacadd-bdcb62d  19210   222   949.14   0.05   0.01   1.30  16114     3096          0   209        13        0
cvc5-1.1.2                  19431     1  1061.97   0.05   0.00   1.87  16302     3129          0     1         0        0
z3-4.13.0                   19148   284  2019.12   0.11   0.01   2.02  16032     3116          0   284         0        0
--------------------------------------------------

Benchmark transducer_plus
# of formulae: 91
--------------------------------------------------
tool                          ✅    ❌    time     avg     med     std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0     0    91    0.00  nan     nan     nan         0        0         91     0         0        0
z3-noodler-9dacadd-bdcb62d     0    91    0.00  nan     nan     nan         0        0         91     0         0        0
cvc5-1.1.2                    42    49  490.70   11.68    3.24   22.98     41        1          0    49         0        0
z3-4.13.0                      1    90    0.01    0.01    0.01  nan         0        1         90     0         0        0
--------------------------------------------------

Benchmark leetcode
# of formulae: 2652
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  2649     3   37.45   0.01   0.01   0.04    865     1784          0     0         3        0
z3-noodler-9dacadd-bdcb62d  2649     3   55.18   0.02   0.01   0.07    865     1784          0     0         3        0
cvc5-1.1.2                  2652     0   32.91   0.01   0.00   0.03    867     1785          0     0         0        0
z3-4.13.0                   2652     0   43.44   0.02   0.01   0.12    867     1785          0     0         0        0
--------------------------------------------------

Benchmark str_small_rw
# of formulae: 1880
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  1815    65  175.31   0.10   0.01   1.10      4     1811          4    48        13        0
z3-noodler-9dacadd-bdcb62d  1816    64  354.82   0.20   0.01   2.91      4     1812          4    33        27        0
cvc5-1.1.2                  1861    19   31.09   0.02   0.00   0.56     20     1841          2    17         0        0
z3-4.13.0                   1821    59  214.88   0.12   0.01   2.86     25     1796          0    59         0        0
--------------------------------------------------

Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  23746    99   3594.16   0.15   0.08   0.80  20375     3371          0    98         1        0
z3-noodler-9dacadd-bdcb62d  23753    92   4553.94   0.19   0.10   1.66  20382     3371          0    90         2        0
cvc5-1.1.2                  23826    19   6384.00   0.27   0.07   2.48  20406     3420          0    19         0        0
z3-4.13.0                   22858   987  54817.37   2.40   0.12  10.23  19587     3271          0   987         0        0
--------------------------------------------------

Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0  16805   163   1042.35   0.06   0.01   0.67   2484    14321        117    46         0        0
z3-noodler-9dacadd-bdcb62d  16805   163   1651.36   0.10   0.01   1.78   2492    14313        125    38         0        0
cvc5-1.1.2                  16963     5   5095.10   0.30   0.01   1.42   2521    14442          0     5         0        0
z3-4.13.0                   16729   239  20278.49   1.21   0.01   6.90   2473    14256          0   239         0        0
--------------------------------------------------

Benchmark snia
# of formulae: 70
--------------------------------------------------
tool                          ✅    ❌    time     avg     med     std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-41c6e48-7c3f5d0    70     0    0.59    0.01    0.01    0.00     70        0          0     0         0        0
z3-noodler-9dacadd-bdcb62d    70     0    0.71    0.01    0.01    0.00     70        0          0     0         0        0
cvc5-1.1.2                    70     0    0.00    0.00    0.00    0.00     70        0          0     0         0        0
z3-4.13.0                      0    70    0.00  nan     nan     nan         0        0          0     0         0       70
--------------------------------------------------

image

I think that by turning model generation off for LIA, we became faster. I will need to run Z3 without model generation to see how it behaves too.

jurajsic commented 2 months ago

Note that there are still some incorrect models and some bugs I need to handle, but I want to first merge all these changes.