VeriFIT / z3-noodler

The Z3-Noodler String Solver
Other
6 stars 5 forks source link

Do not remove the same equation twice in preprocessing #173

Closed jurajsic closed 1 week ago

jurajsic commented 1 week ago

The same equation but with different sides could be removed by remove_regular (for equation with one var on both sides) or by skip_len_sat preprocessing rules, which would cause a cycle in the inclusion graph for model generation.

jurajsic commented 1 week ago

Results:

# of formulae: 103405
--------------------------------------------------
tool                                  ✅    ❌       time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ------  ----  ---------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        102432   973   12422.91   0.12   0.02   1.74  62943    39489        530   379        64        0
z3-noodler-f44faf4-17ffaf6        102439   966   12031.70   0.12   0.02   1.60  62952    39487        530   371        65        0
z3-noodler-model-1cae537-17ffaf6  102387  1018    8669.92   0.08   0.02   0.65  62906    39481        534   444         0       40
z3-noodler-model-f44faf4-17ffaf6  102372  1033    8512.32   0.08   0.02   0.64  62891    39481        548   445         0       40
cvc5-1.2.0                         99841  3564   73861.19   0.74   0.01   6.42  60931    38910          2  3553         9        0
z3-4.13.0                          97745  5660  128712.47   1.32   0.01   7.95  58927    38818        211  4914       465       70
--------------------------------------------------

Benchmark sygus_qgen
# of formulae: 343
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6         343     0    4.62   0.01   0.01   0.01    343        0          0     0         0        0
z3-noodler-f44faf4-17ffaf6         343     0    4.57   0.01   0.01   0.01    343        0          0     0         0        0
z3-noodler-model-1cae537-17ffaf6   343     0    3.83   0.01   0.01   0.00    343        0          0     0         0        0
z3-noodler-model-f44faf4-17ffaf6   343     0    4.05   0.01   0.01   0.00    343        0          0     0         0        0
cvc5-1.2.0                         343     0   80.99   0.24   0.04   0.93    343        0          0     0         0        0
z3-4.13.0                          343     0   10.35   0.03   0.02   0.02    343        0          0     0         0        0
--------------------------------------------------

Benchmark denghang
# of formulae: 999
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6         999     0   43.22   0.04   0.02   0.64    186      813          0     0         0        0
z3-noodler-f44faf4-17ffaf6         999     0   42.63   0.04   0.02   0.60    186      813          0     0         0        0
z3-noodler-model-1cae537-17ffaf6   999     0   40.03   0.04   0.02   0.57    186      813          0     0         0        0
z3-noodler-model-f44faf4-17ffaf6   999     0   39.68   0.04   0.02   0.55    186      813          0     0         0        0
cvc5-1.2.0                         982    17  371.55   0.38   0.00   4.64    170      812          0    17         0        0
z3-4.13.0                          883   116  326.28   0.37   0.08   2.10    186      697          0   105        11        0
--------------------------------------------------

Benchmark automatark
# of formulae: 15995
--------------------------------------------------
tool                                 ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  -----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        15990     5   747.84   0.05   0.01   0.53  10477     5513          0     5         0        0
z3-noodler-f44faf4-17ffaf6        15990     5   741.49   0.05   0.01   0.51  10477     5513          0     5         0        0
z3-noodler-model-1cae537-17ffaf6  15990     5   727.48   0.05   0.01   0.48  10477     5513          0     5         0        0
z3-noodler-model-f44faf4-17ffaf6  15990     5   724.77   0.05   0.01   0.48  10477     5513          0     5         0        0
cvc5-1.2.0                        15903    92  2196.84   0.14   0.00   2.77  10460     5443          0    84         8        0
z3-4.13.0                         15871   124  6107.03   0.38   0.02   3.72  10469     5402          0   123         1        0
--------------------------------------------------

Benchmark stringfuzz
# of formulae: 11618
--------------------------------------------------
tool                                 ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        11616     2    322.55   0.03   0.01   0.20   6354     5262          1     1         0        0
z3-noodler-f44faf4-17ffaf6        11616     2    325.89   0.03   0.01   0.18   6354     5262          1     1         0        0
z3-noodler-model-1cae537-17ffaf6  11615     3    300.65   0.03   0.01   0.17   6353     5262          1     2         0        0
z3-noodler-model-f44faf4-17ffaf6  11616     2    364.37   0.03   0.01   0.58   6354     5262          1     1         0        0
cvc5-1.2.0                        10946   672  30049.07   2.75   0.00  12.34   6184     4762          0   671         1        0
z3-4.13.0                         11090   528  44638.05   4.03   0.01  14.89   6004     5086          0   403       125        0
--------------------------------------------------

Benchmark redos
# of formulae: 3287
--------------------------------------------------
tool                                ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        3285     2   2633.81   0.80   0.03   3.77   2445      840          0     2         0        0
z3-noodler-f44faf4-17ffaf6        3285     2   2620.30   0.80   0.03   3.77   2445      840          0     2         0        0
z3-noodler-model-1cae537-17ffaf6  3281     6   1462.40   0.45   0.03   1.61   2441      840          0     6         0        0
z3-noodler-model-f44faf4-17ffaf6  3281     6   1448.69   0.44   0.03   1.57   2441      840          0     6         0        0
cvc5-1.2.0                        1116  2171  27007.04  24.20   0.14  34.28    620      496          0  2171         0        0
z3-4.13.0                          888  2399    297.55   0.34   0.01   2.60     48      840          0  2099       300        0
--------------------------------------------------

Benchmark norn
# of formulae: 1027
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        1027     0   14.38   0.01   0.01   0.01    712      315          0     0         0        0
z3-noodler-f44faf4-17ffaf6        1027     0   14.50   0.01   0.01   0.01    712      315          0     0         0        0
z3-noodler-model-1cae537-17ffaf6  1027     0   13.15   0.01   0.01   0.01    712      315          0     0         0        0
z3-noodler-model-f44faf4-17ffaf6  1027     0   13.21   0.01   0.01   0.01    712      315          0     0         0        0
cvc5-1.2.0                         943    84  247.71   0.26   0.01   2.41    705      238          0    84         0        0
z3-4.13.0                          904   123  278.96   0.31   0.02   5.21    712      192          0   123         0        0
--------------------------------------------------

Benchmark slog
# of formulae: 1976
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        1976     0   27.56   0.01   0.01   0.02    808     1168          0     0         0        0
z3-noodler-f44faf4-17ffaf6        1976     0   28.37   0.01   0.01   0.02    808     1168          0     0         0        0
z3-noodler-model-1cae537-17ffaf6  1976     0   24.40   0.01   0.01   0.02    808     1168          0     0         0        0
z3-noodler-model-f44faf4-17ffaf6  1976     0   24.88   0.01   0.01   0.02    808     1168          0     0         0        0
cvc5-1.2.0                        1976     0    1.85   0.00   0.00   0.00    808     1168          0     0         0        0
z3-4.13.0                         1956    20  539.58   0.28   0.01   3.81    788     1168          0    20         0        0
--------------------------------------------------

Benchmark slent
# of formulae: 1128
--------------------------------------------------
tool                                ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        1125     3   244.15   0.22   0.01   2.99    627      498          0     3         0        0
z3-noodler-f44faf4-17ffaf6        1125     3   245.89   0.22   0.01   3.02    627      498          0     3         0        0
z3-noodler-model-1cae537-17ffaf6  1123     5    96.84   0.09   0.01   1.15    625      498          0     5         0        0
z3-noodler-model-f44faf4-17ffaf6  1112    16    96.48   0.09   0.01   1.14    614      498         11     5         0        0
cvc5-1.2.0                        1104    24  1002.26   0.91   0.00   6.12    618      486          0    24         0        0
z3-4.13.0                         1052    76   345.30   0.33   0.01   4.17    571      481          0    76         0        0
--------------------------------------------------

Benchmark omark
# of formulae: 17
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6           9     8    0.13   0.01   0.01   0.02      2        7          0     6         2        0
z3-noodler-f44faf4-17ffaf6           9     8    0.15   0.02   0.01   0.01      2        7          0     6         2        0
z3-noodler-model-1cae537-17ffaf6     8     9    0.11   0.01   0.01   0.01      1        7          1     7         0        1
z3-noodler-model-f44faf4-17ffaf6     8     9    0.15   0.02   0.01   0.01      1        7          1     7         0        1
cvc5-1.2.0                           5    12    1.03   0.21   0.00   0.46      2        3          0    12         0        0
z3-4.13.0                            7    10    8.06   1.15   0.01   3.01      2        5          0     8         2        0
--------------------------------------------------

Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6         584     3   15.91   0.03   0.01   0.07    284      300          0     3         0        0
z3-noodler-f44faf4-17ffaf6         584     3   16.04   0.03   0.01   0.07    284      300          0     3         0        0
z3-noodler-model-1cae537-17ffaf6   584     3   49.08   0.08   0.01   0.17    284      300          0     3         0        0
z3-noodler-model-f44faf4-17ffaf6   584     3   49.63   0.08   0.02   0.17    284      300          0     3         0        0
cvc5-1.2.0                         347   240    0.86   0.00   0.00   0.00     50      297          0   240         0        0
z3-4.13.0                          278   309   25.95   0.09   0.02   0.90    135      143          0   283        26        0
--------------------------------------------------

Benchmark woorpje
# of formulae: 809
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6         777    32  411.26   0.53   0.01   6.15    615      162          0    13        19        0
z3-noodler-f44faf4-17ffaf6         776    33  302.00   0.39   0.01   4.74    614      162          0    14        19        0
z3-noodler-model-1cae537-17ffaf6   770    39  127.66   0.17   0.02   1.59    608      162          2    28         0        9
z3-noodler-model-f44faf4-17ffaf6   770    39  126.60   0.16   0.02   1.58    608      162          2    29         0        8
cvc5-1.2.0                         755    54   47.07   0.06   0.01   0.25    591      164          0    54         0        0
z3-4.13.0                          782    27  295.18   0.38   0.02   4.74    618      164          0    27         0        0
--------------------------------------------------

Benchmark webapp
# of formulae: 681
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6         365   316  221.00   0.61   0.02   5.33    137      228        315     0         1        0
z3-noodler-f44faf4-17ffaf6         361   320   17.68   0.05   0.02   0.24    133      228        315     0         5        0
z3-noodler-model-1cae537-17ffaf6   362   319  141.27   0.39   0.02   3.38    135      227        316     2         0        1
z3-noodler-model-f44faf4-17ffaf6   357   324   35.04   0.10   0.02   1.16    130      227        318     1         0        5
cvc5-1.2.0                         588    93  767.89   1.31   0.02   8.73    165      423          0    93         0        0
z3-4.13.0                          451   230  317.75   0.70   0.02   6.71     37      414        121   109         0        0
--------------------------------------------------

Benchmark kaluza
# of formulae: 19432
--------------------------------------------------
tool                                 ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  -----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        19233   199  1590.31   0.08   0.01   2.39  16136     3097          0   185        14        0
z3-noodler-f44faf4-17ffaf6        19245   187  1324.71   0.07   0.01   1.95  16151     3094          0   173        14        0
z3-noodler-model-1cae537-17ffaf6  19224   208   531.98   0.03   0.01   0.52  16131     3093          0   204         0        4
z3-noodler-model-f44faf4-17ffaf6  19230   202   379.59   0.02   0.01   0.29  16138     3092          0   199         0        3
cvc5-1.2.0                        19426     6   459.94   0.02   0.00   0.42  16297     3129          0     6         0        0
z3-4.13.0                         19139   293  1428.86   0.07   0.01   1.71  16022     3117          0   293         0        0
--------------------------------------------------

Benchmark transducer_plus
# of formulae: 91
--------------------------------------------------
tool                                ✅    ❌    time     avg     med     std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6           0    91    0.00  nan     nan     nan         0        0         91     0         0        0
z3-noodler-f44faf4-17ffaf6           0    91    0.00  nan     nan     nan         0        0         91     0         0        0
z3-noodler-model-1cae537-17ffaf6     0    91    0.00  nan     nan     nan         0        0         91     0         0        0
z3-noodler-model-f44faf4-17ffaf6     0    91    0.00  nan     nan     nan         0        0         91     0         0        0
cvc5-1.2.0                          42    49  482.14   11.48    3.11   22.06     41        1          0    49         0        0
z3-4.13.0                            1    90    0.01    0.01    0.01  nan         0        1         90     0         0        0
--------------------------------------------------

Benchmark leetcode
# of formulae: 2652
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        2649     3   52.31   0.02   0.01   0.06    865     1784          0     0         3        0
z3-noodler-f44faf4-17ffaf6        2648     4   52.81   0.02   0.01   0.05    864     1784          0     0         4        0
z3-noodler-model-1cae537-17ffaf6  2642    10   48.03   0.02   0.01   0.02    858     1784          0     0         0       10
z3-noodler-model-f44faf4-17ffaf6  2641    11   47.91   0.02   0.01   0.02    857     1784          0     0         0       11
cvc5-1.2.0                        2652     0   36.00   0.01   0.00   0.03    867     1785          0     0         0        0
z3-4.13.0                         2652     0   41.89   0.02   0.01   0.11    867     1785          0     0         0        0
--------------------------------------------------

Benchmark str_small_rw
# of formulae: 1880
--------------------------------------------------
tool                                ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        1814    66  400.52   0.22   0.01   3.15      4     1810          6    35        25        0
z3-noodler-f44faf4-17ffaf6        1815    65  443.10   0.24   0.01   3.27      4     1811          6    38        21        0
z3-noodler-model-1cae537-17ffaf6  1811    69  161.17   0.09   0.01   0.92      4     1807          6    48         0       15
z3-noodler-model-f44faf4-17ffaf6  1811    69  191.59   0.11   0.02   1.13      3     1808          7    50         0       12
cvc5-1.2.0                        1860    20   34.43   0.02   0.00   0.62     19     1841          2    18         0        0
z3-4.13.0                         1821    59  202.53   0.11   0.01   2.66     25     1796          0    59         0        0
--------------------------------------------------

Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool                                 ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        23754    91   4317.89   0.18   0.10   1.26  20383     3371          0    91         0        0
z3-noodler-f44faf4-17ffaf6        23752    93   4278.47   0.18   0.10   1.14  20381     3371          0    93         0        0
z3-noodler-model-1cae537-17ffaf6  23750    95   4019.24   0.17   0.10   0.71  20379     3371          0    95         0        0
z3-noodler-model-f44faf4-17ffaf6  23744   101   3997.50   0.17   0.10   0.72  20373     3371          0   101         0        0
cvc5-1.2.0                        23820    25   6002.12   0.25   0.07   2.51  20400     3420          0    25         0        0
z3-4.13.0                         22895   950  55228.54   2.41   0.12  10.38  19624     3271          0   950         0        0
--------------------------------------------------

Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool                                 ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6        16816   152   1374.70   0.08   0.01   1.56   2495    14321        117    35         0        0
z3-noodler-f44faf4-17ffaf6        16818   150   1572.33   0.09   0.01   1.86   2497    14321        117    33         0        0
z3-noodler-model-1cae537-17ffaf6  16812   156    921.87   0.05   0.01   0.46   2491    14321        117    39         0        0
z3-noodler-model-f44faf4-17ffaf6  16813   155    967.34   0.06   0.01   0.54   2492    14321        117    38         0        0
cvc5-1.2.0                        16963     5   5072.37   0.30   0.01   1.39   2521    14442          0     5         0        0
z3-4.13.0                         16732   236  18620.60   1.11   0.01   6.38   2476    14256          0   236         0        0
--------------------------------------------------

Benchmark snia
# of formulae: 70
--------------------------------------------------
tool                                ✅    ❌    time     avg     med     std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------------  ----  ----  ------  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1cae537-17ffaf6          70     0    0.75    0.01    0.01    0.00     70        0          0     0         0        0
z3-noodler-f44faf4-17ffaf6          70     0    0.77    0.01    0.01    0.00     70        0          0     0         0        0
z3-noodler-model-1cae537-17ffaf6    70     0    0.73    0.01    0.01    0.00     70        0          0     0         0        0
z3-noodler-model-f44faf4-17ffaf6    70     0    0.84    0.01    0.01    0.00     70        0          0     0         0        0
cvc5-1.2.0                          70     0    0.03    0.00    0.00    0.00     70        0          0     0         0        0
z3-4.13.0                            0    70    0.00  nan     nan     nan         0        0          0     0         0       70
--------------------------------------------------

Note the reduction of unknowns between z3-noodler-model-f44faf4-17ffaf6 and z3-noodler-model-1cae537-17ffaf6, those were the formulae where we thought that inclusion graph contains cycle but it was just caused by preprocessing. There are only 3-4 formulae where we fail with model generation because of cycle now (and they are real non-chain-free formulae):

QF_S/20240318-omark/noodles-unsat-10.smt2
QF_SLIA/20230329-woorpje-lu/track05/05_track_169.smt2
QF_SLIA/20230329-woorpje-lu/track05/05_track_25.smt2
QF_SLIA/20230403-webapp/lan-rep-all/lan_replace_all100.smt2

The last one might fail because of replace_all operator and not because it is not chain-free.

jurajsic commented 1 week ago

The scatter plot: image The old version is down.

jurajsic commented 1 week ago

Good catch with the if-else, there was a degradation, newest results:

# of formulae: 103405
--------------------------------------------------
tool                            ✅    ❌       time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ------  ----  ---------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-b7e4646-17ffaf6  102442   963   12295.38   0.12   0.02   1.70  62953    39489        530   372        61        0
z3-noodler-1cae537-17ffaf6  102432   973   12422.91   0.12   0.02   1.74  62943    39489        530   379        64        0
z3-noodler-f44faf4-17ffaf6  102439   966   12031.70   0.12   0.02   1.60  62952    39487        530   371        65        0
cvc5-1.2.0                   99841  3564   73861.19   0.74   0.01   6.42  60931    38910          2  3553         9        0
z3-4.13.0                    97745  5660  128712.47   1.32   0.01   7.95  58927    38818        211  4914       465       70
--------------------------------------------------

Comparison with if-else version: image Comparison with previous version: image

vhavlena commented 1 week ago

Nice, ready to merge for me.