VeriFIT / z3-noodler

The Z3-Noodler String Solver
Other
9 stars 5 forks source link

Small cleaning in `final_check_eh` #152

Closed jurajsic closed 4 months ago

jurajsic commented 5 months ago

This PR

jurajsic commented 5 months ago

Full results:

# of formulae: 103388
--------------------------------------------------
tool                            ✅    ❌       time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ------  ----  ---------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2991234-d95fe13  102326  1062   13411.99   0.13   0.02   1.58  62899    39427        625   387        50        0
z3-noodler-69644b5-d95fe13  102301  1087   11661.04   0.11   0.01   1.57  62875    39426        625   411        51        0
cvc5-1.1.2                   99769  3619   76478.24   0.77   0.00   6.77  60868    38901          2  3608         9        0
z3-4.13.0                    97687  5701  132036.99   1.35   0.01   8.11  58874    38813        210  4960       461       70
--------------------------------------------------

image

Selected interesting benchmarks:

Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2991234-d95fe13  23744   101   6147.38   0.26   0.13   1.66  20373     3371          0    99         2        0
z3-noodler-69644b5-d95fe13  23754    91   4670.05   0.20   0.10   1.81  20383     3371          0    89         2        0
cvc5-1.1.2                  23826    19   6384.00   0.27   0.07   2.48  20406     3420          0    19         0        0
z3-4.13.0                   22858   987  54817.37   2.40   0.12  10.23  19587     3271          0   987         0        0
--------------------------------------------------

Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2991234-d95fe13  16810   158   1732.24   0.10   0.01   1.32   2497    14313        125    33         0        0
z3-noodler-69644b5-d95fe13  16776   192   1570.36   0.09   0.01   1.31   2463    14313        125    67         0        0
cvc5-1.1.2                  16963     5   5095.10   0.30   0.01   1.42   2521    14442          0     5         0        0
z3-4.13.0                   16729   239  20278.49   1.21   0.01   6.90   2473    14256          0   239         0        0
--------------------------------------------------
jurajsic commented 5 months ago

It seems there is significant slow down for pyex, but I do not know why.

vhavlena commented 4 months ago

Did you find out what happened? The effect on Pyex is not negligible.

jurajsic commented 4 months ago

Did you find out what happened? The effect on Pyex is not negligible.

Looks like loop protection did something also for sat instances, which is weird, so I changed it to a flag - if we returned sat once, we immediately return always sat again. Seems like it works correctly, might be problem for model generation, but I will sort it when the problem arises.

The results

# of formulae: 103388
--------------------------------------------------
tool                            ✅    ❌       time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ------  ----  ---------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-af1b8c8-d95fe13  102340  1048   10940.96   0.11   0.02   1.54  62914    39426        625   373        50        0
z3-noodler-69644b5-d95fe13  102301  1087   11661.04   0.11   0.01   1.57  62875    39426        625   411        51        0
cvc5-1.1.2                   99769  3619   76478.24   0.77   0.00   6.77  60868    38901          2  3608         9        0
z3-4.13.0                    97687  5701  132036.99   1.35   0.01   8.11  58874    38813        210  4960       461       70
--------------------------------------------------

image

Intereseting benchmarks:

Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-af1b8c8-d95fe13  23756    89   4175.03   0.18   0.08   1.72  20385     3371          0    87         2        0
z3-noodler-69644b5-d95fe13  23754    91   4670.05   0.20   0.10   1.81  20383     3371          0    89         2        0
cvc5-1.1.2                  23826    19   6384.00   0.27   0.07   2.48  20406     3420          0    19         0        0
z3-4.13.0                   22858   987  54817.37   2.40   0.12  10.23  19587     3271          0   987         0        0
--------------------------------------------------

Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-af1b8c8-d95fe13  16811   157   1347.03   0.08   0.01   1.20   2498    14313        125    32         0        0
z3-noodler-69644b5-d95fe13  16776   192   1570.36   0.09   0.01   1.31   2463    14313        125    67         0        0
cvc5-1.1.2                  16963     5   5095.10   0.30   0.01   1.42   2521    14442          0     5         0        0
z3-4.13.0                   16729   239  20278.49   1.21   0.01   6.90   2473    14256          0   239         0        0
--------------------------------------------------