diffblue / 2ls

Static Analyzer and Verifier
http://www.cprover.org/2LS
Other
43 stars 22 forks source link

Array domain #167

Closed viktormalik closed 1 year ago

viktormalik commented 1 year ago

Introduce a new abstract domain for reasoning about contents of arrays.

Template form

Each (loop-back) array is split into multiple segments. All array items in a segment are abstracted with a single segment element variable for which we compute an invariant in one of the basic domains, based on the array elements type. Currently, interval and heap domains are supported for the inner domain.

Let us have an array variable a and two segment border indices i#1, i#2. If the indices can be ordered (i.e. i#1 < i#2), we create a segmentation:

{0} seg#0 {i#1} seg#1 {i#1+1} seg#2 {i#2} seg#3 {i#2+1} seg#4 {size of a}

If the indices cannot be ordered, we create two segmentations:

{0} seg#0 {i#1} seg#1 {i#1+1} seg#2 {size of a}
{0} seg#3 {i#2} seg#4 {i#2+1} seg#5 {size of a}

Template rows

All array elements from a segment seg#i are abstracted using a segment index variable idx#i and a segment element variable elem#i = a#lb[idx#i]. We let an inner domain to compute an invariant over elem#i using the domain's standard template, whose template row guards are extended with additional segment bounds. A template row for a segment {idx#low}seg#i{idx#high} has the form:

g#X && g#lsY && idx#i >= idx#low && idx#i < idx#high && elem#i = a#lb[idx#i] => T

where T is a template row over elem#i in the inner domain, e.g. elem#i <= CONST.

Choice of segment border indices

At the moment, we use two sources for the segment borders:

The list of written indices is determined using the array index analysis which collects expressions occurring on the LHS (written) and RHS (read) of array accesses.

Invariant projection

In order to be able to use the invariants computed over the segment elements variables, we project the invariants onto all read indices of the given array (the set of read indices is determined using the mentioned array index analysis). The projection is done by copying the relevant part of the template and replacing the segment index variable by the read index. This makes sure that the constraint expressed by the invariant is used for the read index.

Combination with zones

Besides simple domains for a single variable (intervals, heap), the array domain can be combined with the zones domain. If both domains are used, a specific differences are created:

Computation of array invariants takes considerably longer compared to other domains. The main reason is that there's a lot of constraints that the solver must count with (segment border constraints, invariant projection, ...). To make the template smaller, this introduces an expression dependence analysis which determines which expressions may depend (in a data pr control) way on each other. Then, only differences of potentially dependent expressions are created inside the array-zones combination.

Resolves #106.

Many currently handled programs are added as regressions tests. The test suite and clang-format should pass after every commit.

An important remaining feature is generation of witnesses from array invariants but I'd submit that as a separate PR.

viktormalik commented 1 year ago

@FrNecas you might want to have a look.

peterschrammel commented 1 year ago

@viktormalik, is this ready to merge?

viktormalik commented 1 year ago

Ready now. We should do a release after the merge.

peterschrammel commented 1 year ago

Ready now. We should do a release after the merge.

Can you please create a PR that bumps the version numbers?