Array domain - Githubissues

viktormalik commented 1 year ago

Introduce a new abstract domain for reasoning about contents of arrays.

Template form

Each (loop-back) array is split into multiple segments. All array items in a segment are abstracted with a single segment element variable for which we compute an invariant in one of the basic domains, based on the array elements type. Currently, interval and heap domains are supported for the inner domain.

Let us have an array variable a and two segment border indices i#1, i#2. If the indices can be ordered (i.e. i#1 < i#2), we create a segmentation:

{0} seg#0 {i#1} seg#1 {i#1+1} seg#2 {i#2} seg#3 {i#2+1} seg#4 {size of a}

If the indices cannot be ordered, we create two segmentations:

{0} seg#0 {i#1} seg#1 {i#1+1} seg#2 {size of a}
{0} seg#3 {i#2} seg#4 {i#2+1} seg#5 {size of a}

Template rows

All array elements from a segment seg#i are abstracted using a segment index variable idx#i and a segment element variable elem#i = a#lb[idx#i]. We let an inner domain to compute an invariant over elem#i using the domain's standard template, whose template row guards are extended with additional segment bounds. A template row for a segment {idx#low}seg#i{idx#high} has the form:

g#X && g#lsY && idx#i >= idx#low && idx#i < idx#high && elem#i = a#lb[idx#i] => T

where T is a template row over elem#i in the inner domain, e.g. elem#i <= CONST.

Choice of segment border indices

At the moment, we use two sources for the segment borders:

all indices written in the given loop - this allows to handle standard array iteration loops - the invariant typically states that some property holds for all elements that have already been visited
values of the above indices from before the loop - this allows to handle arrays that are not iterated from 0

The list of written indices is determined using the array index analysis which collects expressions occurring on the LHS (written) and RHS (read) of array accesses.

Invariant projection

In order to be able to use the invariants computed over the segment elements variables, we project the invariants onto all read indices of the given array (the set of read indices is determined using the mentioned array index analysis). The projection is done by copying the relevant part of the template and replacing the segment index variable by the read index. This makes sure that the constraint expressed by the invariant is used for the read index.

Combination with zones

Besides simple domains for a single variable (intervals, heap), the array domain can be combined with the zones domain. If both domains are used, a specific differences are created:

a difference between an array segment and a scalar if one of them is updated in a loop (see commit message of 96ef1174322a8ebcfed4161027b31b6b74bf3e1a for a detailed description) - allows to verify, e.g. loops computing an array maximum
a difference between an array element and its index - allows to reason about arrays whose element value is related to the element's index
a difference between segments of two arrays - allows to verify, e.g. array copying loops

Computation of array invariants takes considerably longer compared to other domains. The main reason is that there's a lot of constraints that the solver must count with (segment border constraints, invariant projection, ...). To make the template smaller, this introduces an expression dependence analysis which determines which expressions may depend (in a data pr control) way on each other. Then, only differences of potentially dependent expressions are created inside the array-zones combination.

Resolves #106.

Many currently handled programs are added as regressions tests. The test suite and clang-format should pass after every commit.

An important remaining feature is generation of witnesses from array invariants but I'd submit that as a separate PR.

viktormalik commented 1 year ago

@FrNecas you might want to have a look.

peterschrammel commented 1 year ago

@viktormalik, is this ready to merge?

viktormalik commented 1 year ago

Ready now. We should do a release after the merge.

peterschrammel commented 1 year ago

Ready now. We should do a release after the merge.

Can you please create a PR that bumps the version numbers?

diffblue / 2ls

Array domain #167

Template form

Template rows

Choice of segment border indices

Invariant projection

Combination with zones