goblint / analyzer

Static analysis framework for C
https://goblint.in.tum.de
MIT License
160 stars 72 forks source link

Implement delayed widening #1483

Open RonaldJudin opened 1 month ago

RonaldJudin commented 1 month ago

This implements standard widening delay in-domain as suggested by https://github.com/goblint/analyzer/pull/1442#issuecomment-2092763639 and https://github.com/goblint/analyzer/pull/1442#issuecomment-2092998075. The two separate options and analysis lifters allow different delays to be used for locally and globally. Moreover, local widening delays are applied per-path, which is not possible at the solver level at all.

The domain functor WideningDelay.Dom is generic and could be used as fine-grained as desired. For example, a specific analysis could only apply it to its own local or global abstract values, not for all analyses simultaneously. It could even be applied to only some parts of composed abstract domains. It lives in a completely separate file, not to clutter any existing solvers, domains or analysis lifters.

TODO

sim642 commented 1 month ago
  • This, as we had discussed in Munich previously, avoids propagating high gas values outside of loops, which might lead to precision loss.

Was the precision loss about high gas values outside of loops or their avoidance (which is the opposite)?

  • Widening is only applied after the solver detects things as wpoints and thereby relies on the solver performing a widening between the previous value and the new one, rather than, e.g., performing widening between contributions from incoming edges.

What do you mean by widening between incoming edges? #1442 is in the solver and doesn't even know about different incoming edges, it can only see the joined values, so what's the difference?

What irks us a little is that this somehow mashes together the question of specifying the constraint system of interest with the process of computing (post-)solutions, which is evidenced by the fact that the gas components are not checked when performing leq checks. This information thus strikes us more to be a self-observation of the solver than something that ought to be reflected in the specification.

It's not the specification though. The lifted abstract values behave exactly as the unlifted ones, which is why leq shouldn't be affected. Widening by its very concept is something that affects the computation process, rather than its soundness, etc, so this isn't putting anything new in there. By that logic widenings shouldn't be in domains at all.

To go to extremes to make the point clear, putting stable or point into the constraint system, would - while technically possible - be a questionable choice.

To go to the other extreme, putting widen into the constraint system is also a questionable choice. But we aren't going to either extremity here, so that's really besides the point. By the way, this questionable choice has been proposed from TUM before: https://link.springer.com/chapter/10.1007/978-3-642-38088-4_12.


I have to emphasize, this is the standard way to implement widening delays. It's precisely what Astree does:

We now use a local solution. Each piece of abstract information (such as an interval, an octagon, a filter predicate, etc.) is fitted with a freshness counter that allows regulating when this piece of information is widened. [...] The main advantage of freshness counters is that each piece of abstract information is dealt with separately and they are not necessarily widened at the same time. This is very important to analyze some variables that depend on each others. [...] It is also possible to select the widening frequency on a per abstract domain basis.

In an earlier paper, they explicitly mention switching to this as opposed to some more global counting strategy:

The following approach supersedes the previous one: Each abstract property (or set of abstract properties) is fitted with a freshness indicator.

This is also what Frama-C does: https://git.frama-c.com/pub/frama-c/-/blob/fe41a5490c3933fe18b9dbd4075f2f03f926714a/src/plugins/eva/partitioning/trace_partitioning.ml#L63.

This and #1442 aren't mutually exclusive. This just adds to Goblint what everyone else already has and has it completely in a separate file, not bothering any existing solver, domain or analysis code to do so.


This implementation is very much in alignment how various things in Goblint already are:

michael-schwarz commented 1 month ago

Was the precision loss about high gas values outside of loops or their avoidance (which is the opposite)?

The consideration was that not resetting the counter would lead to precision loss at the code fragments located after a widening point.

What do you mean by widening between incoming edges? https://github.com/goblint/analyzer/pull/1442 is in the solver and doesn't even know about different incoming edges, it can only see the joined values, so what's the difference?

This observation was not about a difference between #1442 and this. Rather it was an observation about what the solvers need to be fulfill for this to work (which TD3 does, but not necessarily all solvers must do). #1442 is integrated into one specific solver, so the question about what a solver must be like for it to work doesn't pose itself there.

michael-schwarz commented 1 week ago

@Red-Panda64 has some benchmarks here, where this seems to be worse when it comes to performance then #1442. Those might be of interest.

Red-Panda64 commented 6 days ago

Well, yes, widening delay seems to be a bit slower. It times out 141 times where #1442 does not. Conversely, #1442 times out 10 times where widening delay does not. Here are the plots comparing the unreach-call and valid-memsafety categories, where the majority of the difference was found.

1442 is on the x-axis, #1483 on the y-axis.

unreach-call: x-widening-gas-y-widening-delay-unreach

valid-memsafety: x-widening-gas-y-widening-delay-memsafety

The benchmarks ran on the same machine with specs: Intel Xeon Platinum 8260 CPU @ 2.40GHz, cores: 96, frequency: 3900 MHz, Turbo Boost: enabled; RAM: 540649 MB However, @stilscher said that bench-exec was slightly differently configured.

Anyway, it is good to see that both implementations agree on everything and produce the same new true results.