goblint / analyzer

Static analysis framework for C
https://goblint.in.tum.de
MIT License
160 stars 72 forks source link

Activate abortUnless in svcomp conf #1464

Closed sim642 closed 1 month ago

sim642 commented 1 month ago

This is on top of #1462 to fix #1453, but the following benchmarking was also done with #1450 included.

In https://github.com/goblint/analyzer/pull/875#issuecomment-1302204561, when the abortUnless analysis was added, it didn't seem to pay off. Now with Apron and autotuning, it seems that it might now.

sv-benchmarks no-overflow

With 60s timeout, we gain 92 new correct trues. There are also 3 new TIMEOUTs to be investigated.

CPU time linear scale

Always activating abortUnless has ~1% overhead, which is negligible. image

CPU time log scale

Visible differences are only at the low end and the slowdown cases are precisely those nla-digbench tasks where we become more precise because of this, which is expected. image

TODO

New TIMEOUTs to investigate:

sim642 commented 1 month ago

The timeouts are related to __VERIFIER_assert also being identified as an abortUnless function. Apron analysis refining according to that somehow breaks things.

sim642 commented 1 month ago

The recursified_knuth timeout is acceptable. It is caused by new refinement from assume_abort_if_not(a > 0); which somehow causes an explosion of contexts. This happens even with just def_exc, no intervals or anything of infinite height is needed. Tracing shows contexts increasing like Not {160}, ..., Not {240}, etc. So one exclusion in the context leads to a different exclusion in the recursive call.

This is a more fundamental problem that's not caused by abortUnless. Context widening fixes the timeout, but our svcomp conf doesn't use it (probably for more precision on recursive tasks that finish in time).

michael-schwarz commented 1 month ago

Context widening fixes the timeout, but our svcomp conf doesn't use it (probably for more precision on recursive tasks that finish in time).

This might the sort of thing where the context gas ⛽ may pay off.

michael-schwarz commented 1 month ago

What's the holdup here? If no one else wants to review it, I'd suggest we merge it.

sim642 commented 1 month ago

The holdup is that this depends on #1462 to prevent everything from just crashing. And that in itself depends on #1470 being in place first to avoid other regressions.