Closed michael-schwarz closed 3 years ago
From the results table, this is with octApron that's currently on master (https://github.com/goblint/analyzer/commit/a06a611c), not traces-relational.
Apparently we have an octApron regression test that doesn't actually work with octApron, but is based on signextension-1: https://github.com/goblint/analyzer/blob/847352e74a1f388f5a23074dd9bd16c47f85c547/tests/regression/36-octapron/08-problem-signed-unsigned-overflow.c.
From the results table, this is with octApron that's currently on master (a06a611), not traces-relational.
Yes, probably easier to first fix these bugs as I guess they are easier to debug than those that depend on privatization.
Those that fail with Exception(Failure)
are also related to octApron directly it seems;
Fatal error: exception Failure("Environment.dim_of_var: unknown variable in the environment")
I looked at the first Apron Manager exception (missing variable in environment) one and it boils down to this:
double copysign_double(double x, double y) {
return x;
}
int main() {
// oct is top here, so entry state of copysign_double is also top (doesn't contain x')
copysign_double(0, 0); // works when combine has no lval
// double z = copysign_double(0, 0); // doesn't work when combine has lval
}
Many of the Failure exceptions might be for related reasons, just on different operations, which raise a different exception.
The analysis (already from poly) contains some logic for function calls, which introduces primed variables for formal arguments and tries to relate the return value to those I guess. I'm not fully sure what the logic here is or if it works, so I wanted @ivanazuzic to look into it. It seems like some kind of transformer for function summarization?
What's also a bit unusual about function calls is that the contexts are all or nothing:
https://github.com/goblint/analyzer/blob/cdbfd67d59c4112f8f09bd22b1f4d31e37432aac/src/analyses/apron/octApron.apron.ml#L19
Not sure how this fits into the rest, because my intuition for them would be to just remove the caller's locals and pass on the rest. This likely needs to be looked over, because for traces-relational, we would want to pass the Priv
module state through functions.
Also most of these exceptions seem to come from floating point tasks, which we really don't care about right now (although these function call things might also pop up for integers I suppose?). Currently octApron does some stuff with floating point (because Apron supports it), but I think we also disabled it via option (maybe that's the cause for the exceptions).
In traces-relational I actually totally ignore (and probably break) floating point variables right now, because they're annoying to deal with (Apron environments keep separate int and float vars). I'm tempted to say we should completely rip the float vars support out of it to simplify things. Especially since the description of octApron explicitly mentions just integers: https://github.com/goblint/analyzer/blob/cdbfd67d59c4112f8f09bd22b1f4d31e37432aac/src/analyses/apron/octApron.apron.ml#L1
I'm tempted to say we should completely rip the float vars support out of it to simplify things.
At least for now, I 100% agree. Looking towards SVCOMP'22, we might want to add some support back later.
At least for now, I 100% agree. Looking towards SVCOMP'22, we might want to add some support back later.
Sure, but such support would probably have to start with the non-relational support in base anyway. Especially since Apron doesn't seem to be aware of NaN possibility, which is why we disabled floats anyway.
It's possible some issues are already fixed on traces-relational
as well (e.g., https://github.com/goblint/analyzer/commit/749491625a474638297904b9886f7744c2f7ac48), so when trying to fix these, it makes sense to try that branch as well if there's already a partial fix to cherry-pick.
What's also a bit unusual about function calls is that the contexts are all or nothing: https://github.com/goblint/analyzer/blob/cdbfd67d59c4112f8f09bd22b1f4d31e37432aac/src/analyses/apron/octApron.apron.ml#L19
That line about context was changed by Ralf here
Context used to be bottom and he made it depend on exp.full-context
.
I guess it might be fine. The consequence is that if a function is called with different arguments x < y
and x > y
, then they won't get separate contexts (both use bottom octagon), so the cases get joined together. It might matter in some case?
Anyway, I think most of the Apron manager exceptions (and maybe Failure exceptions) might be gone now after #253.
I spent the better part of Friday looking at why this super simple benchmark https://github.com/sosy-lab/sv-benchmarks/blob/99d37c5b4072891803b9e5c154127c912477f705/c/loop-acceleration/simple_2-1.c#L13-L21 went from 0.0785s to timing out after 60s. It turned out to be a pretty wild ride.
I quickly realized that this timeout with octApron only happened in SV-COMP mode, which was a bad sign. Tracing the non-terminating analysis revealed that the wonderful dependency sets in PathSensitive3
were just growing and growing, by having seemingly duplicate values in a (Hoare) set. So I fully expected this polymorphic compare
of OctApronDomain
to be the issue:
https://github.com/goblint/analyzer/blob/b5e988d75476dc8316bdcec7d7a1cdebb402f7b5/src/cdomains/apron/octApronDomain.apron.ml#L101
And there's no A.compare
from Apron...
So I tried some weird implementations for it like:
let compare x y =
if equal x y then
0
else
Stdlib.compare x y
And it didn't change a thing, even though equal
is properly implemented there (including for our special bot
and top
, which the problematic duplicate value contained). This made no sense (the duplicate elements should have collapsed by this), so in a moment of desperation I changed the implementation to failwith "asd"
... and it never raised the exception despite the fact that PathSensitive3
clearly is relying on compare
!
Then I tried disabling the HashconsLifter
andHashconsContextLifter
and noticed that it only reached the failure when I disabled at least one of the two lifters. This made sense because hashconsing redefines compare
using its own unique tag
s, so if both are hashconsed, then the inner compare
doesn't really matter. As long as hash
and equal
behave correctly... and they did (both implemented using Apron's respective functions).
Hashconsing strikes again! So I used its equal_debug
and saw this output about equal elements having different tags:
https://github.com/goblint/analyzer/blob/b5e988d75476dc8316bdcec7d7a1cdebb402f7b5/src/domains/printable.ml#L155-L156
AFAIK this is something that's supposed to be related to loading the marshaled results for incremental analysis (and relift
) but I was doing neither, so how could hashconsing be wrong then? I started to suspect the worst: the Weak
memory management stuff inside hashconsing doesn't work properly with the memory management of Apron in C or something crazy like that.
I kept adding more and more debug prints to see what's going on around hashconsing and realized the following: the abstract value during compare
was different from the abstract value that was initially lifted into hashconsing to get the tag. So the failing debug equal
comparison saw two equal abstract values, but they weren't the abstract values that were lifted. The abstract value had been mutated in-place!
Eventually I found the place where this happened: https://github.com/goblint/analyzer/blob/b5e988d75476dc8316bdcec7d7a1cdebb402f7b5/src/cdomains/apron/octApronDomain.apron.ml#L567-L568
The _with
version of the Apron function mutates the abstract value in place, which is here then also returned. But this also mutates the abstract value given as argument to this transfer function and that happened to be physically equal to the abstract value stored for hashconsing. So a follow-up assign caused the hashconsed value to change and break everything.
The issue is now fixed on master (https://github.com/goblint/analyzer/commit/15b7b8d741fdc742879f7b70aa9f7dd7f2207517) and that particular SV-COMP benchmark with octApron terminates properly now. So the moral of the story is: imperative programming is the root of all evil.
What is strange is that after the fixes introduced to master for octApron, the performance on sv-comp test decreased by quite a lot: 173 test where we had the correct verdict before now yield EXCEPTION (Apron.Manager.Error)
. :thinking:
On the bright side, the number of timeouts decreased by quite a bit.
A diff of a run without octagons, with the octagons in ga06a611c, and with the octagons on current master. comparison.zip
What is strange is that after the fixes introduced to master for octApron, the performance on sv-comp test decreased by quite a lot: 173 test where we had the correct verdict before now yield
EXCEPTION (Apron.Manager.Error)
. thinking
Thanks for doing an updated SV-COMP run! I'll look into those. I think it's possible that the mutation also caused unsoundness by changing the stored octagon at some other node and then didn't cover some possible abstract executions that reveal other problems with Apron environment mismatches.
EDIT: It seems to be another hashconsing-mutability issue somehow, because when __VERIFIER_nondet_uint
is called, octApron gives it a top entry state in an empty environment, but when the side effect reaches the solver then the state is top over cond'
, which corresponds to a formal argument of __VERIFIER_assert
.
EDIT 2: Ahh! The OctApronDomain
has an identity crisis (pun intended!): equal
is defined through is_top
https://github.com/goblint/analyzer/blob/db84904d27d11c5a5d05183eea77c40785813c23/src/cdomains/apron/octApronDomain.apron.ml#L88-L93
which considers the top value with any environment to be top
https://github.com/goblint/analyzer/blob/db84904d27d11c5a5d05183eea77c40785813c23/src/cdomains/apron/octApronDomain.apron.ml#L37-L40
but the actual top
is defined to be top with the empty environment. These things don't match up.
So top (env: [||])
is considered equal to top (env: [|0> cond':int|])
by both equal
(which I didn't change) and hash
(which I changed to match equal
, otherwise the hashing correctness condition is violated). Therefore hashconsing equates them and gives the value with the wrong environment in some places.
@michael-schwarz Can you do an SV-COMP run from this branch: https://github.com/goblint/analyzer/tree/octapron-manager-error? I attempted to fix the weird implicitly lifted bot
and top
by giving them a special unique environment and (hopefully) making all comparisons sensitive to the environment.
Sure!
@sim642 This seems to resolve all these issues! comparison-fix-manager.zip
Great! I merged the branch into master.
I noticed that besides the unsoundness issues mentioned in the issue, there's also a bunch of suspicious cases of TIMEOUTs turning into unknown, especially in float-newlib. My hope is that they're also doing some integer casting, which we analyze unsoundly right now, but we'll see if they remain after the unsoundness fixes.
I just made a quick fix (on traces-relational) to the first three unsound cases by checking the ranges of cast types in octApron: https://github.com/goblint/analyzer/commit/4af0c8e9853681879a6e569f147ab882fc095cae.
The fourth one was already fixed on traces-relational. I've done so much refactoring and fixing of octApron there, that I probably fixed it indirectly at some point.
I'll keep these unsoundness fixes on traces-relational, but would soon also like to merge the current part of that branch back to master to prevent octApron from deviating so much.
The surprising time improvement on float-newlib didn't change due to the casting fix, but still seemed suspicious. Indeed there was an unsound reason for it: at one point octApron got the ikind without unrolling the typedefs in the type. Having not done that, it fell into a no-op case, where an assignment (left-hand side invalidation) was ignored: https://github.com/goblint/analyzer/commit/d2b7948fce1e7bbad259d779151b72ae6f470c53?w=1.
These should be fixed after merging #291.
I ran SV-Comp with OctApron, for the following tests we reported true contrary to how the task is labelled:
The base analysis does not have that problem.
Apart from that the changes are:
I'll link to the result in a comment on Slack, GitHub doesn't let me add HTML files.