Closed: sim642 closed this issue 3 years ago
This seems to be related to merging files, because the witness generation/SV-COMP mode of Goblint also analyzes `includes/sv-comp.c`. Both seem to add the builtin functions again in CIL, but with somehow overlapping ID ranges.
If I also enable `custom_libc`, which analyzes `includes/lib.c` as a third file, then there will be three copies of each builtin function.
Alright, I've figured out what exactly is happening:
- […] `Cabs2cil.environment` with those IDs.
- […] `Cabs2cil.environment` with those IDs.
- […] `Cabs2cil.environment` with those IDs (at least I think).
- […] `Partial.globally_unique_vids`, which renumbers every used variable ID into [0, X]. (Even though we're about to remove `Partial`, Goblint will have a copy of this simple renumbering functionality: https://github.com/goblint/analyzer/pull/194/files.)
- […] `mutable` record field, the `varinfo`s are physically equal to some of the `EnvVar` data inside `Cabs2cil.environment`, which through this secret connection is automagically updated to match the new IDs, even though we never go back to fix it to be consistent!

But the copies of `varinfo` that were ignored for constructing the merged file were not renumbered and their data inside `Cabs2cil.environment` still has their original IDs in [0, C]. And some of those might just happen to overlap with the renumbered ones in [0, X], which causes the duplication.
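The aliasing effect described above can be reproduced in isolation. The following is a minimal sketch, not CIL's actual code: the `varinfo` record, the `environment` table, `make_var` and `renumber` are simplified stand-ins invented for illustration, and the renumbering simply assigns fresh IDs starting from 0 to the variables kept in the merged file.

```ocaml
(* Simplified stand-in for CIL's varinfo; not the real definition. *)
type varinfo = { vname : string; mutable vid : int }

(* Hypothetical stand-in for Cabs2cil.environment, keyed by name. *)
let environment : (string, varinfo) Hashtbl.t = Hashtbl.create 16

(* Create a varinfo with the next increasing ID and record it. *)
let make_var : string -> varinfo =
  let next = ref 0 in
  fun name ->
    let v = { vname = name; vid = !next } in
    incr next;
    Hashtbl.add environment name v;
    v

(* Two files declare the same builtin; merging keeps only one copy. *)
let builtin_kept = make_var "__builtin_bswap32"    (* created with vid 0 *)
let builtin_dropped = make_var "__builtin_bswap32" (* created with vid 1 *)
let i = make_var "i"                               (* created with vid 2 *)

(* Renumber only the varinfos that are used in the merged file. *)
let renumber (used : varinfo list) : unit =
  List.iteri (fun n v -> v.vid <- n) used

let () = renumber [ builtin_kept; i ]

(* Names of all environment entries that currently carry the given vid.
   Hashtbl.fold also visits bindings hidden by a later Hashtbl.add. *)
let names_with_vid (vid : int) : string list =
  Hashtbl.fold
    (fun _ v acc -> if v.vid = vid then v.vname :: acc else acc)
    environment []
  |> List.sort_uniq compare

(* The dropped builtin kept its stale vid 1, which now collides with i. *)
let () = List.iter print_endline (names_with_vid 1)
```

Because the table stores the very same record that `renumber` mutates, the entry for `i` changes without the table ever being touched, while the dropped copy keeps its old ID and ends up sharing it with `i`.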
I'm surprised that this issue only showed up now, because AFAIK it has been possible ever since #17 exposed the internal mapping and also in our svcomp21 build. It requires some extraordinary coincidence though: the order of things in a hashtable must happen to be such that the duplicates happen to be the wrong way around exactly for the variable relevant for a witness invariant.
I think the easiest solution is to just not do `Partial.globally_unique_vids` or the like.
> I think the easiest solution is to just not do `Partial.globally_unique_vids` or the like.
This had me worried a bit, since we do rely on having unique `vid`s in a couple of places. But then it seems like this is enforced by CIL anyway by assigning increasing ids. So it seems like `Partial.globally_unique_vids` only ensures that the `vid`s are consecutive, which is not something we care about.
The only place where we would run into issues, I guess, is in `loadBinaryFile`, but I don't think we use this in Goblint.
https://github.com/goblint/cil/blob/7181c1396efa8c423ba7ec8a946c7f3043ca8cfb/src/cil.ml#L5219-L5233
So I guess it is safe to remove `globally_unique_vids` altogether.
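The difference between unique and consecutive IDs can be made concrete with a small sketch. The counter below is a hypothetical stand-in written for illustration, not CIL's own code: IDs handed out from a single increasing counter stay distinct even when only some of the variables end up being used, they just stop being gap-free.

```ocaml
(* Hypothetical global counter; each call returns a fresh, larger ID. *)
let next_vid = ref 0

let fresh_vid () : int =
  let id = !next_vid in
  incr next_vid;
  id

(* Six variables get created, but only some survive into the result. *)
let created = List.init 6 (fun _ -> fresh_vid ())
let used = List.filter (fun id -> id mod 2 = 0) created

(* Uniqueness: no ID occurs twice. *)
let all_distinct ids =
  List.length (List.sort_uniq compare ids) = List.length ids

(* Consecutiveness: the sorted IDs form a range without gaps. *)
let is_consecutive ids =
  match List.sort compare ids with
  | [] -> true
  | first :: _ as sorted ->
    sorted = List.init (List.length sorted) (fun k -> first + k)

let () =
  Printf.printf "distinct: %b, consecutive: %b\n"
    (all_distinct used) (is_consecutive used)
```

Under these assumptions a renumbering pass only buys the second property, which matches the observation that nothing here depends on it.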
> This had me worried a bit, since we do rely on having unique `vid`s in a couple of places. But then it seems like this is enforced by CIL anyway by assigning increasing ids. So it seems like `Partial.globally_unique_vids` only ensures that the `vid`s are consecutive, which is not something we care about.
As far as I can see, this should be the case as long as all `varinfo`s are appropriately created by the CIL functions. Goblint git history shows that `Partial.globally_unique_vids` has been there since the beginning, so there's no additional context about why it was ever necessary.
> The only place where we would run into issues, I guess, is in `loadBinaryFile`, but I don't think we use this in Goblint.
I also noticed this when I was looking through how CIL generates the `vid`s, but I don't understand what's special about the number 11. It may be some ancient hard-coded relic, because CIL adds a lot (hundreds) of builtin function declarations. And `globally_unique_vids` would renumber things independently of `nextGlobalVID`, so it doesn't seem related anyway.
I'm just thinking now that maybe `Cabs2cil.environment` should say somewhere that it exposes just the mapping for initially CIL-generated `varinfo`s and their numbering.
Also #17 added and exposed `Cabs2cil.varnameMapping`, which somehow also seems to be related to CIL renaming variables. But it's entirely based on variable names as strings and holds no connection to `varinfo`s or their `vid`s, so I thought it'd be safer to avoid trying to undo the renaming just on names (which don't necessarily have to be unique, right?) and use the more low-level environment instead. What's the intended connection between these things though?
It is indeed superfluous and I removed it in #32.
I started looking at https://github.com/goblint/analyzer/issues/195 and immediately found a witness generation issue that was unrelated to my `HoareMap`.
Running the following generates `witness.graphml` where invariants use the variable name `__builtin_bswap32` instead of `i`:
Bisecting the issue leads me to this: https://github.com/goblint/analyzer/pull/169, but as far as I can see, that has nothing to do with the problem, which seems to be somewhat nondeterministic instead.
The variable names in witness generation undergo a replacement step, which replaces CIL's renamed stuff with their original values (https://github.com/goblint/analyzer/blob/419679cebdc9e667c8821783b988da05958176bc/src/domains/invariantCil.ml). This is based on #17.
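To see why a lookup of original names by ID is fragile, here is a minimal sketch. It is not the code in `invariantCil.ml`: `build_reverse`, `original_name` and the `(name, vid)` entries are hypothetical and assume the replacement goes through a table from `vid` to original name. With two entries sharing a `vid`, whichever is inserted last wins, so the result depends on iteration order.

```ocaml
(* Build a hypothetical reverse table: vid -> original name.
   Hashtbl.replace overwrites, so a later duplicate vid wins. *)
let build_reverse (entries : (string * int) list) : (int, string) Hashtbl.t =
  let tbl = Hashtbl.create 16 in
  List.iter (fun (name, vid) -> Hashtbl.replace tbl vid name) entries;
  tbl

(* Look up the original name for a vid, if any. *)
let original_name (tbl : (int, string) Hashtbl.t) (vid : int) : string option =
  Hashtbl.find_opt tbl vid

(* Two made-up entries with different names but the same vid. *)
let order_a = [ ("__builtin_bswap32", 1); ("i", 1) ]
let order_b = List.rev order_a

(* Same entries, different traversal order, different answer. *)
let name_a = original_name (build_reverse order_a) 1
let name_b = original_name (build_reverse order_b) 1
```

In this sketch `name_a` resolves to `i` and `name_b` to `__builtin_bswap32`, which is the kind of order dependence that would make the wrong name appear only occasionally.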
If I do the following in Goblint to print out `Cabs2cil.environment`:
Then I find out that the mapping contains `varinfo`s with different names but duplicate `vid`s and that's the cause for the wrong variable name in the witness: