Open arkocal opened 3 days ago
Nice catch!
@sim642: I think for `stringDomain.ml` this is simply an oversight from when we only had `Disjoint` and `Unit`? Or was there some reasoning behind it?
For the hashing in `cdomains/intDomain.ml`, I guess this comes from https://github.com/sim642/ppx_deriving_hash? Maybe something can also be tweaked there? IIRC, in the example that @arkocal had, the hashes for `T` and `Excluded {0}` were the same, which is maybe not ideal as both are quite common.
> I think for `stringDomain.ml` this is simply an oversight from when we only had `Disjoint` and `Unit`? Or was there some reasoning behind it?
That seems possible. All uses of `get_string_domain ()` there are in `if`s, not `match`es, so it's indeed error-prone when something is added.
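To illustrate why the `if`-based dispatch is fragile, here is a hypothetical sketch (the type and function names are illustrative, not Goblint's actual definitions): when a new variant is added, an exhaustive `match` triggers a compiler warning, whereas an `if` silently falls through to its `else` branch.

```ocaml
(* Hypothetical domain selector; Flat stands in for a later-added variant. *)
type string_domain = Unit | Disjoint | Flat

let get_string_domain () = Flat

(* if-style dispatch: Flat silently takes the Unit branch. *)
let describe_if () =
  if get_string_domain () = Disjoint then "disjoint" else "unit"

(* match-style dispatch: omitting the Flat case would raise warning 8. *)
let describe_match () =
  match get_string_domain () with
  | Disjoint -> "disjoint"
  | Unit -> "unit"
  | Flat -> "flat"

let () =
  assert (describe_if () = "unit");    (* wrong answer for Flat *)
  assert (describe_match () = "flat")  (* correct *)
```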
> In `cdomains/intDomain.ml`, add to line 3680, after `let to_yojson ...`:
>
> ```ocaml
> let hash = Hashtbl.hash
> ```
This is wrong because some of the int domains (e.g. def_exc, enums) contain OCaml `Set`s, where structural equality and hashing lead to problems: sets don't have a canonical representation, so equivalent sets may have different `Hashtbl.hash` values. So it may be fast, but wrong things can be arbitrarily fast.
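A small standalone sketch (not Goblint code) of the missing canonical representation: the same set built in two insertion orders yields different AVL trees, so structural comparison and the generic `Hashtbl.hash` can disagree with semantic set equality.

```ocaml
module IntSet = Set.Make (Int)

(* Build the same set in two insertion orders; AVL rebalancing
   produces different internal trees. *)
let s1 = List.fold_left (fun s x -> IntSet.add x s) IntSet.empty [1; 2; 3; 4]
let s2 = List.fold_left (fun s x -> IntSet.add x s) IntSet.empty [4; 3; 2; 1]

let () =
  assert (IntSet.equal s1 s2);  (* semantically the same set *)
  assert (s1 <> s2);            (* but structurally different trees *)
  (* hence the generic structural hash may differ for equal sets: *)
  Printf.printf "%d %d\n" (Hashtbl.hash s1) (Hashtbl.hash s2)
```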
> IIRC, in the example that @arkocal had, the hashes for `T` and `Excluded {0}` were the same, which is maybe not ideal as both are quite common.
I just tried (e.g. in base's `init`):

```ocaml
let i1 = ID.top_of IInt in
let i2 = ID.of_excl_list IInt [Z.zero] in
Logs.error "%d %d" (ID.hash i1) (ID.hash i2)
```

and this prints out different hash values

```
[Error] -75963065849016984 39336082039394930
```

at 0deedb9ca2aeb11d9e6562161f57eee1908af498.
Maybe I also misremembered some of the details. @arkocal do you still have the example at hand?
I am not sure whether the suggested hashes above are correct or performant. They are not meant as solution suggestions, just as illustrative examples of the source of collisions; sorry for the confusion.
Testing with `dd_comb.c` from coreutils, without overriding the hash for intDomain, I get the following:

```
%%% coll: Colliding variables:
 L:node 34387 "prev = (uintmax_t )prev_nread;" on dd.c:918:11-918:40
and
 L:node 34387 "prev = (uintmax_t )prev_nread;" on dd.c:918:11-918:40
%%% coll: Hash of a: 4529258937874703681,
 Hash of b: 4529258937874703681
%%% coll: Values are equal: false
```
Coming from:

```ocaml
let interesting = 4529258937874703681 in
let colliding = HM.keys rho |> List.of_enum |> List.filter (fun k -> S.Var.hash k = interesting) in
begin match colliding with
  | [a; b] ->
    if tracing then trace "coll" "Colliding variables:\n %a \nand\n %a" S.Var.pretty_trace a S.Var.pretty_trace b;
    if tracing then trace "coll" "Hash of a: %d,\n Hash of b: %d" (S.Var.hash a) (S.Var.hash b);
    if tracing then trace "coll" "Values are equal: %b" (S.Var.equal a b)
  | _ -> ()
end
```
Since the string representations are the same, this might also be an issue with `S.Var.equal`.

Enabling `dbg.trace.context` would print out the entire unknown with the context. The difference hopefully is visible there.
This is very strange; trying

```ocaml
let i1 = ID.top_of IInt in
let i2 = ID.of_excl_list IInt [Z.zero] in
Logs.error "%d %d" (ID.hash i1) (ID.hash i2)
```

outputs

```
[Error] -75963065849016984 -75963065849016984
```

on my machine at 0deedb9ca2aeb11d9e6562161f57eee1908af498.
With `dbg.trace.context`, the only difference is in `base.size`, with `⊤` vs. `(Not {0}([0,64]))`.
For completeness:

```
L:(node 34387 "prev = (uintmax_t )prev_nread;", [expRelation:(),
mallocWrapper:(wrapper call:Unknown node, unique calls:{}),
base:({
  Parameter {
    fd -> 0
    buf___1 -> {?, &(alloc@sid:35301@tid:[main])[⊤], &(alloc@sid:35315@tid:[main])[⊤]}
    size -> ⊤
  }
}, {}, {}, {}),
threadid:(wrapper call:unknown node, Thread:[main], created:(current function:created (once) * created (multiple times):(created (once):{maybe_close_stdout@dd.c:2015:3-2015:31, interrupt_handler@dd.c:736:5-736:109, siginfo_handler@dd.c:729:5-729:110}, created (multiple times):{maybe_close_stdout@dd.c:2015:3-2015:31, interrupt_handler@dd.c:736:5-736:109, siginfo_handler@dd.c:729:5-729:110}), callees:bot)),
threadflag:Multithreaded (main),
threadreturn:false,
escape:{},
mutexEvents:(),
access:(),
mutex:(lockset:{}, multiplicity:{}),
race:(),
mhp:(),
assert:(),
pthreadMutexType:()]) on dd.c:918:11-918:40
```

and

```
L:(node 34387 "prev = (uintmax_t )prev_nread;", [expRelation:(),
mallocWrapper:(wrapper call:Unknown node, unique calls:{}),
base:({
  Parameter {
    fd -> 0
    buf___1 -> {?, &(alloc@sid:35301@tid:[main])[⊤], &(alloc@sid:35315@tid:[main])[⊤]}
    size -> (Not {0}([0,64]))
  }
}, {}, {}, {}),
threadid:(wrapper call:unknown node, Thread:[main], created:(current function:created (once) * created (multiple times):(created (once):{maybe_close_stdout@dd.c:2015:3-2015:31, interrupt_handler@dd.c:736:5-736:109, siginfo_handler@dd.c:729:5-729:110}, created (multiple times):{maybe_close_stdout@dd.c:2015:3-2015:31, interrupt_handler@dd.c:736:5-736:109, siginfo_handler@dd.c:729:5-729:110}), callees:bot)),
threadflag:Multithreaded (main),
threadreturn:false,
escape:{},
mutexEvents:(),
access:(),
mutex:(lockset:{}, multiplicity:{}),
race:(),
mhp:(),
assert:(),
pthreadMutexType:()]) on dd.c:918:11-918:40
```
Do you maybe have different versions of `ppx_deriving_hash`? That would explain it.
I am getting the same values from a Docker build as well. Looking at the Dockerfile, it looks like my local state should not be creeping into the container.
`ppx_deriving_hash` is not at fault, because Goblint has a lower bound on the latest version; you couldn't even compile with a different one.

The problem is our fork of Zarith, where we have version 1.12 with custom patches: https://github.com/goblint/Zarith/tree/goblint-1.12. Meanwhile, Zarith 1.14 improved the hash functions (https://github.com/ocaml/Zarith/pull/150), but our fork hasn't been adapted. Thus, depending on Gobview opam pins, the hash function used for the zero in the exclusion set could be different. The unlucky case happens when `Z.hash Z.zero = 0`.
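To see why `Z.hash Z.zero = 0` is unlucky, here is an illustrative sketch (not Goblint's or `ppx_deriving_hash`'s actual combinator): if a set's hash folds element hashes with an operation whose identity is 0, as XOR-style combinators do, then an element hashing to 0 contributes nothing, so `{0}` becomes indistinguishable from the empty set. This could make `Excluded {0}` collide with a hash built from no excluded elements at all.

```ocaml
(* Illustrative XOR-fold set hash; 0 is the identity of lxor. *)
let set_hash elem_hash elems =
  List.fold_left (fun acc x -> acc lxor elem_hash x) 0 elems

(* Stand-in for an element hash where zero hashes to 0
   (the unlucky Zarith 1.12 case). *)
let unlucky_hash (z : int) = z

let () =
  (* {0} is indistinguishable from {} under this scheme. *)
  assert (set_hash unlucky_hash [] = set_hash unlucky_hash [0]);
  (* a non-zero element does perturb the hash: *)
  assert (set_hash unlucky_hash [] <> set_hash unlucky_hash [1])
```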
That patch probably needs to be adapted if we want to benefit from Zarith improvements and keep Gobview.
With Zarith 1.14 I get the same values, and the few remaining collisions have no obvious similarities.
Yeah, potentially GobView can be retired at some point; currently no one seems interested in it (correct me if I'm wrong). Though it feels like a waste to have invested so much effort and then not get a paper out of it.
Summary: with about 60k variables encountered, we have about 1.04 entries per hash. With an ideal hash function, the chance of even a single collision would have been very low. The experiment can be repeated using the `topdown` solver on the `arkocal/hash_collision_demo` branch, which includes a minimal module for counting how often each hash value appears. In particular, variables that differ by a single string are mapped to the same hash.
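The counting module on the branch is not reproduced here; the following is an assumed minimal sketch of what such a hash histogram could look like (`hash_histogram` and `entries_per_hash` are hypothetical names, not those used on the branch).

```ocaml
module IntMap = Map.Make (Int)

(* Count how often each hash value occurs among the given keys. *)
let hash_histogram hash keys =
  List.fold_left
    (fun acc k ->
       IntMap.update (hash k)
         (function None -> Some 1 | Some n -> Some (n + 1)) acc)
    IntMap.empty keys

(* Average entries per hash value: #keys / #distinct hash values;
   1.0 would mean no collisions at all. *)
let entries_per_hash hist n_keys =
  float_of_int n_keys /. float_of_int (IntMap.cardinal hist)

let () =
  (* toy example: string length as a (bad) hash *)
  let hist = hash_histogram String.length ["a"; "b"; "cc"] in
  assert (IntMap.cardinal hist = 2);
  assert (entries_per_hash hist 3 = 1.5)
```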
The following changes already significantly reduce collisions (1.0008 keys per hash, no obvious similarities between colliding keys):

- In `cdomains/stringDomain.ml` (https://github.com/goblint/analyzer/blob/master/src/cdomain/value/cdomains/stringDomain.ml):
- In `cdomains/intDomain.ml`, add to line 3680, after `let to_yojson ...`: