Open michael-schwarz opened 2 years ago
A first run on these benchmarks (merging of inlines enabled) is thoroughly disappointing:
Goblint runs for approx 3.5h on this benchmark, gobbling up ~140GB of RAM
vars: 331270, evals: 498201
(with 17245 contexts for 960 functions. Top 5 functions:
)
An overwhelming majority of the code is dead once again
Found dead code on 394320 lines (including 393571 in uncalled functions)!
Total lines (logical LoC): 410450
We seem to produce so many warnings that trying to write them out in json produces a stack overflow in List.map
Global initializers take a non-trivial amount of time (Initializing 10815 globals. Executing 552721 assigns.
)
Timings:
TOTAL 103.599 s
parse 46.593 s
convert to CIL 57.006 s
analysis 0.000 s
makeCFG 1.833 s
connect 3.710 s
global_inits 3893.714 s
solving 7973.467 s
S.Dom.equal 0.423 s
postsolver 2942.472 s
warn_global 0.451 s
access 0.413 s
Race Warnings:
Summary for all memory locations:
safe: 3359
vulnerable: 433
unsafe: 2091
-------------------
total: 5883
76 instances of [Warning][Unknown] Unknown function ptr called
- We seem to produce so many warnings that trying to write them out in json produces a stack overflow in
List.map
We should use BatList.map
whereever we have possibly large lists. That is implemented using constant stack (and without reversal!), unlike Stdlib.List.map
, which is just naive.
The stackoverflow happens inside
Yojson.Safe.pretty_to_channel ~std:true out json
(at https://github.com/goblint/analyzer/blob/36a5c5d20e75c06c979a46996c202ed7384ec0ef/src/framework/analyses.ml#L278). So just using BatList.map would not cut it as the call happens inside a library. But I guess this a problem for the future once we actually manage to analyze FFmpeg and get meaningful results.
The stackoverflow happens inside
Yojson.Safe.pretty_to_channel ~std:true out json
(at https://github.com/goblint/analyzer/blob/36a5c5d20e75c06c979a46996c202ed7384ec0ef/src/framework/analyses.ml#L278). So just using BateList.map would not cut it as the call happens inside a library.
Do you have a full stack trace of where inside the library? Might be worth reporting or fixing.
Other than that, maybe it's just specific to pretty_to_channel
. Could be that the minimal output of to_channel
doesn't suffer from the same issue.
I ran another run with the fix for non privatization and ctx insens base:
Timings:
TOTAL 106.970 s
parse 47.924 s
convert to CIL 59.046 s
analysis 0.000 s
makeCFG 2.077 s
connect 4.217 s
global_inits 268.657 s
solving 99596.327 s
S.Dom.equal 5.553 s
postsolver 1454.500 s
warn_global 1.611 s
access 1.500 s
Timing used
Memory statistics: total=203083937.10MB, max=35359.30MB, minor=203055616.41MB, major=3504974.32MB, promoted=3476653.63MB
minor collections=96826212 major collections=462 compactions=0
So we land at around 27h and 35GB of RAM here. We once again have the issue with the stackoverflow inside the json-messages
output.
Another observation is that global inits is now a lot faster, presumably because of small local states?
Also what is a little disturbing is that despite all of this and the analysis now taking considerably longer, most code is still dead (it dead in fact not change at all):
Found dead code on 393400 lines (including 392979 in uncalled functions)!
Total lines (logical LoC): 410495
FFmpeg (https://github.com/FFmpeg/FFmpeg)
./configure --disable-x86asm --disable-ffplay --disable-ffprobe --disable-inline-asm --disable-doc
compiledb make
int i, ret, __attribute__((unused))(version), nb_curves;
(https://github.com/goblint/cil/issues/76) and https://github.com/goblint/cil/issues/78./goblint ../../bench-repos/FFmpeg --disable ana.base.context.non-ptr --disable sem.unknown_function.spawn --set ana.thread.domain plain --enable exp.earlyglobs --set ana.base.privatization none --enable dbg.print_dead_code --set cppflags[+] -DGOBLINT_NO_BSEARCH --set cppflags[+] -DGOBLINT_NO_ASSERT --set cppflags[+] -UHAVE_INLINE_ASM --set result json-messages -v
fftools/ffmpeg.c:4825: Warning: def'n of func main at fftools/ffmpeg.c:4825 (sum 10757610) conflicts with the one at doc/print_options.c:121 (sum 2656015); keeping the one at doc/print_options.c:121.
doc/print_options.c
from compilation database