FFmpeg - Githubissues

michael-schwarz commented 2 years ago

FFmpeg (https://github.com/FFmpeg/FFmpeg)

./configure --disable-x86asm --disable-ffplay --disable-ffprobe --disable-inline-asm --disable-doc
compiledb make
CIL issue: int i, ret, __attribute__((unused))(version), nb_curves; (https://github.com/goblint/cil/issues/76) and https://github.com/goblint/cil/issues/78
./goblint ../../bench-repos/FFmpeg --disable ana.base.context.non-ptr --disable sem.unknown_function.spawn --set ana.thread.domain plain --enable exp.earlyglobs --set ana.base.privatization none --enable dbg.print_dead_code --set cppflags[+] -DGOBLINT_NO_BSEARCH --set cppflags[+] -DGOBLINT_NO_ASSERT --set cppflags[+] -UHAVE_INLINE_ASM --set result json-messages -v
Remaining issue seems to be that the wrong main is picked fftools/ffmpeg.c:4825: Warning: def'n of func main at fftools/ffmpeg.c:4825 (sum 10757610) conflicts with the one at doc/print_options.c:121 (sum 2656015); keeping the one at doc/print_options.c:121.
- Manually remove doc/print_options.c from compilation database
Requires https://github.com/goblint/analyzer/pull/603 or turning on merging of inlines

michael-schwarz commented 2 years ago

A first run on these benchmarks (merging of inlines enabled) is thoroughly disappointing:

Goblint runs for approx 3.5h on this benchmark, gobbling up ~140GB of RAM
vars: 331270, evals: 498201 (with 17245 contexts for 960 functions. Top 5 functions:)

An overwhelming majority of the code is dead once again

Found dead code on 394320 lines (including 393571 in uncalled functions)!
Total lines (logical LoC): 410450

We seem to produce so many warnings that trying to write them out in json produces a stack overflow in List.map
Global initializers take a non-trivial amount of time (Initializing 10815 globals. Executing 552721 assigns.)

Timings:

TOTAL                          103.599 s
parse                          46.593 s
convert to CIL                 57.006 s
analysis                        0.000 s
makeCFG                         1.833 s
connect                         3.710 s
global_inits                   3893.714 s
solving                        7973.467 s
  S.Dom.equal                     0.423 s
  postsolver                     2942.472 s
warn_global                     0.451 s
  access                          0.413 s

Race Warnings:

Summary for all memory locations:
safe:         3359
vulnerable:    433
unsafe:       2091
-------------------
total:        5883

76 instances of [Warning][Unknown] Unknown function ptr called

sim642 commented 2 years ago

We seem to produce so many warnings that trying to write them out in json produces a stack overflow in List.map

We should use BatList.map whereever we have possibly large lists. That is implemented using constant stack (and without reversal!), unlike Stdlib.List.map, which is just naive.

michael-schwarz commented 2 years ago

The stackoverflow happens inside

      Yojson.Safe.pretty_to_channel ~std:true out json

(at https://github.com/goblint/analyzer/blob/36a5c5d20e75c06c979a46996c202ed7384ec0ef/src/framework/analyses.ml#L278). So just using BatList.map would not cut it as the call happens inside a library. But I guess this a problem for the future once we actually manage to analyze FFmpeg and get meaningful results.

sim642 commented 2 years ago

The stackoverflow happens inside
      Yojson.Safe.pretty_to_channel ~std:true out json
(at https://github.com/goblint/analyzer/blob/36a5c5d20e75c06c979a46996c202ed7384ec0ef/src/framework/analyses.ml#L278). So just using BateList.map would not cut it as the call happens inside a library.

Do you have a full stack trace of where inside the library? Might be worth reporting or fixing.

Other than that, maybe it's just specific to pretty_to_channel. Could be that the minimal output of to_channel doesn't suffer from the same issue.

michael-schwarz commented 2 years ago

I ran another run with the fix for non privatization and ctx insens base:

Timings:
TOTAL                          106.970 s
  parse                          47.924 s
  convert to CIL                 59.046 s
  analysis                        0.000 s
    makeCFG                         2.077 s
    connect                         4.217 s
    global_inits                   268.657 s
    solving                        99596.327 s
      S.Dom.equal                     5.553 s
      postsolver                     1454.500 s
    warn_global                     1.611 s
      access                          1.500 s
Timing used
Memory statistics: total=203083937.10MB, max=35359.30MB, minor=203055616.41MB, major=3504974.32MB, promoted=3476653.63MB
    minor collections=96826212  major collections=462 compactions=0

So we land at around 27h and 35GB of RAM here. We once again have the issue with the stackoverflow inside the json-messages output.

Another observation is that global inits is now a lot faster, presumably because of small local states?

Also what is a little disturbing is that despite all of this and the analysis now taking considerably longer, most code is still dead (it dead in fact not change at all):

Found dead code on 393400 lines (including 392979 in uncalled functions)!
Total lines (logical LoC): 410495

goblint / bench

FFmpeg #18