goblint / bench

The benchmark suite
4 stars 5 forks source link

Other attempted projects #14

Open michael-schwarz opened 2 years ago

michael-schwarz commented 2 years ago
michael-schwarz commented 2 years ago

git.fixed-cil.zip

Edit: This once again is hodgepodge of different executables, so analyzing it is not really meaningful.

michael-schwarz commented 2 years ago

For FFmpeg with https://github.com/goblint/cil/pull/80/ and corresponding fix in Goblint, CIL succeeds to parse and combine everything. However, the analysis then gets stuck at:

Adding constructors to: main
And now...  the Goblin!
Startfuns: [main]
Exitfuns: []
Otherfuns: []
Using new format for phases!
Activated analyses for phase 0: expRelation, base, threadid, threadflag, threadreturn, escape, mutex, access, mallocWrapper, mhp
Activated transformations for phase 0: 
Generating the control flow graph.

using up >250GB of RAM. To me, it seems like there is something wrong in the CFG generation here then.

sim642 commented 2 years ago

To me, it seems like there is something wrong in the CFG generation here then.

It would be useful to make Goblint temporarily print the function name whose CFG is being constructed, so we can quickly find a minimal example I could look at.

michael-schwarz commented 2 years ago

The problem more is that we have Number of globals 510848. CFG generation does not freeze but become increasingly slow and RAM usage goes up to ridiculous numbers ~220GB before it also runs out of RAM on the server and dies.

sim642 commented 2 years ago

That's the number of globals, but surely they aren't most functions, but just variables.

michael-schwarz commented 2 years ago

Still, the numbers of functions is also at least >20k (at which point I aboated it)

An interesting observation is that the check whether everything is connected dominates the actual computation of CFGs by a factor of 200:

    makeCFG                         1.047 s
    connect                        171.697 s
michael-schwarz commented 2 years ago

@sim642 wrote:

The check whether everything is connected dominates the actual computation of CFGs by a factor fo 200:

    makeCFG                         1.047 s
    connect                        171.697 s

Also what program is this on where it has terminated?

sim642 commented 2 years ago

Since the_silver_searcher is on master in gobpie-demos, I looked at the races Goblint finds from it. At least half of them are simply because of assignments to unknown pointer, which through some fabulous logic distributes the write access to every field of every global. These might again be due to our missing realloc (https://github.com/goblint/analyzer/issues/701) handling. Given the "quality" of the_silver_searcher, I suspect there are actual races to be found as well, although there's also too much noise.

karoliineh commented 9 months ago

I tried to analyze the git repository, once again. I tried it with generating a compilation database with bear. As there are a lot of different main functions in the project, @sim642 suggested that one might try to compile the commands separately. Therefore I tried analyzing the following 3 commands:

Just for fun, the first thing that came to mind:

NO_OPENSSL=1 NO_CURL=1 bear -- make git-cherry-pick
goblint compile_commands.json

Actually has pthread_create:

NO_OPENSSL=1 NO_CURL=1 bear -- make -j 11 git-grep
goblint compile_commands.json

Maybe analyzing some of the "simplest" commands with the simplest conf won't run out of memory?:

NO_OPENSSL=1 NO_CURL=1 bear -- make -j 11 git-add
goblint compile_commands.json --conf /analyzer/conf/examples/large-program.json

However, in all of my attempted runs, Goblint got killed (most possibly due to out-of-memory). @sim642 suggested that maybe one could try and run it on a server with more resources.

michael-schwarz commented 9 months ago

If the sever is the bottleneck: If you write a script with the steps, I can run it on our sever with 512GB of RAM, though that is the biggest one we have.

sim642 commented 9 months ago

I tried this overnight on our server with 32GB RAM:

NO_OPENSSL=1 NO_CURL=1 bear -- make -j 15 git-grep
goblint --conf examples/large-program.json compile_commands.json 

It mysteriously died after 3h20min: no "Killed", exit code 0, just stopped with no explanation.

Today I tried it again with some profiling and it crashed after the same time, this time explicitly "Killed". So feel free to try it on your own but I don't have too high expectations. At the moment of death Goblint was at this point:

runtime: 03:20:37.661
vars: 7080440, evals: 29235491

|rho|=7080440
|stable|=6572163
|infl|=7078192
|wpoint|=74592
|sides|=424929
|side_dep|=0
|side_infl|=0
|var_messages|=0
|rho_write|=0
|dep|=6226162
|called|=228
Found 415057 contexts for 1822 functions. Top 5 functions:
29556   contexts for L:entry state of vsnprintf (5042244) on /usr/include/x86_64-linux-gnu/bits/stdio2.h:81:1-87:1
24175   contexts for L:entry state of strbuf_vaddf (11685) on git/strbuf.c:393:1-412:1
18211   contexts for L:entry state of memcpy (5042287) on /usr/include/x86_64-linux-gnu/bits/string_fortified.h:25:1-31:1
16294   contexts for L:entry state of append_quoted_string (5089741) on git/json-writer.c:19:1-45:1
12448   contexts for L:entry state of jw_object_string (2135970) on git/json-writer.c:158:1-162:1

Memory statistics: total=21971776.09MB, max=35359.31MB, minor=21933689.68MB, major=401256.95MB, promoted=363170.54MB
    minor collections=10459247  major collections=121 compactions=0

Even with the large-program conf which has almost no analyses and nothing but pointers in contexts, there are still tens of thousands of contexts for some functions. This is with earlyglobs where the globals aren't in the local states/contexts either. According to the profiling, over 1h of the runtime was just spent on some silly die function (which can recurse!). In fact, there seems to be a lot of recursion in git. I also tried with --enable ana.context.widen and it seemed to make things even worse by never even getting out of analyzing some very first initialization function. There must be some stupidity happening in Goblint if the majority of time is spent analyzing error cases where git dies instead of the actual meat of git.

sim642 commented 9 months ago

I moved the discussion about git to a separate issue (#62) and hid those comments from here.