Open dergoegge opened 2 months ago
I tried reverting the multiprocessing changes for profile accumulation, since those involve storing all profiles in memory multiple times.
With that, fuzz-introspector still runs out of memory, but at a later stage, during the creation of the HTML report (I think?):
2024-09-30 15:30:44.639 INFO debug_info - extract_all_functions_in_debug_info: Extracting functions
2024-09-30 15:30:45.420 INFO debug_info - extract_all_functions_in_debug_info: Extracting functions
2024-09-30 15:30:48.702 INFO debug_info - extract_all_functions_in_debug_info: Extracting functions
2024-09-30 15:30:49.420 INFO debug_info - extract_all_functions_in_debug_info: Extracting functions
2024-09-30 15:30:52.701 INFO debug_info - extract_all_functions_in_debug_info: Extracting functions
2024-09-30 15:30:52.969 INFO debug_info - load_debug_all_yaml_files: Set base loader to use CSafeLoader
2024-09-30 17:53:40.388 INFO debug_info - load_debug_all_yaml_files: Set base loader to use CSafeLoader
/usr/local/bin/compile: line 333: 20421 Killed python3 /fuzz-introspector/src/main.py report $REPORT_ARGS
ERROR:__main__:Building fuzzers failed.
Building project failed
From dmesg:
[Mon Sep 30 19:43:29 2024] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-fcf7e99325ff30022b7afc78c1455e2f3d9244091aae82afc2dceace340e0b05.scope,task=python3,pid=3224215,uid=0
[Mon Sep 30 19:43:29 2024] Out of memory: Killed process 3224215 (python3) total-vm:130907700kB, anon-rss:129127460kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:253660kB oom_score_adj:0
[Mon Sep 30 19:43:32 2024] oom_reaper: reaped process 3224215 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Apologies for the delay here @dergoegge -- I'm looking at reducing the memory overhead!
Context
Currently, Bitcoin Core has ~200 fuzz harnesses which are compiled into one big binary `fuzz`. Selecting a harness for fuzzing is done by setting the `FUZZ` environment variable, e.g. `FUZZ=minisketch ./fuzz`. Internally this works by having a global map from harness names to harness functions and looking up the requested harness at startup.

In oss-fuzz we hack around this (since oss-fuzz requires individual binaries per harness) by copying the original fuzz binary for all our harnesses and manually editing each copy to hardcode a specific harness to be fuzzed (see https://github.com/google/oss-fuzz/blob/173cae95c4d1810ee56484a8e63e0cf6f990afeb/projects/bitcoin-core/build.sh#L83-L106). In the end, each of the binaries still includes a hardcoded runtime lookup of a harness function.
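For illustration, here is a minimal sketch of the registration-and-lookup pattern described above. This is not Bitcoin Core's actual code; names like `FuzzTargets`, `RegisterFuzzTarget` and `MinisketchFuzz` are made up for the example.

```cpp
// Minimal sketch of the "one binary, many harnesses" pattern described above.
// Names (FuzzTargets, RegisterFuzzTarget, MinisketchFuzz) are illustrative,
// not Bitcoin Core's actual identifiers.
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <map>
#include <string>

using FuzzTargetFn = std::function<void(const uint8_t*, size_t)>;

// Global registry: harness name -> harness function.
static std::map<std::string, FuzzTargetFn>& FuzzTargets()
{
    static std::map<std::string, FuzzTargetFn> targets;
    return targets;
}

// Each harness registers itself during static initialization.
struct RegisterFuzzTarget {
    RegisterFuzzTarget(std::string name, FuzzTargetFn fn)
    {
        FuzzTargets().emplace(std::move(name), std::move(fn));
    }
};

// One example harness out of ~200.
static void MinisketchFuzz(const uint8_t* data, size_t size)
{
    (void)data;
    (void)size;
    // ... feed data/size into the code under test ...
}
static RegisterFuzzTarget g_register_minisketch{"minisketch", MinisketchFuzz};

// The harness to run is only known at runtime, via the FUZZ env var.
static FuzzTargetFn g_selected_target;

extern "C" int LLVMFuzzerInitialize(int* /*argc*/, char*** /*argv*/)
{
    const char* fuzz = std::getenv("FUZZ");
    const auto it = fuzz ? FuzzTargets().find(fuzz) : FuzzTargets().end();
    if (it == FuzzTargets().end()) {
        std::cerr << "Set FUZZ to a registered harness name\n";
        std::exit(1);
    }
    g_selected_target = it->second;
    return 0;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    // Indirect call through the map: static analysis cannot tell which
    // harness (and therefore which code) is reachable from here.
    g_selected_target(data, size);
    return 0;
}
```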
Our understanding is that these runtime lookups prevent introspector from statically determining which code is reachable by a given harness. This is reflected by the introspector report produced by oss-fuzz: https://storage.googleapis.com/oss-fuzz-introspector/bitcoin-core/inspector-report/20240911/fuzz_report.html. It only detects one harness and can't analyze it properly (e.g. the call tree: https://storage.googleapis.com/oss-fuzz-introspector/bitcoin-core/inspector-report/20240911/calltree_view_0.html).
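For contrast, here is a hedged sketch of what one of the split-out, per-harness binaries (the direction taken by the PR linked in the next section) might look like. Because the call target is fixed at compile time, a static call-graph analysis can follow it.

```cpp
// Sketch of a standalone, per-harness binary with no runtime lookup.
// Illustrative only; the actual split in the PR below may look different.
#include <cstddef>
#include <cstdint>

// The code under test, assumed to be defined elsewhere in the build.
void MinisketchFuzz(const uint8_t* data, size_t size);

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    // Direct call: the reachable code is visible statically, so
    // fuzz-introspector can construct a call tree for this harness.
    MinisketchFuzz(data, size);
    return 0;
}
```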
Issue
I've started work on splitting up Bitcoin Core's monolithic `fuzz` binary into individual binaries that no longer include the runtime lookups (https://github.com/bitcoin/bitcoin/pull/30882), so that we can properly start using introspector. The problem now is that introspector runs out of memory while generating the report (the machine has 128 GB of RAM).

I've been using this branch from my personal oss-fuzz fork and invoke the introspector build with `python3 infra/helper.py introspector bitcoin-core --seconds 5` (this uses the corpora available in the image and also fuzzes for an additional 5 seconds).

From dmesg:
I've tried limiting the whole process to one CPU core, in the hope that avoiding parallelism would also avoid the excessive memory use, but that didn't work out.