5hadowblad3 / Beacon_artifact

Research artifact for Oakland (S&P) 2022, "BEACON: Directed Grey-Box Fuzzing with Provable Path Pruning"
Apache License 2.0

Segfault #5

Closed vannussina closed 10 months ago

vannussina commented 11 months ago

Hi, when I used Titan for one of my fuzzing targets, the static analysis crashed with a segfault. Since Titan's precondInfer is inherited from Beacon, I'm posting this here. After some further investigation I found that the crash occurs in AbstractState.cpp, line 236, in the function AbstractState::set:

if (valRange.empty() && memRange.empty()) {
  const Value *placeHodler = *valKeys.begin();
  valRng.insert({placeHodler, Interval::createTop(placeHodler->getType()->getIntegerBitWidth(), true)});
}

gdb showed that the variable placeHodler is a null pointer, which is then dereferenced on the next line, causing the segfault. For cases like this, the C++ reference states that "If the container is empty, the returned iterator value shall not be dereferenced", which applies here. I am not sure why valKeys is empty, though, or whether this might be an error in my configuration, but the other targets have worked fine so far.
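
To illustrate the kind of guard I mean, here is a minimal, self-contained sketch. It is not the project's code: the Value struct, firstKeyOrNull, and placeholderWidth are hypothetical stand-ins I made up, and the real code works on const llvm::Value * keys. The only point is that the container is checked for emptiness before begin() is dereferenced.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for llvm::Value; only carries a bit width.
struct Value {
    unsigned bitWidth;
};

// Returns the first key, or nullptr when the set is empty.
// Dereferencing begin() on an empty container is undefined behavior,
// so the emptiness check has to come first.
const Value *firstKeyOrNull(const std::set<const Value *> &keys) {
    return keys.empty() ? nullptr : *keys.begin();
}

// Picks a bit width for a placeholder entry and uses `fallback`
// when no key is available (assumed policy, for illustration only).
unsigned placeholderWidth(const std::set<const Value *> &keys, unsigned fallback) {
    assert(fallback > 0 && "fallback width must be positive");
    const Value *key = firstKeyOrNull(keys);
    return key != nullptr ? key->bitWidth : fallback;
}
```

With an empty set, placeholderWidth returns the fallback instead of touching an invalid iterator. Whether a fallback is the right behavior for AbstractState::set is something I cannot judge from the outside.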

My fuzzing target was flex with the following cstest.txt:

misc.c:106
skeletons.c:103
header_nr_main.c:34
bison_yylval_main.c:32
cxx_multiple_scanners_main.cc:46
dfa.c:147
no_bison_stub.c:36
bison_yylval_main.c:39
main.c:350
multiple_scanners_nr_main.c:42
mywc.c:11
header_r_main.c:65
bison_yylloc_main.c:28
bison_yylval_main.c:31
bison_yylloc_main.c:39
skeletons.c:108
libyywrap.c:27
tblcmp.c:597
tblcmp.c:652
sym.c:114
yylex.c:109
ccl.c:195
tables.c:409
mywc.c:14
multiple_scanners_nr_main.c:41
nfa.c:373
ccl.c:49
multiple_scanners_r_main.c:47
buf.c:61
buf.c:82
dfa.c:469
scanflags.c:66
bison_yylloc_main.c:32
bison_nr_main.c:31
header_r_main.c:53
scanflags.c:42
dfa.c:1037
tables_shared.c:68
scanflags.c:67
dfa.c:319
malloc.c:15
malloc.c:6
bison_yylval_main.c:27
tblcmp.c:465
filter.c:148
regex.c:140
no_bison_stub.c:26
main.c:1684
mywc.c:13
misc.c:270
libyywrap.c:26
dfa.c:536
gen.c:53
tables.c:225
skeletons.c:101
header_nr_main.c:35
bison_nr_main.c:30
gen.c:635
cxx_multiple_scanners_main.cc:38
mywc.c:17
nfa.c:602
scanflags.c:56
buf.c:91
top_main.c:53
regex.c:56
misc.c:255
tables_shared.c:65
misc.c:157
misc.c:330
multiple_scanners_r_main.c:48
ecs.c:145
malloc.c:16
tables.c:118
filter.c:153
sym.c:81
ccl.c:142
malloc.c:14
no_bison_stub.c:28
tblcmp.c:368
libyywrap.c:24
sym.c:132
misc.c:422
multiple_scanners_r_main.c:30
top_main.c:65
misc.c:610
regex.c:129
gen.c:179
sym.c:92
scanflags.c:48
gen.c:444
multiple_scanners_nr_main.c:43
cxx_multiple_scanners_main.cc:37
header_r_main.c:52
header_nr_main.c:30
tables.c:235
header_r_main.c:62
regex.c:51
scanopt.c:617
main.c:1459
header_r_main.c:33
libmain.c:26
buf.c:53
realloc.c:15
tblcmp.c:778
main.c:757
skeletons.c:223
misc.c:625
skeletons.c:143
yylex.c:48
misc.c:396
options.c:49
libmain.c:33
yylex.c:51
no_bison_stub.c:30
gen.c:956
options.c:186
tables.c:150

5hadowblad3 commented 11 months ago

Could you please provide the bc file and the related options you used for Titan?

vannussina commented 11 months ago

Sure! This is the part of my build script for the target up to and including the static analysis:

# Generate bitcode file
echo -e "## Build by wllvm"
export CC="wllvm"
export CXX="wllvm++"
export CFLAGS="-g"
export CXXFLAGS="-g"
export LLVM_COMPILER=clang

pushd "$SUBDIR"
    ./autogen.sh
    ./configure --disable-shared
    make clean
    make
popd

extract-bc "$SUBDIR/src/flex"

# Build for Titan
echo "[+] Static Analysis"
$FUZZ/prototype/precondInfer "$SUBDIR/src/flex.bc" --target-file=$SEED/cstest.txt --join-bound=1 > "$OUT/log_precond.txt" 2>&1

The logfile shows that target extraction works fine until it dies in the first fixpoint computation:

Starting fixpoint computation for buf_prints
0

and then there's the segfault. This is the bc file: flex.zip

yiyuaner commented 10 months ago

@vannussina I am unable to reproduce this using 1fb90fffa4402. The provided target file causes Beacon to exit with the message "malformed target file -- exiting!". Which version of the code are you testing? Thanks.

vannussina commented 10 months ago

The target file was for Titan, which supports multi-targeting. Beacon doesn't support multiple targets, so I suppose that's why it failed. I posted this here because I analyzed Beacon's precondInfer, as it is only available in binary form in the Titan repo. In the Titan issue mentioned above I also attached all files from the out folder for another target where I came across the issue.

yiyuaner commented 10 months ago

@vannussina f5489224ec39aed3 fixes the crash.

The problem is that our ICFG fails to identify any caller of the target function. This could be either because the target is indeed unreachable or because the function pointer analysis (based on SVF) is unsound. The latter case needs to be examined case by case. For instance, the static analysis finds no callers for buf_prints. If you have identified a caller of it at run time, we can also try to fix the call graph to make it more sound.
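
As a rough illustration of the "no callers" situation, the toy sketch below looks up the callers of a function in a plain caller-to-callees map. It is only a simplified model: CallGraph, callersOf, and hasCallers are hypothetical names, and this is neither SVF's API nor the code in the fix commit.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical call graph: caller name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::set<std::string>>;

// Collects every function with a recorded call edge to `target`.
std::set<std::string> callersOf(const CallGraph &cg, const std::string &target) {
    assert(!target.empty() && "target name must not be empty");
    std::set<std::string> callers;
    for (const auto &entry : cg) {
        if (entry.second.count(target) != 0) {
            callers.insert(entry.first);
        }
    }
    return callers;
}

// A target without callers is either unreachable or hidden behind an
// indirect call that the analysis missed; both look the same here.
bool hasCallers(const CallGraph &cg, const std::string &target) {
    return !callersOf(cg, target).empty();
}
```

An analysis built on such a graph can skip or report a target for which hasCallers is false rather than continuing with an empty state. A missing indirect-call edge produces exactly the same result as a truly unreachable function, which is why the unsound case has to be checked by hand.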

5hadowblad3 commented 10 months ago

@vannussina We will soon update the code for Titan, too.

yiyuaner commented 10 months ago

Update: I have examined the bitcode flex.bc. It seems that buf_prints is indeed unreachable. I think this specific issue is fixed.

vannussina commented 10 months ago

Thanks! I'll check again with my targets as soon as the Titan repo is updated.

qhjchc commented 10 months ago

Thank you for bringing this matter to our attention :) We have updated Titan's code based on this issue, and precondInfer now handles your bc file. If you have any further questions or encounter any issues, please feel free to reach out to us.

vannussina commented 10 months ago

Thanks for the quick fix! I checked and now it works! 👍

vannussina commented 10 months ago

Just one more quick question: I noticed that the file bbreaches.txt has now been renamed to bbreaches__path_to_cstest.txt. Is the file of any importance for the instrumentation or fuzzing process? Or is it just for debugging purposes, so that I can ignore it?

qhjchc commented 10 months ago

@vannussina Yes. The filename can be ignored as it doesn't impact the functionality :)