Open vannussina opened 8 months ago
Hi, thank you for the questions. First, real.txt contains the starting locations of the basic blocks that contain your fuzzing targets, while BBtargets.txt contains the code locations of the fuzzing targets themselves. SelectFuzz uses real.txt but not BBtargets.txt. Second, we use genDistance.sh to generate the input information, and we run it before the second instrumentation. That answers "what is sense of compiling once more after genDistance.sh has been used": we generate the distances in the first compilation and instrument in the second compilation (this is also the standard method in AFLGo). Third, I didn't find a case where there is no genDistance.sh, but SelectFuzz can also run without distance instrumentation, as its core is selective instrumentation.
As for not being able to trigger a vulnerability in your sample program, there might be multiple reasons. Did you write your testing scripts following those in SelectFuzz/scripts/fuzz? Have you installed all dependencies (or used the Docker images we provide)? You can also check the binary size after the second compilation: it should be a little larger than the binary without instrumentation and smaller than the binary instrumented by AFLGo.
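The size check described above can be sketched as follows. This is only an illustration under stated assumptions: fileSize and sizeOrderingLooksRight are hypothetical helper names, not part of SelectFuzz or AFLGo, and the three binaries are assumed to have been built separately by you.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: size of a file in bytes, or -1 if it cannot be opened.
long long fileSize(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;
    return static_cast<long long>(in.tellg());
}

// Hypothetical helper: true when the sizes follow the ordering from the reply,
// i.e. uninstrumented < SelectFuzz-instrumented < AFLGo-instrumented.
bool sizeOrderingLooksRight(long long plain, long long selectFuzz, long long aflgo) {
    return plain >= 0 && plain < selectFuzz && selectFuzz < aflgo;
}
```

A failed check does not prove the instrumentation is wrong; it is only a hint that the second compilation may not have picked up the selective instrumentation.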
Hi, thanks for your answer and explanations! I checked all again and it works now :-)
Just for helping me understand:
Why do you still include BBtargets.txt in the compiler flags using -targets=$TMP_DIR/BBtargets.txt if it is not used by SelectFuzz?
And I do not fully understand the definition of real.txt. How am I supposed to know the starting locations of the basic blocks of my fuzzing targets? In my small sample file I used the same code locations for real.txt as I would have used for BBtargets.txt, and it works, but as my example is small and simple, that could be accidental. My actual project is much larger, so I'd like to understand it correctly to make sure it works.
You can ignore BBtargets.txt since we do not use it.
About how to find the starting locations of basic blocks, you can check the definition of basic blocks first. A basic block is a sequence of instructions with no branches in except at the beginning and no branches out except at the end. After finding the basic blocks that contain the fuzzing targets, you can put the starting code locations of those basic blocks in real.txt.
I think I still don't fully understand the usage.
I just ran my fuzzing script using only BBtargets.txt and still the instrumentation run printed for most files [+] Instrumented x locations. Considering that AFL is used, that makes sense, right? So where and how does real.txt come in? I grepped the code for usage of a file named like that but could not find it.
You might grep "real" in /. I don't remember the exact location where we use it.
After adding and deleting real.txt in *.sh, did you observe any changes in the binary, such as the file size of the instrumented binary?
There's no change in binary size, whether I use BBtargets.txt only, real.txt only, or both. Also, fuzzing without real.txt works, i.e. it finds new paths and all.
I grepped for real.txt in / and found a usage in a file called DFUZZPASS.cpp, which is used by the genDistance.sh script.
So now I can understand the connection and the importance of the content of real.txt for the calculation of target instructions. But then I still don't understand why it works with only BBtargets.txt, too, as I could see in DFUZZPASS.cpp that there's not much calculation on the targets from that file and all the important code operates on the content of real.txt?
Is the version of DFUZZPASS.cpp in the docker container the final version used for libDFUZZ.so, which is used by genDistance.sh? Because as far as I can see in the code, BBtargets.txt should be necessary too:
    if (!TargetsFile.empty()) {
        if (OutDirectory.empty()) {
            //FATAL("Provide output directory '-outdir <directory>'");
            return false;
        }
        std::ifstream targetsfile(TargetsFile);
        std::string line;
        while (std::getline(targetsfile, line))
            targets.push_back(line);
        targetsfile.close();
        std::ifstream realFile(OutDirectory + "/real.txt");
        std::string line1;
        while (std::getline(realFile, line1)) {
            reals.push_back(line1);
            //debug << "real: " << line1 << "\n";
        }
        realFile.close();
    }
If there's no BBtargets.txt, or an empty one, the body of the if-condition where the reals vector is populated is never executed.
So I assume this is an old version and the library actually used differs? This could explain my confusion.
Both real.txt and BBtargets.txt are opened, but if you check the code, BBtargets.txt is not used later. Only the contents of real.txt are used. You can modify the code in /Adfuzz (which builds libDFUZZ.so; the build scripts are also in the docker) to open and use only real.txt. Or you can refer to the cases in *.sh.
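The suggested change (open and use only real.txt) might look roughly like the sketch below. It is not the code in /Adfuzz: readLines and loadReals are hypothetical helper names, and skipping empty lines is an added assumption. Only the path OutDirectory + "/real.txt" is taken from the snippet quoted earlier in this thread.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read every non-empty line of `path`.
// A missing or unreadable file simply yields an empty vector.
std::vector<std::string> readLines(const std::string& path) {
    std::vector<std::string> lines;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) lines.push_back(line);
    }
    return lines;
}

// Hypothetical loader: real.txt is read whenever an output directory is given,
// independently of whether a targets file was passed.
// Returns true if at least one entry was read.
bool loadReals(const std::string& outDirectory, std::vector<std::string>& reals) {
    if (outDirectory.empty()) return false;
    reals = readLines(outDirectory + "/real.txt");
    return !reals.empty();
}
```

With a loader like this, the reals vector no longer depends on the -targets flag being set, which is the dependency the quoted if-condition introduces.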
To demonstrate the difference between real.txt and BBtargets.txt, consider the following code, "test.c":
    int a, b, c;
    if (a>10) {
        b=2; //(line 3)
        c=3; //fuzzing target (line 4)
    }
In BBtargets.txt, the content should be "test.c:4" (the fuzzing target); but in real.txt, the content should be "test.c:3", as it is the starting line of the basic block that contains the target code.
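The mapping from a target line to its real.txt entry can be sketched like this. It is a toy illustration, not something SelectFuzz provides: BlockRange and realEntryFor are hypothetical names, and the block ranges have to come from your own inspection of the code (or of the compiler's control-flow graph).

```cpp
#include <string>
#include <vector>

// Hypothetical description of one basic block: the source file and the
// first and last source line it covers.
struct BlockRange {
    std::string file;
    int firstLine;
    int lastLine;
};

// Return the real.txt-style entry ("file:startLine") of the block that
// contains the given target line, or an empty string if no block covers it.
std::string realEntryFor(const std::vector<BlockRange>& blocks,
                         const std::string& file, int targetLine) {
    for (const BlockRange& b : blocks) {
        if (b.file == file && b.firstLine <= targetLine && targetLine <= b.lastLine) {
            return b.file + ":" + std::to_string(b.firstLine);
        }
    }
    return "";
}
```

For the test.c example, assuming the body of the if-statement (lines 3 to 4) forms one block, the target test.c:4 maps to the entry test.c:3.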
@vannussina Hi. I have a question about the SelectFuzz installation. Did you run this tool via manual installation? I followed the instructions on the repo page, but could not run the tool successfully (it seems that there is something wrong with afl-clang-fast). Currently the LLVM version I use is 4.0.0.
@5hEn918 Hi, I ended up running it in the provided Docker container in the beginning and then building my own Docker container because I didn't want to change proxy settings in environment variables and git with every container restart. I also could not get it to build on my local system, I guess there were some dependencies or paths missing because I had LLVM 4 installed alongside LLVM 11. But in the provided Docker container and also the one I built for myself it worked well, so I would recommend that :)
Actually, I could run the tool well via Docker. Maybe there are missing dependencies or something wrong on my local machine that I need to check. Thanks for your reply 👍
Hi, I want to use SelectFuzz to fuzz my project and currently I'm trying to get it running with a small C++ program to test the setup. Everything seems to be fine, i.e. there are no crashes or error messages, but the tool does not seem to find the crashes I manually inserted in the program. I've checked the distance files and they contain distance information, as do BBcalls.txt, BBnames.txt, Ftargets.txt and all the other files in the temp folder. Unfortunately, from the example fuzzing scripts it is not clear to me what steps are necessary to ensure successful fuzzing. What is the usage of real.txt and why is it sometimes used equivalent to BBtargets.txt, sometimes not? Why do you use the genDistance.sh script in some cases, but not in all? And what is sense of compiling once more after genDistance.sh has been used, if the binary from the distance script (sometimes) is used for fuzzing anyways? Could you please clarify the necessary steps to successfully run SelectFuzz? That would help me a lot.