Essential steps for fuzzing

vannussina commented 8 months ago

Hi, I want to use SelectFuzz to fuzz my project, and I'm currently trying to get it running with a small C++ program to test the setup. Everything seems fine, i.e. there are no crashes or error messages, but the tool does not find the crashes I manually inserted into the program. I've checked the distance files and they contain distance information; BBcalls.txt, BBnames.txt, Ftargets.txt and all the other files in the temp folder are populated as well. Unfortunately, the example fuzzing scripts do not make clear to me which steps are necessary for successful fuzzing. What is real.txt used for, and why is it sometimes identical to BBtargets.txt and sometimes not? Why do you use the genDistance.sh script in some cases but not in all? And what is the point of compiling once more after genDistance.sh has been run, if the binary from the distance script is (sometimes) used for fuzzing anyway? Could you please clarify the steps necessary to run SelectFuzz successfully? That would help me a lot.

chluo1997 commented 8 months ago

Hi, thank you for the questions. First, real.txt contains the starting locations of the basic blocks that contain your fuzzing targets, while BBtargets.txt contains the code locations of the fuzzing targets themselves. SelectFuzz uses real.txt but not BBtargets.txt. Second, we use genDistance.sh to generate input information, and we run it before the second instrumentation. That is why we compile once more after genDistance.sh has been used: we generate the distances in the first compilation and instrument in the second compilation (this is also the standard method in AFLGo). Third, I didn't find a case where genDistance.sh is not used, but SelectFuzz can also run without distance instrumentation, as its core is selective instrumentation.

As for your question about not being able to trigger a vulnerability in your sample program, there might be multiple reasons. Did you write your testing scripts following those in SelectFuzz/scripts/fuzz? Have you installed all dependencies (or used the Docker images we provide)? You can also check the binary size after the second compilation: it should be a little larger than a binary without instrumentation and smaller than a binary instrumented with AFLGo.
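The size check suggested above can be scripted. The following is a minimal sketch, not part of SelectFuzz: the helper names `fileSizeBytes` and `sizeLooksPlausible` are hypothetical, and it assumes you have built three binaries yourself (uninstrumented, SelectFuzz-instrumented, AFLGo-instrumented) and know their paths.

```cpp
#include <fstream>
#include <string>

// Returns the size of a file in bytes, or -1 if it cannot be opened.
long long fileSizeBytes(const std::string &path) {
  // Open at the end of the file so tellg() reports the total size.
  std::ifstream in(path, std::ios::binary | std::ios::ate);
  if (!in) return -1;
  return static_cast<long long>(in.tellg());
}

// Hypothetical helper: encodes the rule of thumb from the thread, i.e.
// uninstrumented < SelectFuzz-instrumented < AFLGo-instrumented.
// All three arguments are sizes in bytes; negative means "file missing".
bool sizeLooksPlausible(long long plain, long long selectfuzz,
                        long long aflgo) {
  if (plain < 0 || selectfuzz < 0 || aflgo < 0) return false;
  return plain < selectfuzz && selectfuzz < aflgo;
}
```

A failing check is only a hint that the selective instrumentation may not have been applied; it does not by itself say which step went wrong.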

vannussina commented 8 months ago

Hi, thanks for your answer and explanations! I checked all again and it works now :-)

Just to help me understand: why do you still include BBtargets.txt in the compiler flags via -targets=$TMP_DIR/BBtargets.txt if it is not used by SelectFuzz?

And I do not fully understand the definition of real.txt. How am I supposed to know the starting locations of the basic blocks of my fuzzing targets? In my small sample file I used the same code locations in real.txt as I would have used in BBtargets.txt, and it works, but as my example is small and simple, that could be accidental. My actual project is much larger, so I'd like to understand this correctly to make sure it works.

chluo1997 commented 8 months ago

You can ignore BBtargets.txt since we do not use it.

As for how to find the starting locations of basic blocks, check the definition of a basic block first: a basic block is a straight-line sequence of instructions that control flow can enter only at the beginning and leave only at the end. After finding the basic blocks that contain your fuzzing targets, put the starting code locations of those basic blocks in real.txt.

vannussina commented 6 months ago

I think I still don't fully understand the usage. I just ran my fuzzing script using only BBtargets.txt, and the instrumentation run still printed [+] Instrumented x locations for most files. Considering that AFL is used, that makes sense, right? So where and how does real.txt come in? I grepped the code for usage of a file with that name but could not find it.

chluo1997 commented 6 months ago

You can grep for "real" in /. I don't remember the exact location where we use it.

After adding and deleting real.txt in *.sh, did you observe any changes in the binary, such as the file size of the instrumented binary?

vannussina commented 6 months ago

There's no change in binary size whether I use BBtargets.txt only, real.txt only, or both. Fuzzing without real.txt also works, i.e. it finds new paths and so on. I grepped for real.txt in / and found a usage in a file called DFUZZPASS.cpp, which is used by the genDistance.sh script. So now I understand the connection and the importance of the content of real.txt for the calculation of target instructions. But I still don't understand why it also works with only BBtargets.txt, since I could see in DFUZZPASS.cpp that there is not much computation on the targets from that file and all the important code operates on the content of real.txt.

vannussina commented 6 months ago

Is the version of DFUZZPASS.cpp in the Docker container the final version used to build the libDFUZZ.so that genDistance.sh uses? As far as I can see in the code, BBtargets.txt should be necessary too:

      if (!TargetsFile.empty()) {

        if (OutDirectory.empty()) {
          //FATAL("Provide output directory '-outdir <directory>'");
          return false;
        }

        std::ifstream targetsfile(TargetsFile);
        std::string line;
        while (std::getline(targetsfile, line))
          targets.push_back(line);
        targetsfile.close();

        std::ifstream realFile(OutDirectory + "/real.txt");
        std::string line1;
        while (std::getline(realFile, line1)) {
          reals.push_back(line1);
          //debug << "real: " << line1 << "\n";
        }
        realFile.close();
      }

If no BBtargets.txt is passed (i.e. TargetsFile is empty), the body of the if-condition, where the reals vector is populated, is never executed. So I assume this is an old version and the library actually in use differs? That would explain my confusion.

chluo1997 commented 6 months ago

Both real.txt and BBtargets.txt are opened, but if you check the code, the contents of BBtargets.txt are not used later. Only the contents of real.txt are used. You can modify the code in /Adfuzz (which builds libDFUZZ.so; the build scripts are also in the Docker image) to open and use only real.txt. Or you can refer to the cases in *.sh.
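One way to picture the suggested change is a loader that reads real.txt without depending on the targets file at all. This is only a sketch under assumptions, not the actual code in /Adfuzz: `readLines` and `loadReals` are hypothetical names, and the only detail taken from the quoted snippet is that real.txt lives in the output directory.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Reads a text file line by line, skipping empty lines.
// Returns an empty vector if the file does not exist.
std::vector<std::string> readLines(const std::string &path) {
  std::vector<std::string> lines;
  std::ifstream in(path);
  std::string line;
  while (std::getline(in, line))
    if (!line.empty()) lines.push_back(line);
  return lines;
}

// Hypothetical replacement for the guarded block in the quoted snippet:
// only the output directory is required, so real.txt is loaded even when
// no targets file is given on the command line.
std::vector<std::string> loadReals(const std::string &outDirectory) {
  if (outDirectory.empty()) return {};
  return readLines(outDirectory + "/real.txt");
}
```

With this shape, the condition on the targets file no longer decides whether the reals vector is filled, which is the dependency questioned earlier in the thread.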

To demonstrate the difference between real.txt and BBtargets.txt, consider the following code in "test.c":

int a, b, c;
if (a>10) {
   b=2; //(line 3)
   c=3; //fuzzing target (line 4)
}

In BBtargets.txt, the content should be "test.c:4" (the fuzzing target), but in real.txt, the content should be "test.c:3", as that is the starting line of the basic block that contains the target code.
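The mapping from a target line to its real.txt entry can be sketched as a small lookup. This is an illustration only: `blockStartFor` and `realEntry` are hypothetical helpers, and the sketch assumes you already know the first source line of each basic block in the file (for example from reading the code by hand) and that each block covers a contiguous range of lines. Real basic blocks are formed by the compiler on LLVM IR and may not line up this neatly.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// blockStarts: sorted first source lines of the basic blocks in one file
// (assumed to be known). Returns the start line of the block containing
// targetLine, or -1 if targetLine lies before the first block.
int blockStartFor(const std::vector<int> &blockStarts, int targetLine) {
  // First block start strictly greater than the target; the block we
  // want is the one just before it.
  auto it = std::upper_bound(blockStarts.begin(), blockStarts.end(),
                             targetLine);
  if (it == blockStarts.begin()) return -1;
  return *(it - 1);
}

// Builds the "file:line" entry that would go into real.txt for a target
// given in the BBtargets.txt style. Returns an empty string on failure.
std::string realEntry(const std::string &file,
                      const std::vector<int> &blockStarts, int targetLine) {
  int start = blockStartFor(blockStarts, targetLine);
  if (start < 0) return std::string();
  return file + ":" + std::to_string(start);
}
```

For the test.c example, taking the block starts to be lines 1 and 3, `realEntry("test.c", {1, 3}, 4)` yields "test.c:3", matching the entry described above.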

5hEn918 commented 4 months ago

@vannussina Hi. I have a question about the SelectFuzz installation. Did you run this tool via manual installation? I followed the instructions on the repo page but could not run the tool successfully (it seems that something is wrong with afl-clang-fast). The LLVM version I currently use is 4.0.0.

vannussina commented 3 months ago

@5hEn918 Hi, I ended up running it in the provided Docker container at first and then building my own Docker container, because I didn't want to change the proxy settings in environment variables and git with every container restart. I also could not get it to build on my local system; I guess some dependencies or paths were missing because I had LLVM 4 installed alongside LLVM 11. But it worked well in the provided Docker container and in the one I built myself, so I would recommend that :)

5hEn918 commented 3 months ago

Actually, I could run the tool fine via Docker. Maybe there are missing dependencies or something else wrong on my local machine that I need to check. Thanks for your reply 👍

cuhk-seclab / SelectFuzz

Essential steps for fuzzing #5