AFLplusplus / AFLplusplus

The fuzzer afl++ is afl with community patches, qemu 5.1 upgrade, collision-free coverage, enhanced laf-intel & redqueen, AFLfast++ power schedules, MOpt mutators, unicorn_mode, and a lot more!
https://aflplus.plus
Apache License 2.0

Custom_mutator SymCC causes excessive file accumulation in output_dir. #2096

Open Isabel0715 opened 6 months ago

Isabel0715 commented 6 months ago

Describe the bug

When using the symcc custom mutator, a large number of files accumulate in the data->out_dir folder during execution. I suspect this is caused by how the done variable is assigned in the afl_custom_fuzz function in AFLplusplus/custom_mutators/symcc/symcc.c.

Each time scandir is called, the function iterates over the entries in nl. If done == 0, it processes the file (Lines 292-306) and then unlinks it (Line 308). However, if afl_custom_fuzz finds 10,000 entries in out_dir via scandir, only the file corresponding to nl[2] gets unlinked. This is because nl[0] and nl[1] usually have the d_name values "." and "..", which do not trigger the assignment done = 1. Once the file corresponding to nl[2] has been read, done is set to 1. The remaining 9,997 entries then skip the logic in Lines 292-310, so they are neither processed nor unlinked.
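The behavior described above can be reproduced outside AFL++ with a minimal sketch. The function name take_one_file and its standalone form are assumptions made for illustration; the code mirrors the done logic and is not the actual symcc.c source.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the loop in afl_custom_fuzz: read and unlink at
   most one regular, non-empty file from `dir`, leaving all others in place. */
ssize_t take_one_file(const char *dir, unsigned char *buf, size_t max_size) {

  assert(dir != NULL && buf != NULL);

  struct dirent **nl;
  int             items = scandir(dir, &nl, NULL, NULL);
  ssize_t         size = 0;
  int             done = 0;

  if (items < 0) return 0; /* scandir failed, nothing was allocated */

  for (int i = 0; i < items; ++i) {

    char        path[4096];
    struct stat st;
    snprintf(path, sizeof(path), "%s/%s", dir, nl[i]->d_name);

    if (!done) {

      if (stat(path, &st) == 0 && S_ISREG(st.st_mode) && st.st_size > 0) {

        int fd = open(path, O_RDONLY);
        if (fd >= 0) {

          ssize_t got = read(fd, buf, max_size);
          close(fd);
          if (got > 0) size = got;
          done = 1; /* every later entry is skipped from here on */

        }

      }

      unlink(path); /* fails harmlessly for "." and ".." */

    }

    free(nl[i]);

  }

  free(nl);
  return size;

}
```

Called on a directory holding three small files, this returns the size of one of them and leaves the other two on disk, which matches the one-file-per-call pattern reported here.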

This seems problematic: hundreds or even thousands of files are added to out_dir every second, but they are deleted much more slowly, so the number of files in out_dir grows rapidly and quickly exceeds the file count limit of my supercomputer account. I'm unsure whether afl_custom_fuzz is intended to process and delete only one file per call. If so, is there a way to limit the number of files in out_dir?

I tried removing the done == 0 check (Lines 290 and 310) and reran the code. In this case, all files were processed and unlinked quickly, keeping the file count in out_dir below 3,000. However, I'm unsure about the original purpose of the done variable and whether this modification is consistent with the design of AFL++ and its custom mutators. I would really appreciate your assistance.

To Reproduce

Steps to reproduce the behavior:

  1. Install SymCC.
  2. Build this custom mutator for AFL++ by running make in AFLplusplus/custom_mutators/symcc.
  3. Compile the target program with symcc.
  4. Set the environment variables SYMCC_TARGET and AFL_CUSTOM_MUTATOR_LIBRARY.
  5. Run afl-fuzz alongside symcc.

Expected behavior

The number of files in the symcc out_dir stays limited (the upper limit for me is 20,000).

Screen output/Screenshots

I modified the afl_custom_fuzz function by adding some logging statements, as follows.

size_t afl_custom_fuzz(my_mutator_t *data, uint8_t *buf, size_t buf_size,
                       u8 **out_buf, uint8_t *add_buf, size_t add_buf_size,
                       size_t max_size) {

  struct dirent **nl;
  int32_t         i, done = 0, items = scandir(data->out_dir, &nl, NULL, NULL);
  ssize_t         size = 0;

  if (items <= 0) return 0;
  ACTF("items: %d", items);
  for (i = 0; i < (u32)items; ++i) {
    ACTF("Iterating i: %d, d_name: %s", i, nl[i]->d_name);
    struct stat st;
    u8 *        fn = alloc_printf("%s/%s", data->out_dir, nl[i]->d_name);

    ACTF("done: %d", done);
    if (done == 0){

      ACTF("processing %s", nl[i]->d_name);
      if (stat(fn, &st) == 0 && S_ISREG(st.st_mode) && st.st_size) {

        int fd = open(fn, O_RDONLY);

        if (fd >= 0) {

          size = read(fd, data->mutator_buf, max_size);
          *out_buf = data->mutator_buf;

          close(fd);
          done = 1;

        }

      }
      ACTF("Try Unlink %s", nl[i]->d_name);
      int32_t unlink_status = unlink(fn);
      ACTF("Unlink Status %d", unlink_status);
    }

    ck_free(fn);
    free(nl[i]);

  }

  free(nl);
  DBG("FUZZ size=%lu\n", size);
  return (uint32_t)size;

}

Part of the corresponding log was attached as a screenshot.

vanhauser-thc commented 6 months ago

Oh, you are right, this is a bug. Fixed it in the dev branch, thanks for reporting!

Isabel0715 commented 6 months ago

Thanks for your quick reply! However, I copied the changes made in the dev branch and reran the code, and the bug is still there. It no longer tries to unlink nl[0] (".") and nl[1] (".."), but still only nl[2] gets unlinked. I'm not sure whether this is the expected behavior.

Isabel0715 commented 6 months ago

Am I correct in assuming that all files except . and .. should be processed and unlinked?

vanhauser-thc commented 6 months ago

No, the _fuzz function only returns a single testcase input. That is why only one file may be removed per call.

and now that you point it out - there was never a bug, so I reverted my "fix", because the previous state was better than what I did.
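To illustrate why removing the done check loses inputs, here is a minimal sketch of a loop that reads and unlinks every file in one call. The name drain_all_files and the deleted counter are hypothetical and exist only for this illustration; this is not the symcc.c code. Each read overwrites the same buffer, so only the last file read survives as the returned testcase.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical variant without the done check: every regular, non-empty file
   is read into the same buffer and unlinked. `*deleted` reports how many
   files were removed; the return value is the size of the last one read. */
ssize_t drain_all_files(const char *dir, unsigned char *buf, size_t max_size,
                        int *deleted) {

  assert(dir != NULL && buf != NULL && deleted != NULL);
  *deleted = 0;

  struct dirent **nl;
  int             items = scandir(dir, &nl, NULL, NULL);
  ssize_t         size = 0;

  if (items < 0) return 0; /* scandir failed, nothing was allocated */

  for (int i = 0; i < items; ++i) {

    char        path[4096];
    struct stat st;
    snprintf(path, sizeof(path), "%s/%s", dir, nl[i]->d_name);

    if (stat(path, &st) == 0 && S_ISREG(st.st_mode) && st.st_size > 0) {

      int fd = open(path, O_RDONLY);
      if (fd >= 0) {

        /* Overwrites whatever the previous file left in buf. */
        ssize_t got = read(fd, buf, max_size);
        close(fd);
        if (got > 0) size = got;
        if (unlink(path) == 0) ++*deleted;

      }

    }

    free(nl[i]);

  }

  free(nl);
  return size;

}
```

With three files in the directory, one call deletes all three but hands back a single buffer, so two generated inputs are discarded without ever being executed.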

Isabel0715 commented 6 months ago

OK. So, are there any possible solutions to limit the number of files in out_dir? The upper limit for me is 20,000, which is easily exceeded in about 3 hours.

vanhauser-thc commented 6 months ago

Then this is a different bug, because the _fuzz function is called as many times as there are files in there - that is what the _count custom mutator function returns in the step before running the _fuzz loop.
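The relationship between the two hooks can be sketched as follows. The helper names count_pending_files and remove_one_pending are hypothetical stand-ins, not AFL++ API: if the count step reports N pending files and each fuzz step consumes exactly one, then N fuzz calls should leave the directory empty.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: number of regular, non-empty files in `dir`, i.e. the
   value a count step would report before the fuzz loop starts. */
unsigned int count_pending_files(const char *dir) {

  assert(dir != NULL);

  DIR *d = opendir(dir);
  if (!d) return 0;

  unsigned int   n = 0;
  struct dirent *e;

  while ((e = readdir(d)) != NULL) {

    char        path[4096];
    struct stat st;
    snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);
    if (stat(path, &st) == 0 && S_ISREG(st.st_mode) && st.st_size > 0) ++n;

  }

  closedir(d);
  return n;

}

/* Hypothetical stand-in for one fuzz call: unlink a single pending file.
   Returns 1 if a file was removed, 0 if none was found. */
int remove_one_pending(const char *dir) {

  assert(dir != NULL);

  DIR *d = opendir(dir);
  if (!d) return 0;

  int            removed = 0;
  struct dirent *e;

  while (!removed && (e = readdir(d)) != NULL) {

    char        path[4096];
    struct stat st;
    snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);
    if (stat(path, &st) == 0 && S_ISREG(st.st_mode) && st.st_size > 0 &&
        unlink(path) == 0)
      removed = 1;

  }

  closedir(d);
  return removed;

}
```

Under this model, a directory that keeps growing suggests that the fuzz step is invoked fewer times than files are produced, or that files arrive after the count was taken. That is a guess about where to look, not a confirmed diagnosis.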

Isabel0715 commented 6 months ago

Thanks again for your explanation. However, I'm still a bit confused about how it is supposed to work exactly. I wonder if you are going to fix this.

vanhauser-thc commented 6 months ago

I will fix it - when I have the time