googleprojectzero / Jackalope

Binary, coverage-guided fuzzer for Windows, macOS, Linux and Android
Apache License 2.0

Adding extensions list #53

Closed 20urc3 closed 10 months ago

20urc3 commented 11 months ago

Hi. I'm trying to add a feature for passing a list of extensions to the fuzzer instead of specifying only one, so that the fuzzer can use multiple extensions. Here is the code I have so far (I haven't managed to get it running properly). Could you provide any feedback?

    char *extension_list_opt = GetOption("-file_extension_list", argc, argv);
    if (extension_list_opt) {
      // Check file existence
      ifstream file(extension_list_opt);
      if (!file.is_open()) {
        throw std::runtime_error("Failed to open file");
      }

      // Read extensions from file.
      // Assumed format: one extension per line; anything after ';' is ignored.
      vector<string> extensions;
      string line;
      while (getline(file, line)) {
        line = line.substr(0, line.find(';'));
        if (!line.empty()) {
          extensions.push_back(line);
        }
      }
      file.close();

      // An empty list would make the modulo below divide by zero
      if (extensions.empty()) {
        throw std::runtime_error("Extension list is empty");
      }

      // Generate random index
      size_t randomIndex = rand() % extensions.size();

      // Set extension based on random index
      extension = string(".") + extensions[randomIndex];
    }

ifratric commented 10 months ago

As a general rule, I recommend fuzzing different file formats separately. The exception is very similar formats, where discovering coverage in one format can help with discovering coverage in another.

There is a way to do this with the current Jackalope, but it requires multiple instances: you start one fuzzing instance as a server, and then for every extension you start a client instance that connects to that server.

If you wanted to add something like this to Jackalope, the biggest question is where to put it. If you put it where the fuzzer initializes, the extension gets randomly selected once and the same extension is used for the entire session. Changing the extension on every iteration doesn't work well either, because Jackalope might, for example, try to verify the coverage of a sample, and if the extension changes, so could the coverage. So the best place would probably be in or around Fuzzer::FuzzJob, storing the current extension somewhere in ThreadContext. That way each thread could have a different extension, and the same extension would be used for 1000 iterations or so (depending on the mutator settings). You would also need to kill the target process whenever the extension changes, otherwise the target and the fuzzer could end up with different filenames. And you'd also need to update the target arguments in ThreadContext whenever the extension changes, in addition to updating the filename used by FileSampleDelivery.
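
To make the bookkeeping above more concrete, here is a rough, standalone sketch of the per-job idea. It does not use Jackalope's real classes: `ThreadStateSketch`, `PickExtension` and `SetJobExtension` are hypothetical names, and the `@@` placeholder handling is only an illustration of how the target arguments might be rebuilt. In an actual patch, the equivalent state would presumably live in ThreadContext, and the restart flag would translate into killing the target process.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for per-thread fuzzer state.
// This is NOT Jackalope's ThreadContext; the fields are illustrative only.
struct ThreadStateSketch {
  std::vector<std::string> args_template;  // target args, "@@" marks the sample path
  std::vector<std::string> target_args;    // args with the placeholder filled in
  std::string sample_path;                 // file the sample gets written to
  bool target_needs_restart = false;       // set when the filename changed
};

// Pick one extension uniformly at random from a non-empty list.
const std::string &PickExtension(const std::vector<std::string> &extensions,
                                 std::mt19937 &rng) {
  assert(!extensions.empty());
  std::uniform_int_distribution<std::size_t> dist(0, extensions.size() - 1);
  return extensions[dist(rng)];
}

// Meant to be called once per job, not once per iteration.
// Updates the sample filename and the target arguments together, and
// requests a target restart only if the filename actually changed.
void SetJobExtension(ThreadStateSketch &tc, const std::string &base_path,
                     const std::string &ext) {
  std::string new_path = base_path + "." + ext;
  if (new_path == tc.sample_path) return;  // same extension, keep the target alive

  tc.sample_path = new_path;
  tc.target_args = tc.args_template;
  for (std::string &arg : tc.target_args) {
    std::size_t pos = arg.find("@@");
    if (pos != std::string::npos) arg.replace(pos, 2, new_path);
  }
  tc.target_needs_restart = true;  // target and fuzzer must agree on the filename
}
```

The point of routing every change through one function is that the filename, the target arguments and the restart request cannot drift apart, which is the failure mode described above.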

Otherwise, a couple of comments on the code itself:

20urc3 commented 10 months ago

Hi Ivan, that makes perfect sense. Thanks for your insight! I'm closing this.