Closed tsytsarkin closed 1 year ago
Thanks for looking into Jackalope's code.
However, I believe in this case this is working as intended.
Client class does not save samples into the sample queue but rather to a different location (server_cache
directory, see https://github.com/googleprojectzero/Jackalope/blob/main/client.cpp#L31) and for a different purpose. This only happens when keep_samples_in_memory is unset (typically used when samples are large and you don't want to keep them in memory). In that case, when the Client gets samples from Server, rather than keeping them in memory until they get picked up by a worker thread, Client drops the samples to disk to server_cache
directory to free the memory. Note also that num_samples
from the Client
class is not the same as num_samples
in the Fuzzer
class. Thus sample saving in Client
and in Fuzzer
should not interfere in any way since these are separate mechanisms with separate directories and variables.
You are correct! Thank you for the detailed explanation.
There is a bug in how downloading samples from a server is implemented.
First of all, saving the sample in client.cpp is redundant. The new samples received from the server are being returned back to fuzzer.cpp in server_samples. Then process sample jobs are created here: https://github.com/googleprojectzero/Jackalope/blob/69c4eee961102593cdf75bd81a7c99da805266e3/fuzzer.cpp#L666-L674
Then, Fuzzer::ProcessSample calls RunSample which analyzes the coverage, and saves the sample if it produces new coverage. (Fuzzer::ProcessSample saves the sample if it does not produce new coverage but add_all_inputs is specified) https://github.com/googleprojectzero/Jackalope/blob/69c4eee961102593cdf75bd81a7c99da805266e3/fuzzer.cpp#L766-L777
RunSample saves the sample here: https://github.com/googleprojectzero/Jackalope/blob/69c4eee961102593cdf75bd81a7c99da805266e3/fuzzer.cpp#L455-L469
Therefore, we don't need to save samples in client.cpp.
Not only this code is redundant, but there is also a serious problem which leads to samples being overwritten on disk. In both cases,
num_samples
is being used to generate the file name. But those are 2 different private variables in CoverageClient and Fuzzer classes. Those values are not synchronized. Fuzzer increments that variable when a new sample that improves the coverage is found, while CoverageClient increments it when it receives new coverage from the server. In both cases, the files are saved to the same directory.Imagine the fuzzer finds a sample generating new coverage and saves it as
sample_00001
. Next, the server reports new coverage, and the client downloads the new sample from the server, which it also saves assample_00001
, overwriting the initial file. Then the client processes the new sample and (if it generates new coverage) saves the same sample assample_00002
.Note, this change will make previous saved client states incompatible, due to
num_samples
being removed from the save file. It is probably not a big deal, but I can add code to handle that case if we want to keep it.