Open riesentoaster opened 2 weeks ago
I don't fully understand: The OnDiskCorpus will contain the "content of the testcase/input that triggered the feedbacks", that's what it's for, right?
That being said, currently the correct(tm) way to add metadata to a Testcase is via custom Feedbacks that do nothing like here: https://github.com/AFLplusplus/LibAFL/blob/e370e2f852b28aa0c4baedff426005429dbb6c08/libafl/src/feedbacks/stdio.rs#L107
Yes, the corpus will contain everything, of course. But it isn't written to disk, so when I kill the fuzzer, I lose everything but the metadata (found in the .metadata file). And by default that doesn't contain the input that triggered a crash (or whatever you're looking for). So I can't reproduce the crash.
Why is the OnDiskCorpus not written to disk? What crash are you talking about? A crash in the fuzzer or a crash in the target? Crashes in the target are of course included in the corpus (if you have a CrashFeedback)? Sorry, I'm confused...
Ah, I see, seems like I missed something. If I understand correctly, the input content is serialised and written to disk in this method on Input, to the file associated with the crash without an extension or a leading dot:
    /// Write this input to the file
    fn to_file<P>(&self, path: P) -> Result<(), Error>
    where
        P: AsRef<Path>,
    {
        write_file_atomic(path, &postcard::to_allocvec(self)?)
    }
When initialising the corpus, a format can be passed, and while this leaves the metadata nicely formatted, the input itself is still serialised and thus not human readable.
    OnDiskCorpus::with_meta_format(
        PathBuf::from("./crashes"),
        OnDiskMetadataFormat::JsonPretty,
    )
    .unwrap(),
So I guess I'm asking for an option for human-readable serialisation of the input when written to disk.
I guess I could also just implement this for my input, so a global option may not be strictly necessary, but it would still be nice, just for consistency.
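Implementing this for a single input type could look roughly like the sketch below. It uses only the standard library: `WriteInput` and `HttpRequest` are hypothetical stand-ins (not LibAFL types), the trait only mirrors the shape of the `to_file` method quoted above, and `fs::write` is used in place of `write_file_atomic`, so the write is not atomic.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the `to_file` method on LibAFL's `Input` trait.
trait WriteInput {
    /// Bytes that end up in the testcase file.
    fn to_bytes(&self) -> Vec<u8>;

    /// Write this input to the file (plain, non-atomic write in this sketch).
    fn to_file<P: AsRef<Path>>(&self, path: P) -> io::Result<()> {
        fs::write(path, self.to_bytes())
    }
}

/// Hypothetical structured input, used only for illustration.
#[derive(Debug, Clone, PartialEq)]
struct HttpRequest {
    method: String,
    path: String,
    body: Vec<u8>,
}

impl WriteInput for HttpRequest {
    fn to_bytes(&self) -> Vec<u8> {
        // `{:#?}` pretty-prints the struct, one field per line, so the file
        // on disk can be read without deserialising it first.
        format!("{:#?}\n", self).into_bytes()
    }
}
```

The trade-off is that a `Debug` dump cannot be loaded back into the fuzzer without a matching parser, whereas a serde-based text format could be read and written symmetrically.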
Related question: All input types in the repo (at least as far as I can see) generate their testcase names (fn generate_name(&self, id: Option<CorpusId>) -> String; on Input) the exact same way: hash their content (for collection types, namely Vecs, this is done manually for some reason) and take the first 16 bytes.
Should there not just be a blanket implementation that does this for any input that implements Hash (or where this is derived)?
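A minimal sketch of what such a blanket implementation could look like, using only the standard library. `GenerateName` is a hypothetical trait, not LibAFL's `Input`; the `id: Option<CorpusId>` parameter from the real signature is left out, and the choice of `DefaultHasher` and of a 16-hex-character name is an assumption for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical trait standing in for the name-generating method on `Input`.
trait GenerateName {
    fn generate_name(&self) -> String;
}

/// Blanket implementation: every type that implements `Hash` gets a name
/// derived from its content, with no per-type code.
impl<T: Hash> GenerateName for T {
    fn generate_name(&self) -> String {
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        // Format the 64-bit hash as 16 zero-padded hex characters.
        format!("{:016x}", hasher.finish())
    }
}
```

With this in scope, `vec![1u8, 2, 3].generate_name()` and any struct with `#[derive(Hash)]` work without further code. One caveat: the algorithm behind `DefaultHasher` is not guaranteed to stay the same across Rust releases, so names may change between toolchain versions; a fixed hash function would avoid that.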
For a human-readable serialization there is the DumpToDiskStage that goes through new inputs and serializes them with a provided closure.
Is this what you are looking for?
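The closure-based approach can be illustrated with a small standard-library sketch. `dump_inputs` is a hypothetical helper and not the actual `DumpToDiskStage` API; it only shows the idea of the caller deciding how each input is turned into bytes before it is written.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not LibAFL's `DumpToDiskStage`): writes every input
/// into `dir`, using the bytes produced by the caller-supplied closure.
fn dump_inputs<I, F>(dir: &Path, inputs: &[I], mut to_bytes: F) -> io::Result<Vec<PathBuf>>
where
    F: FnMut(&I) -> Vec<u8>,
{
    fs::create_dir_all(dir)?;
    let mut written = Vec::with_capacity(inputs.len());
    for (index, input) in inputs.iter().enumerate() {
        // File naming by index is an assumption made for this sketch.
        let path = dir.join(format!("input_{}.txt", index));
        fs::write(&path, to_bytes(input))?;
        written.push(path);
    }
    Ok(written)
}
```

A caller could pass `|input| format!("{:?}", input).into_bytes()` for a quick `Debug` dump, or a closure wrapping a JSON serialiser for something that can be parsed back.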
Yes, this kind of does what I would want it to do, but
- [...] /dev/null)
- [...] OnDiskMetadataFormat::JsonPretty)

Depending on how large your corpus gets and the change-rate within it, the first point may range from annoying to a considerable downside. The second is not critical, just a bit of extra code, would just be easier without it :)
Plus I would expect this kind of functionality in the corpus, especially OnDiskCorpus, not in a stage; that's probably also why I haven't found this.
Feel free to fix the first point :) For the second point, we could have a number of serialiser functions in LibAFL, right?
Open for other suggestions of course.
Most fuzzers will likely use some form of OnDiskCorpus (incl. InMemoryOnDiskCorpus, CachedOnDiskCorpus, etc.) for their solutions. To then figure out what the problem actually was, one would need to know the content of the testcase/input that triggered the feedbacks. Currently, corpora storing them on disk store a bunch of generic information in the file associated with the testcase/input (such as runtime), but no representation of the input. The only way to add this without resorting to writing dummy feedbacks that do nothing but add a new metadata with the input content is by implementing the filename-generating function on the input to extract the testcase from the corpus, and somehow stringify it.
However, file names have a length restriction, so this isn't usable for inputs that can get somewhat long. Plus, for structured inputs, it would be much easier to have the entire structure nicely formatted in the file.