Closed riesentoaster closed 1 month ago
can you do this by,
1) have two Mutations
; Mutation A for mutating the first part of your input, and Mutation B for mutating the second part of your input.
2) then define your own mutations that returns tuple_list!(MutationA::new(), MutationB::new())
Theoretically yes but practically no. I would like to have the system mutate both parts of the input in all the ways the havoc mutators would. So yes, I can define two mutators, one for each part, but they'd have to wrap the logic of all the havoc mutators. This would lead to lots of code duplication, since I'd essentially have to copy all the havoc mutators' logic into the mutate function of these two new mutators and select randomly which one to use.
Plus it may be better to have one large list of mutators for the mutation scheduler to work better instead of having only two which consist of randomness opaque to the mutation scheduler.
ok. i understand. but maybe the thing to change is not the havoc_mutation. because havoc_mutations is just a function to return the tuples of all the predefined list of mutators.
i think what you want is a mutator that takes closure to extract ranges (or some parts of the input), and do the mutations. then you put plug this mutator with the existing havoc_mutation
then you put plug this mutator with the existing havoc_mutation
That's exactly the issue. How would I do this? The existing havoc_mutations don't seem to have an interface for dynamic insertion of data.
i still don't get what you want to achieve. why do you need to dynamically insert into havoc_mutations to do this?
Alright, let me start from the beginning. I have an struct that looks as follows (with only two parts for simplicity's sake, but this should work with n parts in the end):
struct MyComplexCustomInput {
part1: Vec<u8>,
part2: Vec<u8>,
...
}
I want to end up with a system where the operations defined in the havoc_mutations
are done on part1
and part2
. havoc_mutations
extracts the byte slice from the input with the methods defined in this trait:
/// Contains an internal bytes Vector
pub trait HasBytesVec {
/// The internal bytes map
fn bytes(&self) -> &[u8];
/// The internal bytes map (as mutable borrow)
fn bytes_mut(&mut self) -> &mut Vec<u8>;
}
This is no issue if my Input has only one Vec<u8>
, since I can just write an implementation as follows:
struct MySimpleCustomInput {
part1: Vec<u8>,
}
impl HasBytesVec for MySimpleCustomInput {
fn bytes(&self) -> &[u8] {
&self.part1
}
fn bytes_mut(&mut self) -> &mut Vec<u8> {
&mut self.part1
}
}
...
let mut stages = tuple_list!(StdMutationalStage::new(StdScheduledMutator::new(havoc_mutations())));
...
This works. It mutates part1
as you'd expect. How would I implement this with MyComplexCustomInput
?
My proposal would be to add something like this:
fn havoc_mutations_on_input_part<I>(extractor: FnMut(I) -> &mut Vec<u8>) -> HavocMutationsType<I> {
BitFlipMutator::new(extractor),
ByteFlipMutator::new(extractor),
...
}
fn havoc_mutations<I>() -> HavocMutationsType<I> where I: HasBytesVec {
havoc_mutations_on_input_part(|input: I| input.bytes_mut())
}
let mut stages = tuple_list!(StdMutationalStage::new(StdScheduledMutator::new(
havoc_mutations_on_input_part(|input| &mut input.part1)
.merge(havoc_mutations_on_input_part(|input| &mut input.part2))
)));
// stages now contains mutators to mutate part1 and part2 in all the ways havoc_mutations would
I might just fundamentally miss something. If there is a simpler solution, please let me know!
so what you actually wants to do is to make each of the Mutation
mutate only user-specified parts of the entire bytesvec?
first, there are 3 things
now, can you not just extract the parts you want in the Mutator, make it as the Input
, then pass it to Mutation then copy it back? then this way it won't make the whole input
For example; https://github.com/AFLplusplus/LibAFL/blob/main/libafl/src/mutators/scheduled.rs#L100 this scheduled_mutate function is where these mutations are actually used.
can you do like this to implement your goal?
let mut r = MutationResult::Skipped;
let num = self.iterations(state, input);
for _ in 0..num {
let idx = self.schedule(state, input);
let extracted = self.extract(input); // extract part 1 or part 2 of your input
let outcome = self.mutations_mut().get_and_mutate(idx, state, extracted )?;
input.copy_back(extracted) // copy the mutated part back.
if outcome == MutationResult::Mutated {
r = MutationResult::Mutated;
}
}
I want to find a solution that is transparent to the wrapping mutation scheduler, so that it has full control over all mutators on all parts of the input (i.e. no further random selection opaque to the main scheduler).
I've got an idea, will report back once I know more. Thank you for your ideas so far, they really helped!
Alright, I've implemented what I think you suggested:
use std::borrow::Cow;
use libafl::{
inputs::{BytesInput, HasBytesVec, Input, UsesInput},
mutators::{MutationResult, Mutator},
Error,
};
use libafl_bolts::Named;
pub struct MappingMutator<IO, S> {
extractor: for<'a> fn(&'a mut IO) -> &'a mut Vec<u8>,
injector: for<'b> fn(&'b mut IO, &'b [u8]),
inner: Box<dyn Mutator<BytesInput, S>>,
}
impl<IO, S> MappingMutator<IO, S> {
pub fn new(
extractor: for<'a> fn(&'a mut IO) -> &'a mut Vec<u8>,
injector: for<'b> fn(&'b mut IO, &'b [u8]),
inner: Box<dyn Mutator<BytesInput, S>>,
) -> Self {
Self {
extractor,
injector,
inner,
}
}
}
impl<S, IO> Mutator<IO, S> for MappingMutator<IO, S>
where
S: UsesInput<Input = IO>,
IO: Input,
{
fn mutate(&mut self, state: &mut S, input: &mut IO) -> Result<MutationResult, Error> {
let extracted = (self.extractor)(input).clone();
let mut mapped_input = BytesInput::new(extracted);
let res = self.inner.mutate(state, &mut mapped_input);
(self.injector)(input, mapped_input.bytes());
res
}
}
impl<IO, S> Named for MappingMutator<IO, S> {
fn name(&self) -> &Cow<'static, str> {
&Cow::Borrowed("MappingMutator")
}
}
Used like this:
let mut stages = tuple_list!(StdMutationalStage::new(StdScheduledMutator::new(
tuple_list!(MappingMutator::new(
|input: &mut Base64Input| &mut input.raw_data,
|input, bytes| { input.raw_data = bytes.to_vec() },
Box::new(BitFlipMutator::new())
))
)));
However, I really don't like the copying the vector back and forth. But I also couldn't get this to work with just passing around references, no matter how I implemented a custom Input
type to use instead of the default BytesInput
, which needs an owned vec.
So:
havoc_mutations
to map MappingMutator
s over each, without having to do it manually?Is this what you suggested?
yes
Would you know if (and if yes: how) it is possible to implement something like this without the copying?
you can make each of the Mutation
take a range to select the stuff from, and then apply the mutation. but i wouldn't make it default.
How would I loop over the tuple defined in havoc_mutations to map MappingMutators over each, without having to do it manually?
i think you have to do it manually. (i'd ask chatgpt4/claude to write it for me)
@addisoncrump you have an idea?
also @domenukk
these Mutation
are kept simple and do the minimal things for a reason.
a testcase is scheduled to mutate to ~1000 times each round.
then one mutating involves ~128 stacked mutations.
the code in Mutation
is hot path. if you want to apply it for your own need, you should copy paste all and edit it.
i don't want to make these take extra parameters to do complicate stuff. (such as, selecting the ranges to mutate.. etc)
For the copying part, I'm working on a BytesSubMutator
that will allow us to use mutations on part of the input.
For the rest I don't have good answers atm :)
you mean BytesSubMutation
?
BytesSubInput, sorry. It can be used with any input then
See #2220 for BytesSubMutator
that will allow this without extra copies
Alright, I've had another play around. And I'm pretty sure what I'm trying to do is still not possible without reimplementing all mutators in havoc_mutations
. I cannot reasonably implement HasMutatorBytes
havoc_mutations
run on it directly, because the input consists of multiple parts and concatenating the parts to operate on is inefficient because it involves copying data around on each accessMultipartInput
, since not all parts actually have content and there is no way to implement it on content-less parts without keeping a dummy Vec<u8>
for those, which unnecessarily occupies memory and results in mutations that do nothing.I'm still thinking that a wrapper mutator, which extracts the specified part of the Input
and passes it to the inner mutator might be the only reasonable way to solve this issue. For this, I'd need something similar to BytesSubInput
, but without the requirement of HasMutatorBytes
on the outer Input
, instead taking a mapping closure.
Am I still missing something?
I don't fully understand why you want a mapping closure? Just implement the trait for your input - I mean, you need your input to implement HasMutatorBytes
to do havoc mutations, anyway - at least the ones that shrink or grow the input(?)
&mut Vec.into()
already implements HasMutatorBytes
, for example, so if you have a vec you can mutate it (and sub vecs with .sub_input()
.
You will need to keep track of the inputs, of course, so you'll need to write your own code, similar to MultipartInput
.
Other option would be to add the option to skip parts of the input to MultipartInput mutators?
I feel like an example would help (IIRC there was an example in the PR?)
I'm sorry for the confusion. I'll try to summarise my requirements for this custom Input
type again.
Conceptually, I'm doing differential fuzzing on two implementations of the same command line utility (in my case GNU coreutils vs. uutils coreutils). I want to ensure they behave the same, no matter what combination of command line arguments they are presented with. Manual testing could look something like this (from #2220):
echo "This is a stdin byte array" | binary_under_test --arg1_with_param "value of param for arg1" --arg2_without_param --arg3_without_param --arg4_with_param "some value"
echo "This is another byte array for stdin" | binary_under_test --arg1_with_param "mutated value of param for arg1"
echo "More stdin" | binary_under_test --arg2_without_param --arg3_without_param
I have a few types of parts that I need to have between 0
and n
, depending on the specific program I'm testing (examples on base64
):
base64
), which I think I'd model as a Vec<u8>
--help
)--wrap=COLS
), which I'd model either as a Option<Vec<u8>>
or a struct OptionalArg { data: Vec<u8>, active: bool }
I would like to mutate this custom Input
type in a few ways:
Mutator
, that's not the problem&mut Vec.into() already implements HasMutatorBytes
Hey, that's what I needed!
What I've currently got:
use std::borrow::Cow;
use libafl::{
inputs::{MutVecInput, UsesInput},
mutators::{MutationResult, Mutator},
Error,
};
use libafl_bolts::Named;
pub struct MappingMutator<S, M>
where
S: UsesInput,
for<'a> M: Mutator<MutVecInput<'a>, S>,
{
extractor: for<'a> fn(&'a mut S::Input) -> &'a mut Vec<u8>,
inner: M,
}
impl<S, M> MappingMutator<S, M>
where
S: UsesInput,
for<'a> M: Mutator<MutVecInput<'a>, S>,
{
pub fn new(extractor: for<'a> fn(&'a mut S::Input) -> &'a mut Vec<u8>, inner: M) -> Self {
Self { extractor, inner }
}
}
impl<S, M> Mutator<S::Input, S> for MappingMutator<S, M>
where
S: UsesInput,
for<'a> M: Mutator<MutVecInput<'a>, S>,
{
fn mutate(&mut self, state: &mut S, input: &mut S::Input) -> Result<MutationResult, Error> {
let mut mut_vec_input: MutVecInput = (self.extractor)(input).into();
self.inner.mutate(state, &mut mut_vec_input)
}
}
impl<S, M> Named for MappingMutator<S, M>
where
S: UsesInput,
for<'a> M: Mutator<MutVecInput<'a>, S>,
{
fn name(&self) -> &Cow<'static, str> {
&Cow::Borrowed("MappingMutator")
}
}
which is then used like this:
let mut stages = tuple_list!(StdMutationalStage::new(StdScheduledMutator::new(
tuple_list!(
MappingMutator::new(
|i: &mut Base64Input| i.extract_stdin(),
BitFlipMutator::new()
),
Base64FlipDecodeMutator,
Base64FlipIgnoreGarbageMutator,
Base64WrapContentMutator,
Base64FlipWrapMutator,
)
)));
The only thing I need now is a way to map this new MappingMutator
over all Mutators
in havoc_mutations
(which is a tuple_list
). Is there a way to do that without defining the entire list by hand?
(Even better, it's very easy to write something similar for optional args, as in where the extractor returns an Option<Vec<u8>>
)
I have just copy-pasted the entire list of havoc_mutations
in my project for now. There is an issue with CrossoverInsertMutator
and CrossoverReplaceMutator
, since they expect the Input
in the State
to implement HasMutatorBytes
, which is no longer the case if I'm mapping the inputs in my custom Mutator
s. That makes sense, they're just loading an Input
from the State
to insert/replace random parts from. I don't think there's a solution without changes to the implementations.
Happy you're getting somewhere! :)
Yes, splicing on your custom type cannot "just work", you could for example wrap these mutators with your own that grabs a parameter from the other input and just splices that part (or something). Or just levae them all. But all in all, splicing makes a lot of sense / finds new things
/edit: BTW: you can do havoc_mutations.into_vec()
to get a vector that you can then use for whatever you want, maybe that helps?
I will try to at some point create a PR with all this logic, with an interface which should make it very easy to mutate byte vector parts of custom inputs. I'd maybe also include a baby fuzzer example with a simple custom Input
, to explain what is happening. I think I'll just reimplement the two missing Mutators, this way I'll at least only have to duplicate two of the however many in havoc_mutations()
. But first, I've got to write a report before the semester ends. #studentlife
Would you be interested in such a PR?
Definitely sounds useful! The only question that remains (probably for @addisoncrump ) is if this cannot already be done with Multipart inputs, it still feels very closely related..
@addisoncrump If you can't be bothered to read the entire thread, here's the gist: Requirements and Problem
It is definitely possible to perform a mapping operation as you describe across a tuple. I'll implement the machinery for this real quick.
You will need to implement a custom version of crossover. That's just gotta happen, sorry!
Multipart is almost equipped to handle this, except for the no argument case. To handle this, I propose the following PR:
MutatorMapper
which accepts an input and returns a Option<Self::Output>
, where Output
is a associated type on MutatorMapper
. If the return is None
, we issue MutationResult::Skipped
to allow for fallible transformations.MutatorMapper
for Fn(T) -> Option<U>
.Then, we can implement what you describe here very easily with MultipartInput<ArgComponent>
where ArgComponent
is:
enum ArgComponent {
Arg(Arg),
Stdin(Vec<u8>)
}
enum Arg {
Help(Option<Vec<u8>>),
...
}
And then the mapper would simply transform the stdin or Arg to a mutable slice as needed.
It is definitely possible to perform a mapping operation as you describe across a tuple. I'll implement the machinery for this real quick.
Awesome, thank you!
You will need to implement a custom version of crossover. That's just gotta happen, sorry!
Fair enough.
For your proposal: I'm not sure where HasMutatorBytes
would come in, since this is what the mutators in havoc_mutations()
relies on?
I'm also not sure if the approach outlined here might not be better (at least it would certainly be more flexible, and I don't think it would be any less performant). This would not use MultipartInput
, but would allow you to combine mutators for arbitrary Input
types:
let mut stages = tuple_list!(StdMutationalStage::new(StdScheduledMutator::new(
havoc_mutations_mapped(|i: MyCustomInput| -> Vec<u8> {i.extract_required_input_part()})
.extend(havoc_mutations_mapped_optional(|i: MyCustomInput| -> Option<Vec<u8>> {i.extract_optional_input_part()})
.extend(havoc_mutations_mapped_optional(MyCustomInput::exteract_optional_input_part2) // more concise
.append(MyCustomInputToggleOptionalInputPartMutator::new()) // or any other futher mutators
)));
Here's the tuple mapping trait: https://github.com/AFLplusplus/LibAFL/pull/2247
For your proposal: I'm not sure where
HasMutatorBytes
would come in, since this is what the mutators inhavoc_mutations()
relies on?
You can directly return the Vec<u8>
(or the wrapping type, don't remember the name presently) by extracting it from the relevant component. The specifics of your implementation are of course up to you :p We're simply recommending Multipart as we think what you're describing can be made from existing components, but if you have a solution that works then use this.
Can this issue be closed?
I'll try to finish the implementation properly of this and create a PR, but it'll be a few more weeks, so maybe leave it open for now?
I have a custom
Input
type I'm doing fuzzing with. It contains two fields in the shape of aVec<u8>
. I'd love to be able to use havoc mutations on both separately, in addition to further custom mutators. Currently, all havoc mutators rely on theInput
implementingHasBytesVec
, which means that I can't apply them to two parts separately.I'd love to have, next to the now default
havoc_mutations()
, a method which takes a closure mapping between theInput
type and the (mutable) bytes slice used by the mutators.havoc_mutations()
could then just restrict itsInput
s toHasBytesVec
and call the new method with the extraction methods defined in its trait.