AFLplusplus / LibAFL

Advanced Fuzzing Library - Slot your Fuzzer together in Rust! Scales across cores and machines. For Windows, Android, MacOS, Linux, no_std, ...

Timeout for LLMP Clients #36

Closed domenukk closed 1 year ago

domenukk commented 3 years ago

LLMP, short for LowLevel Message Passing, has the concept of multiple clients, which are all connected to one broker. The broker broadcasts each new message it receives from a client (over that client's individual shared map) to all other clients via one shared broadcast map. So far, LLMP clients never get unregistered. It would be good to keep track of the last time a client sent a message, and unregister them accordingly. This could be done in the once method of the broker (which is called in a loop in the broker): https://github.com/AFLplusplus/LibAFL/blob/6e4c6a9b6b0db909853a19a68ba769bb2516f1bf/libafl/src/bolts/llmp.rs#L1349

On top, we may want to support a good-bye/unregister message from clients when they know they are done. This could be sent automatically from a Drop implementation.
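
A minimal sketch of the Drop idea (generics elided; the tag value is invented and the send_buf signature is an assumption about the client API, not necessarily what llmp.rs exposes):

```rust
// Illustrative only: the tag constant and send_buf(tag, bytes) are assumptions.
const LLMP_TAG_CLIENT_EXIT: u32 = 0xC1_1E47;

impl Drop for LlmpClient {
    fn drop(&mut self) {
        // Tell the broker this client is going away, so it can be unregistered
        // immediately instead of waiting for an inactivity timeout to expire.
        let _ = self.send_buf(LLMP_TAG_CLIENT_EXIT, &[]);
    }
}
```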

aga7hokakological commented 3 years ago

So far, LLMP Clients never get unregistered. It would be good to keep track of the last time a client sent a message, and unregister them accordingly.

What do you mean by unregister, and at what point should they be removed? Is it like removing the client from the llmp_clients vec? Will last_message_offset work to keep track, or should I use last_message_sent?

domenukk commented 3 years ago

Hey @aga7hokakological, yes, they should be removed from the broker's vec after the broker has not received a new message from them for n seconds. We'll need to keep the last recv time for each client. handle_new_msgs could return the number of new messages handled, for example.

aga7hokakological commented 3 years ago

Okay, got it. I will try to implement it.

aga7hokakological commented 3 years ago

I am a bit confused here. Say I add some time function to the LlmpMsg struct to get a time instant, and then in the once method I check whether the broker received a message or not and remove the client from the vec. Is that how it should be? Or maybe changing the register_client function to record the time in a last_msg_recvd variable, and then comparing against it in the once function to see whether a message has been received, might work. I just want to know what you think.

domenukk commented 3 years ago

The timing should be done locally in the broker IMHO; it should just keep track of when it received the last new message from each client. Less likely to break things, too.

domenukk commented 3 years ago

For example, you can have a last_msg_time_for_client: Vec<Duration> vec, and after this line: https://github.com/AFLplusplus/LibAFL/blob/6e4c6a9b6b0db909853a19a68ba769bb2516f1bf/libafl/src/bolts/llmp.rs#L1356 check if you got new messages; if so, update the current client's time, else check if it timed out.
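
A rough sketch of what that could look like in the broker loop, assuming handle_new_msgs is changed to return how many messages it handled (as suggested above) and the broker gains a last_msg_time_for_client: Vec<Instant> field (Instant rather than Duration just to keep the arithmetic simple); the timeout constant and the field names are made up:

```rust
use std::time::{Duration, Instant};

const CLIENT_TIMEOUT: Duration = Duration::from_secs(30);

// Inside the broker's once() loop; iterate in reverse so removing an entry
// does not shift the indices of clients that have not been visited yet.
for client_id in (0..self.llmp_clients.len()).rev() {
    // Assumed to return the number of messages handled for this client.
    let new_msgs = self.handle_new_msgs(client_id as u32, on_new_msg)?;
    if new_msgs > 0 {
        // We heard from this client: reset its timer.
        self.last_msg_time_for_client[client_id] = Instant::now();
    } else if self.last_msg_time_for_client[client_id].elapsed() > CLIENT_TIMEOUT {
        // Silent for too long: unregister it.
        self.llmp_clients.remove(client_id);
        self.last_msg_time_for_client.remove(client_id);
    }
}
```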

aga7hokakological commented 3 years ago

Well, the problem here is that I am only able to see 2 clients on my machine. Also, whenever I receive new test cases they are only from client 0. In the broker, only one client is there, so the message is passed to the broker and then to the other clients. Also, I guess I need to use std::time, because core::time alone doesn't work, as I need an Instant there to calculate the elapsed time. But I don't get how to check for a new message, as it is a struct and not some other data type such as bool.

domenukk commented 3 years ago

You get as many clients as you spawn; if you run more than two clients, you should see more than two clients(?), else we have a bug somewhere... Yes, the time can only be std.

aga7hokakological commented 3 years ago

How should I spawn more than 2 clients? "In any real-world scenario, you should use taskset to pin each client to an empty CPU core, the lib does not pick an empty core automatically (yet)." - as this line states, it should be done manually. But here, in RUST_BACKTRACE=full taskset -c 1 ./.libfuzzer_test.elf 2>/dev/null, taskset -c 1 is binding one CPU, and I see no other option to bind a CPU. If there is another way, please tell me.

aga7hokakological commented 3 years ago

But I don't get how to check for a new message, as it is a struct and not some other data type such as bool.

Also, this is quite confusing, as it cannot be used directly, and I think I'll have to rewrite the whole code in those functions to check for a new message. Or is using the message_id of the llmp_msg fine? Then I can keep an eye on messages per sender, so it could be done that way.

domenukk commented 3 years ago

-c 1 is binding to CPU 1. Just use -c 2, -c 3, ... for other CPUs. The test script is just there for some quick tests, don't be afraid to run the binary manually. So, build the fuzzer and then run it from the target directory multiple times.

WRT new messages, inside handle_new_msgs you know which client you are dealing with, and in this line: https://github.com/AFLplusplus/LibAFL/blob/6e4c6a9b6b0db909853a19a68ba769bb2516f1bf/libafl/src/bolts/llmp.rs#L1588 you know that a new message arrived; just add the current time to some data structure if that is the case, or prune the client if None was returned. Hope this helps :)
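
In other words, at roughly that spot the per-client bookkeeping could be sketched like this (the recv call and the field name are placeholders for whatever the real call and state end up being):

```rust
match client.recv()? {
    Some(_msg) => {
        // A new message arrived: remember when we last heard from this client.
        self.last_msg_time_for_client[client_id as usize] = Instant::now();
        // ... forward the message to the broadcast map as before ...
    }
    None => {
        // Nothing new this round; once() can later compare the stored time
        // against the timeout and prune the client if it stayed silent too long.
    }
}
```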

aga7hokakological commented 3 years ago

-c 1 is binding to CPU 1. Just use -c 2, -c 3, ... for other CPUs. The test script is just there for some quick tests, don't be afraid to run the binary manually.

I tried doing this and building, but the number of clients remains the same.

domenukk commented 3 years ago

Are you sure? Works for me. Maybe add some debug prints?

aga7hokakological commented 3 years ago

Okay, got it, I had to run it multiple times.

aga7hokakological commented 3 years ago

Can I get pull request access?

domenukk commented 3 years ago

@aga7hokakological I gave you access, please push to a branch and open a PR to dev :)

aga7hokakological commented 3 years ago

Hey @domenukk, sorry for disturbing you again, but target/release/example does not work for me. I have tried on Arch Linux as well as on Ubuntu. I was able to run it once, but then it stopped working. I tried running it the way you told me, but it just freezes at this:

    ./libfuzzer_libpng
    Workdir: "/home/myname/Desktop/LibAFL/target/release/examples"
    [libafl/src/bolts/llmp.rs:409] "We're the broker" = "We\'re the broker"
    Doing broker things. Run this tool again to start fuzzing in a client.

I have written the code, but without testing I cannot make a pull request. With ./test.sh from fuzzers/libfuzzer_libpng it works, but because that just spawns 2 clients I am not able to test it. I am trying to resolve it from my side; if I am missing anything, please tell me.

domenukk commented 3 years ago

Have you tried running it a second time, or multiple times? The first instance of the tool is the broker, the next ones then do the actual fuzzing.

aga7hokakological commented 3 years ago

Yes, I have tried running it multiple times, almost 10-15 times, but it just freezes.

aga7hokakological commented 3 years ago

It was giving a problem because of this panic:

panic!("Fuzzer-respawner: Storing state in crashed fuzzer instance did not work, no point to spawn the next client!");

I solved it. Now it is working fine and clients are spawning perfectly.

aga7hokakological commented 3 years ago

WRT new messages, inside handle_new_msgs, you know which client you are dealing with, and in this line:

https://github.com/AFLplusplus/LibAFL/blob/6e4c6a9b6b0db909853a19a68ba769bb2516f1bf/libafl/src/bolts/llmp.rs#L1588

you know that a new message arrived, just add the current time to some datastructure if this is the case. Or prune the client, if None was returned.

Here the function handle_new_msgs returns a Result. So I tried this code, but because the timeout is not working perfectly, it is unable to remove clients.

    let mut last_time_msg_sent_for_client = HashMap::new();
    compiler_fence(Ordering::SeqCst);
    for i in 0..self.llmp_clients.len() {
        let time = Instant::now();
        if let Ok(()) = self.handle_new_msgs(i as u32, on_new_msg) {
            if time.elapsed().as_millis() < Duration::new(0, 1).as_millis() {
                last_time_msg_sent_for_client.insert(self.llmp_out.id, SystemTime::now());
            }
        } else {
            // remove the client if no message was sent
            self.llmp_clients.retain(|client| client.id != i as u32);
        }
    }

maybe this line is not necessary

if time.elapsed().as_millis() < Duration::new(0, 1).as_millis()

s1341 commented 3 years ago

It was giving problem because of this panic.

panic!("Fuzzer-respawner: Storing state in crashed fuzzer instance did not work, no point to spawn the next client!");

I solved it. Now working fine and clients are spawning perfectly.

How did you solve this?

aga7hokakological commented 3 years ago

Hey @s1341

How did you solve this?

This was the full error:

    [libafl/src/bolts/llmp.rs:416] "We're the client" = "We\'re the client"
    [libafl/src/bolts/llmp.rs:416] e = Os { code: 98, kind: AddrInUse, message: "Address already in use" }
    Connected to port 1337
    [libafl/src/events/llmp.rs:553] "Spawning next client (id {})" = "Spawning next client (id {})"
    [libafl/src/events/llmp.rs:553] ctr = 0
    We're a client, let's fuzz :)
    First run. Let's set it all up
    We're a client, let's fuzz :)
    thread 'main' panicked at 'Failed to load initial corpus at ["./corpus"]: File(Os { code: 2, kind: NotFound, message: "No such file or directory" })', fuzzers/libfuzzer_libpng/./src/fuzzer.rs:176:14
    note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
    thread 'main' panicked at 'Fuzzer-respawner: Storing state in crashed fuzzer instance did not work, no point to spawn the next client!', /home/aga7hokakological/Saurabh/LibAFL/libafl/src/events/llmp.rs:568:21
    note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

It was because I was running it from the wrong directory; also, there was an error where it wasn't able to find libpng.tar.xz, so I downloaded it manually and put it in the folder.

domenukk commented 3 years ago

WRT new messages, inside handle_new_msgs, you know which client you are dealing with, and in this line: https://github.com/AFLplusplus/LibAFL/blob/6e4c6a9b6b0db909853a19a68ba769bb2516f1bf/libafl/src/bolts/llmp.rs#L1588

you know that a new message arrived, just add the current time to some datastructure if this is the case. Or prune the client, if None was returned.

Here the function handle_new_msgs returns a Result. So I tried this code, but because the timeout is not working perfectly, it is unable to remove clients.

    let mut last_time_msg_sent_for_client = HashMap::new();
    compiler_fence(Ordering::SeqCst);
    for i in 0..self.llmp_clients.len() {
        let time = Instant::now();
        if let Ok(()) = self.handle_new_msgs(i as u32, on_new_msg) {
            if time.elapsed().as_millis() < Duration::new(0, 1).as_millis() {
                last_time_msg_sent_for_client.insert(self.llmp_out.id, SystemTime::now());
            }
        } else {
            // remove the client if no message was sent
            self.llmp_clients.retain(|client| client.id != i as u32);
        }
    }

maybe this line is not necessary

if time.elapsed().as_millis() < Duration::new(0, 1).as_millis()

a) There is no need for a hashmap here; all clients are stored in a vec with fixed ids, so a vec will work here, too. b) You create a new map each time this function gets called. Instead, store the state in self so that it survives across calls.
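
Put differently, the timestamps could live next to the clients vec inside the broker itself; a sketch with generics, bounds, and all other existing fields elided:

```rust
use std::time::Instant;

struct LlmpBroker {
    llmp_clients: Vec<LlmpReceiver>,
    // Same index as llmp_clients; survives across once() calls because it is
    // part of the broker state rather than a local built on every call.
    last_msg_time_for_client: Vec<Instant>,
}

impl LlmpBroker {
    fn register_client(&mut self, client: LlmpReceiver) {
        self.llmp_clients.push(client);
        // Start the clock for the new client right away.
        self.last_msg_time_for_client.push(Instant::now());
    }
}
```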

aga7hokakological commented 3 years ago

a) There is no need for a hashmap here; all clients are stored in a vec with fixed ids, so a vec will work here, too. b) You create a new map each time this function gets called. Instead, store the state in self so that it survives across calls.

I did, but even though handle_new_msgs returns a Result, client.recv() is an Option; I need Some(msg) => msg, but I cannot access it outside the function. But if it contains Ok(()) from handle_new_msgs, that means it has been handled, so it should be pruned. My code ends as soon as I spawn the clients.

domenukk commented 3 years ago

Sorry, I don't really understand what you are trying to do here. Is it roughly what I described in https://github.com/AFLplusplus/LibAFL/issues/36#issuecomment-808882592 ?

aga7hokakological commented 3 years ago

I have declared another should_unregister variable in the broker as a bool. I also added this code, with some changes (just checking whether self.should_unregister is true or false), to the code above, to check whether it is Some or None in handle_new_msgs below msg, and it works. But when it is None, meaning a message has arrived, it should add the time to the vec; instead it removes the client. That's the only problem remaining. I mean, it works, but the opposite way.

    if msg.as_ref().is_none() {
        self.should_unregister = true;
    } else {
        self.should_unregister = false;
    }

If I interchange the true/false, then it removes the client when msg is None, but that should not be the case, right?

Srg213 commented 1 year ago

Hello. I have made a few changes to the LlmpBroker struct to add a Vec<Duration>, so that the data is stored in self itself. For testing the changes, I built the libfuzzer_libpng example as mentioned, but the following make command is throwing an error:

    make CC="$(pwd)/../target/release/libafl_cc" CXX="$(pwd)/../target/release/libafl_cxx" -j `nproc`

The error message is:

    thread 'main' panicked at 'Failed to run the wrapped compiler: Io(Os { code: 2, kind: NotFound, message: "No such file or directory" })', src/bin/libafl_cc.rs:29:14

I found that "$(pwd)/../target/release/libafl_cc" is a binary, not a directory. The exact path of the binary is fuzzers/libfuzzer_libpng/target/release/libafl_cc. As per the instructions, I think I am building in the correct directory, fuzzers/libfuzzer_libpng/libpng-1.6.37. Can you help me with this?

tokatoka commented 1 year ago

You can run cargo make run to get it working.

Srg213 commented 1 year ago

Actually, I used that command first. Then, to check where the build breaks, I followed the instructions from README.md. I tried it again today; it is still showing the same error.

tokatoka commented 1 year ago

Do you have clang installed?

Srg213 commented 1 year ago

Thanks a lot for the help. I have changed the LlmpBroker struct to add a Vec<Instant> to store the last_msg_time_for_clients. For now I added an arbitrary time duration after which clients are removed. Here are the changes. If the above changes are fine, could I get PR access?
Can you specify the exact time duration? And regarding the following:

On top, we may want to support a good-bye/unregister message from clients when they know they are done. This could be sent automatically from a Drop implementation.

I guess adding impl Drop for LlmpReceiver might be the correct way. Should I implement this trait?

domenukk commented 1 year ago

Thanks a lot for the help. I have changed the LlmpBroker struct to add a Vec<Instant> to store the last_msg_time_for_clients. For now I added an arbitrary time duration after which clients are removed. Here are the changes. If the above changes are fine, could I get PR access?

I think the changes are not enough - we may have to do more to safely unmap; for example, we may want to tell the client that we unmapped it, in case it is still alive. But it is a good start. You should just be able to open a PR, no need for any additional permissions(?)

Can you specify the exact time duration?

Let's do 30 seconds for now, we can adjust it later.

and regarding the following

On top, we may want to support a good-bye/unregister message from clients when they know they are done. This could be sent automatically from a Drop implementation.

I guess adding impl Drop for LlmpReceiver might be the correct way. Should I implement this trait?

Sounds good 👍

Srg213 commented 1 year ago

Thanks for the advice. To tell the client it is unmapped, a new message could be added on the shared map of the broker itself, but that would broadcast to all clients. I am looking for a way to send a message specifically to the receiver that is going to be unmapped. This could be done in the Drop trait or in the once function itself. Wait, I think adding the Drop trait for LlmpClient would make more sense, because LlmpReceiver is only a receiving end. Is this the correct approach?

domenukk commented 1 year ago

I think the broker could set a bit in the incoming map for this, as well.

domenukk commented 1 year ago

@Srg213 note that #1057 introduces a client timeout. It's not perfect (pages may not be cleaned up properly, clients don't notice when the broker is gone), but it's a start.

Srg213 commented 1 year ago

Okay. I see many changes were made for the timeout, like last_msg_time being added on the receiver side. But, as you said, clients don't notice when the broker is gone, and an unregister message for clients has not been added, I guess. Regarding cleaning up the pages, would it be feasible to include a parameter to know when either the sender or the receiver is dropped, so that the page could be dropped? The commits include a lot of changes to the project. Thank you for bringing it to my notice.

domenukk commented 1 year ago

So the original concept (never implemented) was to set sender_dead to 1, here. I think it could make sense to introduce an (atomic) ref count of how many users a page currently has, and clean up accordingly(?) https://github.com/AFLplusplus/LibAFL/blob/0727c80347d13f5adc05c77228fb0d6dc7de8c41/libafl/src/bolts/llmp.rs#L785

I don't have a perfect answer handy, else I'd have implemented it :)
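
One possible shape for such a ref count, kept inside the shared page header itself so that every mapping of the page sees the same value (field names and layout are illustrative, not the current llmp.rs definitions):

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Illustrative header only; the real LlmpPage has more fields and a different layout.
#[repr(C)]
struct PageHeader {
    sender_dead: AtomicU32,
    // How many senders/receivers currently have this page mapped.
    users: AtomicU32,
}

/// Called by every mapper (sender or receiver) when it is done with the page.
/// Returns true if the caller was the last user and may now unmap the shared
/// memory backing this page.
fn release(header: &PageHeader) -> bool {
    // fetch_sub returns the previous value, so 1 means we were the last user.
    header.users.fetch_sub(1, Ordering::AcqRel) == 1
}
```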

Srg213 commented 1 year ago

Yes, adding an atomic reference count (Arc) would make sense. In this case, I think we can create an iterative function to check whether sender_dead == 1, which might call the Drop trait on LlmpPage and clean up accordingly. This would incur runtime costs, as we would have to check iteratively. Or I guess we can introduce an additional wrapper object around LlmpPage which has a variable of type Arc. This could then be cloned according to the number of users. It would reduce the use of unsafe operations while referencing pages, and it could be easily dropped too. But it requires a lot of changes to the repo.

domenukk commented 1 year ago

Sounds good. A few things that come to my mind when thinking about it:

    page 1                        page 2                     page 3
    ^ receiver reads this page    ^ only mapped by sender    ^ sender writes to this page

Srg213 commented 1 year ago

Hi. I have tried adding a refcount to check the number of users mapped to a page, but I am not clear on whether the pages are always pruned and reallocated.

domenukk commented 1 year ago

This is now done :)