profuzzbench / profuzzbench

ProFuzzBench - A Benchmark for Stateful Protocol Fuzzing
https://arxiv.org/pdf/2101.05102.pdf
Apache License 2.0
224 stars 52 forks source link

AFLnwe seeds timestamps are non-sequential #6

Open acidghost opened 3 years ago

acidghost commented 3 years ago

While doing some testing I noticed that the timestamps of the seeds stored in the queue folder of AFLnwe runs are non-sequential:

$ ll bftpd/out-bftpd-aflnwe-3/queue | head -n50
-rw-rw-r-- 1 acidghost acidghost   86 Mar  3 17:26 id:000000,orig:seed_1.raw
-rw------- 1 acidghost acidghost  108 Mar  5 13:05 id:000001,orig:seed_10.raw
-rw-rw-r-- 1 acidghost acidghost  109 Mar  3 17:26 id:000002,orig:seed_11.raw
-rw------- 1 acidghost acidghost  108 Mar  5 13:17 id:000003,orig:seed_12.raw
-rw------- 1 acidghost acidghost   56 Mar  5 13:07 id:000004,orig:seed_13.raw
-rw-rw-r-- 1 acidghost acidghost  102 Mar  3 17:26 id:000005,orig:seed_2.raw
-rw------- 1 acidghost acidghost  107 Mar  5 13:08 id:000006,orig:seed_3.raw
-rw-rw-r-- 1 acidghost acidghost  120 Mar  3 17:26 id:000007,orig:seed_4.raw
-rw------- 1 acidghost acidghost  139 Mar  5 13:10 id:000008,orig:seed_5.raw
-rw------- 1 acidghost acidghost   44 Mar  5 13:10 id:000009,orig:seed_6.raw
-rw------- 1 acidghost acidghost   44 Mar  5 13:10 id:000010,orig:seed_7.raw
-rw------- 1 acidghost acidghost   40 Mar  5 13:11 id:000011,orig:seed_8.raw
-rw------- 1 acidghost acidghost   40 Mar  5 13:11 id:000012,orig:seed_9.raw
-rw------- 1 acidghost acidghost    1 Mar  5 13:04 id:000013,src:000000,op:havoc,rep:64,+cov
-rw------- 1 acidghost acidghost   67 Mar  5 13:04 id:000014,src:000000,op:havoc,rep:8,+cov
-rw------- 1 acidghost acidghost   90 Mar  5 13:04 id:000015,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost  108 Mar  5 13:04 id:000016,src:000000,op:havoc,rep:4,+cov
-rw------- 1 acidghost acidghost   78 Mar  5 13:23 id:000017,src:000000,op:havoc,rep:4
-rw------- 1 acidghost acidghost   65 Mar  5 13:04 id:000018,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost   57 Mar  5 13:04 id:000019,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000020,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost   73 Mar  5 13:04 id:000021,src:000000,op:havoc,rep:4,+cov
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000022,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost  125 Mar  5 13:04 id:000023,src:000000,op:havoc,rep:16
-rw------- 1 acidghost acidghost   58 Mar  5 13:11 id:000024,src:000000,op:havoc,rep:4,+cov
-rw------- 1 acidghost acidghost   85 Mar  5 13:04 id:000025,src:000000,op:havoc,rep:16
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000026,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost  113 Mar  5 13:04 id:000027,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost  122 Mar  5 13:04 id:000028,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   62 Mar  5 13:04 id:000029,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   90 Mar  5 13:04 id:000030,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000031,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost  104 Mar  5 13:04 id:000032,src:000000,op:havoc,rep:16
-rw------- 1 acidghost acidghost   76 Mar  5 13:04 id:000033,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   77 Mar  5 13:04 id:000034,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   91 Mar  5 13:04 id:000035,src:000000,op:havoc,rep:64
-rw------- 1 acidghost acidghost   97 Mar  5 13:04 id:000036,src:000000,op:havoc,rep:16
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000037,src:000000,op:havoc,rep:2
-rw------- 1 acidghost acidghost   14 Mar  5 13:04 id:000038,src:000000,op:havoc,rep:128
-rw------- 1 acidghost acidghost   80 Mar  5 13:11 id:000039,src:000000,op:havoc,rep:16,+cov
-rw------- 1 acidghost acidghost   76 Mar  5 13:04 id:000040,src:000000,op:havoc,rep:4
-rw------- 1 acidghost acidghost   63 Mar  5 13:04 id:000041,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   16 Mar  5 13:04 id:000042,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost    7 Mar  5 13:04 id:000043,src:000000,op:havoc,rep:32
-rw------- 1 acidghost acidghost   36 Mar  5 13:04 id:000044,src:000000,op:havoc,rep:64
-rw------- 1 acidghost acidghost  100 Mar  5 13:04 id:000045,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   78 Mar  5 13:12 id:000046,src:000000,op:havoc,rep:4,+cov
-rw------- 1 acidghost acidghost   89 Mar  5 13:04 id:000047,src:000000,op:havoc,rep:8
-rw------- 1 acidghost acidghost   86 Mar  5 13:04 id:000048,src:000000,op:havoc,rep:2

The timestamps for the initials seeds (i.e. ending in .raw) are also not updated while pivoting them from the input seeds folder (see above as the dates are 3 and 5 Mar): https://github.com/aflnet/aflnwe/blob/113102a3ba552028e6fb0193cc2039503def7ef4/afl-fuzz.c#L3303

When generating plots the cut-off time may turn out wrong because it's based on the timestamp of the first initial (linked) seed:

https://github.com/profuzzbench/profuzzbench/blob/9962025db43cf93a261c22aee70c29f867ea2058/scripts/analysis/profuzzbench_plot.py#L37-L40 https://github.com/profuzzbench/profuzzbench/blob/9962025db43cf93a261c22aee70c29f867ea2058/subjects/FTP/BFTPD/cov_script.sh#L33-L34 https://github.com/profuzzbench/profuzzbench/blob/9962025db43cf93a261c22aee70c29f867ea2058/subjects/RTSP/Live555/cov_script.sh#L33-L34 https://github.com/profuzzbench/profuzzbench/blob/9962025db43cf93a261c22aee70c29f867ea2058/subjects/FTP/LightFTP/cov_script.sh#L40-L41 (this is the same for all targets/subjects)

This bug can cause the plots for AFLnwe to be totally wrong as they're cut too short. In the example above the difference in timestamps from the first seed and the others is in days (the run was 1 hour long), causing the plot for AFLnwe to only pick up some of the initial seeds.

This does not seem to be the case for AFLnet as it uses the replayable-queue where files are created from scratch.

thuanpv commented 3 years ago

Hi @acidghost,

Thank you for reporting these issues! As you pointed out, the old timestamp issue is caused by the link_or_copy function -- if a seed is linked to the queue folder, its timestamp is not updated. I have tried to modify the function so that it always copies seeds and the issue seems to be fixed.

Edited:

Thuan

thuanpv commented 3 years ago

I have completed an experiment for LightFTP, using the same commands written in README, and the graph shows a bit better result for AFLNwe. See the attached figure. cov_over_time

acidghost commented 3 years ago

I can confirm that the timestamps from the ls above was from an extracted archive and not from the container (cannot verify the contents of the container as I have deleted them).

If I try to check the timestamps in the generated CSV, I still see incoherent values (at least in the beginning): window_2021-03-15-14-44-25

Edit: I count 17 unexpected timestamps in this specific run

❯ awk -F, 'BEGIN {cnt=0;prev=0}; $4 == 3 && $3 == "aflnwe" && $5 == "b_abs" {if (prev > $1) { print(strftime("%T %F", $1)); cnt++} prev=$1}; END {print cnt}' bftpd/results.csv
17:26:56 2021-03-03
13:07:01 2021-03-05
17:26:56 2021-03-03
17:26:56 2021-03-03
13:07:01 2021-03-05
13:04:08 2021-03-05
13:04:13 2021-03-05
13:04:21 2021-03-05
13:04:41 2021-03-05
13:05:28 2021-03-05
13:07:07 2021-03-05
13:07:28 2021-03-05
13:09:17 2021-03-05
13:11:48 2021-03-05
13:16:50 2021-03-05
13:43:37 2021-03-05
13:49:36 2021-03-05
17
thuanpv commented 3 years ago

Can you please rebuild the Docker image and run the experiments again? And please let me know if you still observe the issues. Thanks.

acidghost commented 3 years ago

@thuanpv It seems like the issue is still there. The seeds don't seem to have extremely old timestamps, but the timestamps are still non-sequential (even some seeds have wrong timestamp):

$ for f in queue/id*; do echo "$(date -r $f +'%F %T %N') : $f"; done
2021-03-19 11:58:43 000000000 : queue/id:000000,orig:seed_1.raw
2021-03-19 12:00:35 000000000 : queue/id:000001,orig:seed_10.raw
2021-03-19 11:58:43 000000000 : queue/id:000002,orig:seed_11.raw
2021-03-19 11:58:43 000000000 : queue/id:000003,orig:seed_12.raw
2021-03-19 12:01:44 000000000 : queue/id:000004,orig:seed_13.raw
2021-03-19 11:58:43 000000000 : queue/id:000005,orig:seed_2.raw
2021-03-19 12:03:35 000000000 : queue/id:000006,orig:seed_3.raw
2021-03-19 11:58:43 000000000 : queue/id:000007,orig:seed_4.raw
2021-03-19 11:58:43 000000000 : queue/id:000008,orig:seed_5.raw
2021-03-19 12:05:52 000000000 : queue/id:000009,orig:seed_6.raw
2021-03-19 12:06:03 000000000 : queue/id:000010,orig:seed_7.raw
2021-03-19 11:58:43 000000000 : queue/id:000011,orig:seed_8.raw
2021-03-19 11:58:43 000000000 : queue/id:000012,orig:seed_9.raw
2021-03-19 11:58:52 000000000 : queue/id:000013,src:000000,op:havoc,rep:128,+cov
2021-03-19 11:58:52 000000000 : queue/id:000014,src:000000,op:havoc,rep:16,+cov
2021-03-19 11:58:53 000000000 : queue/id:000015,src:000000,op:havoc,rep:8,+cov
2021-03-19 12:06:42 000000000 : queue/id:000016,src:000000,op:havoc,rep:32,+cov
2021-03-19 11:58:53 000000000 : queue/id:000017,src:000000,op:havoc,rep:32
2021-03-19 12:06:53 000000000 : queue/id:000018,src:000000,op:havoc,rep:8
2021-03-19 11:58:54 000000000 : queue/id:000019,src:000000,op:havoc,rep:8
2021-03-19 11:58:54 000000000 : queue/id:000020,src:000000,op:havoc,rep:32
2021-03-19 11:58:55 000000000 : queue/id:000021,src:000000,op:havoc,rep:2,+cov
2021-03-19 11:58:55 000000000 : queue/id:000022,src:000000,op:havoc,rep:8
2021-03-19 11:58:56 000000000 : queue/id:000023,src:000000,op:havoc,rep:4
2021-03-19 11:58:56 000000000 : queue/id:000024,src:000000,op:havoc,rep:2
2021-03-19 11:58:57 000000000 : queue/id:000025,src:000000,op:havoc,rep:64
thuanpv commented 3 years ago

Thanks @acidghost. Interesting! In AFLnwe, we did not make any changes to AFL regarding its code to save test cases. Let us look into it more deeply.

rnatella commented 3 years ago

I did some investigation on this issue. When the file is created in the "queue" folder for the first time, it has the correct ctime. When AFL saves a new interesting input, the ctime for the "src" input file also gets updated. Note that the "src" input file is mmap-ed (even if the fd is read-only), and that it is munmap-ed by abandon_entry() when the new interesting input is saved.

It seems that timestamps with mmap are not really reliable, see also https://yarchive.net/comp/linux/mtime_mmap.html or https://apenwarr.ca/log/20181113 . I did not yet check why this issue surfaced with AFLnwe. As for the solution, we may want to save a second copy of the interesting inputs in another folder, or we may include the timestamp in the file name (but this may break some script).

rnatella commented 3 years ago

After more debugging, it turns out that there is no weird problem with mmap() and timestamps (fortunately!). When a new interesting input is found, it is immediately enqueued, but it is trimmed only when it is dequeued for further fuzzing. The trimmed test case is saved back on the disk, thus updating the timestamp.

I wrote a patch for AFLnwe to add a new option to take control of this. If you pass -z '-trimmed', it will save the trimmed test case in a separate file, with the same name of the original file suffixed with '-trimmed'. If you pass -z '', the trimmed test case is not saved. We may want to pass -z '' in run.sh when calling afl-fuzz.

@thuanpv can you kindly check the attached patch, and commit it on AFLnwe? I don't have permissions on the repo for a pull request.

save-trimmed-test-cases.patch.zip

maxammann commented 1 year ago

Uh this made me spend hours today on investigating. Sadly I only found this now.

Here is the patch for better visibility: https://github.com/trail-of-forks/aflnwe/commit/69cd1eb3b2fac7eeda2d248ffdf2b08e54cf1750

So as I can see by setting -z to an empty string no more trimmed test cases are written. That should work! I'll test this!