kcov(4) assisted fuzzing kernel

krytarowski commented 6 years ago

We are about to port kcov(4) to the NetBSD kernel, for use by fuzzers such as syzkaller and AFL/honggfuzz.

As an exercise I want to add an option to honggfuzz for fuzzing the ELF kernel loader.

I want honggfuzz to:

monitor dmesg for kUBSan and kASan reports,
receive feedback from a kcov(4) (SanCov) device,
generate random ELF file with just correct e_ident[0...4] in the beginning (magic number) and the rest randomized
execve(2) that random generated file, and stop on SIGTRAP / TRAP_EXEC or some error returned from execve(2); the new image will be loaded by the kernel, and never running in user-space as we want to SIGKILL immediately (PT_KILL).

How to specify this mode in command line arguments to honggfuzz? I am open to other suggestions how to design this mode. What SanCov features are must-have? I'm inspired by the list in hfuzz-cc.c

    if (isGCC) {
        if (useGccGE8()) {
            /* gcc-8 offers trace-cmp as well, but it's not that widely used yet */
            args[(*j)++] = "-fsanitize-coverage=trace-pc,trace-cmp";
        } else {
            /* trace-pc is the best that gcc-6/7 currently offers */
            args[(*j)++] = "-fsanitize-coverage=trace-pc";
        }
    } else {
        args[(*j)++] = "-Wno-unused-command-line-argument";
        args[(*j)++] = "-fsanitize-coverage=trace-pc-guard,trace-cmp,trace-div,indirect-calls";
        args[(*j)++] = "-mllvm";
        args[(*j)++] = "-sanitizer-coverage-prune-blocks=0";
        args[(*j)++] = "-mllvm";
        args[(*j)++] = "-sanitizer-coverage-level=3";
    }

krytarowski commented 6 years ago

CC: @dvyukov for syzkaller insight. Is there something that is missing in Linux's kcov(4) or could be done better?

CC: @vegard for AFL insight. What's the status of your patch to upstream ALF interface to kcov? Is it abandoned? https://lore.kernel.org/patchwork/patch/680373/ Is there still need for /dev/afl as mentioned in https://events.static.linuxfound.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf

CC: @r3x who is in the process of porting kcov(4) to NetBSD.

vegard commented 6 years ago

@krytarowski The kernel patch has not been accepted/included (yet), I don't think we made an attempt to resubmit it but it's been a while and I honestly don't remember. I think it can be considered abandoned for now, although I think it would probably take a very tiny effort to fix it up and get it in. Our patches to AFL itself were never accepted upstream either (or we wanted to wait for the kernel patch to get included first). On the other hand, it's really easy to make AFL use the vanilla kcov interface (although it should be slightly slower, in theory at least)...

To give my own answer for your question to @dvyukov, I think it would have been cool to support both location and operand collection in the same trace. Currently I think you need to choose between them (or run your code twice under different modes).

krytarowski commented 6 years ago

@vegard thank you for your feedback!

If we can make the device/interface better we can marge it with our tprof(4) utility and reuse the /dev/tprof device. It has some features such as handling larger chunks of data too big for mmap(2).. but not sure if it would be needed for SanitizerCoverage. If we will end up with the feature set of Linux we will rather struggle for compat with the existing API.

Additionally contrary to Linux, we can use the whole power of Clang/LLVM and go beyond GCC feature set. Although currently Kernel Sanitizers were ported against GCC (but handling them with Clang/LLVM shall be low effort if any).

dvyukov commented 6 years ago

Is there something that is missing in Linux's kcov(4) or could be done better?

Nothing large I can think of. One problem is that it's hard to figure out what's size of PCs in the trace from a 32-bit binary, it can be both 32-bits and 64-bits depending on kernel. Assuming you still care about 32-bits of course. 2 solutions that I see: (1) do 64-bits always, however 32-bit kernel probably cares about memory consumption much more than a 64-bit kernel, or (2) provide some explicit ioctl that says what's the format.

I think it would have been cool to support both location and operand collection in the same trace.

As far as I remember why we did separate collection is that we did not figure out a clean (read -- upstreamable) way to do combined collection. We can prefix each element with another word that contains type (PC/operands), but that double size for PCs. Or we could, say, set high bit on PCs, but that would encounter some questions upstream. In the end, I think separate collection is probably better because we don't want operands for every test case (processing operands then produces thousands of new candidates and we want to do this each time). We collect operands only for inputs added to corpus, which is rare, but otherwise we just collect trace.

krytarowski commented 6 years ago

@dvyukov thank you for your feedback!

Regarding the 32-bit application on 64-bit kernel we will research it. We have got some similar issues with at least ktrace(2).

We collect operands only for inputs added to corpus, which is rare, but otherwise we just collect trace.

How about trace-div or indirect-calls?

We have got a scratch version of kcov(4) with sole trace (pc). We have got functional kUBSan and kASan (e.g. https://github.com/NetBSD/src/commit/8968d7942f644782541c458b1bce308fbc986566).. probably everything aboard for fuzzing a kernel... OK still missing qemu-kvm so I will need to port it before using syzkaller myself - but it shall be taken by one of our developers sooner!

dvyukov commented 6 years ago

How about trace-div or indirect-calls?

We don't use them in syzkaller at the moment.

We have got a scratch version of kcov(4) with sole trace (pc). We have got functional kUBSan and kASan (e.g. NetBSD/src@8968d79).. probably everything aboard for fuzzing a kernel...

Sounds great!

OK still missing qemu-kvm so I will need to port it before using syzkaller myself - but it shall be taken by one of our developers sooner!

Do you mean changes to qemu itself? or syzkaller vm/qemu package? Does NetBSD work on GCE? If yes, we could start looking at deploying syzbot instance.

krytarowski commented 6 years ago

Do you mean changes to qemu itself? or syzkaller vm/qemu package?

I mean, Linux-like kvm device for hardware assisted virtualization. Right now the only option is Xen, but it's rather not suited for desktops. I've started porting Intel HAXM to NetBSD... but I had to reschedule it as it was too complex for a spare time effort.

Fuzzing with software emulation would be wasting CPU cycles.

I plan to work on it after getting my current toolchain work done.

Does NetBSD work on GCE? If yes, we could start looking at deploying syzbot instance.

There is support through https://github.com/google/netbsd-gce

Do you mean that we could run NetBSD on a Google machine? It would be great however we need to finish kcov(4) (probably within days) and we will pop up in the syzkaller repo. syzkaller is currently a spare time effort on our side.. so it's not progressing as quickly as it could be otherwise.

Personally, I won't start coding on syzkaller or trinity before getting kvm-like device aboard to NetBSD as host. As an exercise I will give a try to AFL/honggfuzz device fuzzing a kernel directly/natively.

dvyukov commented 6 years ago

Do you mean that we could run NetBSD on a Google machine?

Yes, we can host whole testing infrastructure on GCE. However, we will need few additional things like teaching syzkaller how to checkout and build NetBSD kernel (obviously required for continuous testing). See this post re OpenBSD (effective the same): https://groups.google.com/d/msg/syzkaller/50S8wRrPQzM/O09vNE-TAAAJ I assume that NetBSD can't be built on a linux machine, right? We can create a "master" machine running some fixed NetBSD version on GCE, and that machine will build test NetBSD kernels and create test VMs.

I understand that resource are limited on your side, but just to note: this is completely orthogonal to KCOV/KASAN. Even if we set it up without KCOV/KASAN, it will already start finding and reporting bugs, and it will allow to shake out various things in parallel with kernel work.

krytarowski commented 6 years ago

I will forward this internally. We will certainly need support from your side to setup it.

NetBSD can be built (theoretically) on any POSIX-like OS. This is one of the distinct properties of NetBSD.

Building process would be:

./build.sh tools
./build.sh kernel=GENERIC # modified or a distinct config with kUBSan, kASan and kCov enabled

dvyukov commented 6 years ago

Building on linux will definitely make things easier. Note note: we need to build not just kernel, but a whole image that itself can run on GCE. We have this logic for linux: fdisk, mkfs.ext4, copy pre-packaged user-space, copy kernel, install grub. I don't know if something similar will work for NetBSD. But if yes, then it will make things even simpler.

krytarowski commented 6 years ago

This will be possible, but it will need dedicated scripting. We are already building releases and generating ram disk or iso images on any (reasonable) platform through ./build.sh release (and its subtargets).

krytarowski commented 6 years ago

@R3x could you please have a look at this? With some sort of luck we don't need to do anything else beyond putting some extra syzkaller files for a fuzzed kernel (executor).

krytarowski commented 6 years ago

http://m00nbsd.net/4e0798b7f2620c965d0dd9d6a7a2f296.html Just a short update NetBSD gets native VMM API, right now just AMD CPUs, Intel ones are progress and qemu as frontend is in progress too.

krytarowski commented 5 years ago

We got a functional hardware assisted virtualization now for Intel CPUs with HAXM.

http://blog.netbsd.org/tnf/entry/the_hardware_assisted_virtualization_challenge

I've submitted an entry for AFL+kcov(4) fuzzing for GSoC and I have a student who is evaluating this project.

http://wiki.netbsd.org/projects/project/afl_filesystem_fuzzing/

If that will be done we can reuse it as a foundation for honggfuzz too.

robertswiecki commented 5 years ago

Cool, sounds promissing!

krytarowski commented 5 years ago

We have a person working on kcov(4)-based fuzzing on top of the http://wiki.netbsd.org/projects/project/afl_filesystem_fuzzing/ project.

Initial plan is to start with AFL, but we intend to experiment with honggfuzz here.

We intend to extend our kcov(4) driver for AFL. For performance reasons a translation between kcov(4) and AFL format shall not be achieved in userspace.

Questions:

What is the preferred honggfuzz coverage format? Is it fine to use AFL style hashmap? kcov(4)-style with full track? Something else?
We are wondering whether AFL supports other coverage than PC. Whether to implement a mode for AFL_PC, AFL_CMP, AFL_DIV, etc?
Would it be useful to support a dynamic size of hashmap? In the current kcov(4) implementations some syscalls are really deep.. sync(2) can gather something like 11M entries (almost 100MB of trace in kcov(4) format).

dvyukov commented 5 years ago

Re 3. I suspect what KCOV produces is not what is stored in the hashmap. KCOV produces a trace, which can have lots of duplicates. Most fuzzers are usually interested in non-duplicated PCs. If you dedup that sync trace, you should get few thousands at most.

krytarowski commented 5 years ago

@dvyukov I see! 99,9% of calls for sync are repeated calls in VFS, mutexes.

printf(9) is like 7k entries in the kcov(4) trace.

Is syzkaller interested in deduplicated PC entries only? The NetBSD kernel is a litte bit more noisy than Linux and running with 256MB kcov(4) buffer is probably an overkill for performance. We could optimize it on the kernel side and register only unique entries (rb tree) or almost unique ones (using a hashmap).

krytarowski commented 5 years ago

CC @mgrochow who works on this (but not sure if this is his current account).

dvyukov commented 5 years ago

Interesting question. For main fuzzing syzkaller dedups the trace and is only interested in unique PC pairs at the moment. We also have some tools that can dump the trace as is, which is sometimes useful for debugging (e.g. you can figure out the exact execution path in the kernel). But there is a number of ways how coverage can be aggregated. E.g. afl counters mode counts number of hits for each PC, we do hash of 2 adjacent PCs, but it may also be useful to do hashes of longer paths. What can also be used is full stack trace, or caller-callee PC pairs. Trace is the most flexible and allows to extract different kinds of secondary data. That's why we decided to go with the trace and don't want to lose it. But having said that, we could add a special KCOV mode that does roughly what syzkaller currently does in user-space. Then this will be a pure optimization. Later we could add another special mode when/if we want to do some other aggregation, or switch back to raw trace.

krytarowski commented 5 years ago

Thank you for your explanation.

Initially we planned to reproduce Oracle's work on AFL+KCOV [1] with a userspace translator of kcov(4) raw trace into AFL format.. but with very deep traces for millions of entries it's inefficient.

We will keep researching.

[1] https://events.static.linuxfound.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf

dvyukov commented 5 years ago

The AFL-style hashtable was meant to be implemented from day one, and Quentin even mailed patches, but it was never upstreamed: https://lkml.org/lkml/2016/11/16/668

krytarowski commented 5 years ago

Important unanswered question for us is whether AFL can use !PC (namely: CMP, DIV) traces as it will allow us to make better API and design choices in kcov(4).

Even if not, then using such traces in honggfuzz is still an added value.

robertswiecki commented 5 years ago

From experience, cmp buys a lot of coverage, which otherwise must be obtained via dictionaries of magic values.

I'm uncertain about what AFL supports, AFAIK it's something like (prev_pc << 16) & (pc & 0xFFFF)), but I looked at it only a long time ago. I'm also not sure how much its author is interested in updating AFL's source code, as he changed jobs a year ago, and as far as I know he didn't updated the code since.

gotoco commented 5 years ago

Hi @dvyukov, thank you for your response.

I wanted to show first what we faced in terms of long stack traces, and why that might be something that we may improve. However,, it is not necessarily an issue that other folks (outside of NetBSD community) might be deeply interested (So you can feel free to jump to the second part) in. Then in the second part of this comment, I will go over some question that we have regarding of way of storing trace for the fuzzer.

First part long stack traces (or PC shadow trace)

Mainly we figure out that in our tracing we get much more UVM code that we expected. To present that, I will show a simple test and its output: Here we have kcov test C file we enable tracing, run read system call, stop tracing, and print output to stdout.

https://gist.github.com/gotoco/5e3a88e671f5377e2d18b4fe8c473c16

Running this code as ./kcov_test | addr2line -e /netbsd.gdb > output.txt produce 1M file, which you can see here

https://gist.github.com/gotoco/e9aafa295d340c55c077ff514c035177

At the very beginning, this trace contains syscall path, from syscall code to the userret

/root/workspace/src/sys/arch/x86/x86/syscall.c:107
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:70
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:71
/root/workspace/src/sys/arch/x86/x86/syscall.c:111
/root/workspace/src/sys/arch/x86/x86/syscall.c:138
/root/workspace/src/sys/sys/syscallvar.h:75
/root/workspace/src/sys/sys/syscallvar.h:75 (discriminator 4)
/root/workspace/src/sys/sys/syscallvar.h:75 (discriminator 6)
/root/workspace/src/sys/sys/syscallvar.h:84 (discriminator 1)
/root/workspace/src/sys/sys/syscallvar.h:84 (discriminator 3)
/root/workspace/src/sys/sys/syscallvar.h:84 (discriminator 6)
/root/workspace/src/sys/sys/syscallvar.h:86
/root/workspace/src/sys/sys/syscallvar.h:64
/root/workspace/src/sys/kern/sys_generic.c:110
/root/workspace/src/sys/kern/kern_descrip.c:375
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:70
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:71
/root/workspace/src/sys/kern/kern_descrip.c:378
/root/workspace/src/sys/kern/kern_descrip.c:415
/root/workspace/src/sys/kern/sys_generic.c:113
/root/workspace/src/sys/kern/sys_generic.c:121
/root/workspace/src/sys/sys/syscallvar.h:68
/root/workspace/src/sys/sys/syscallvar.h:97
/root/workspace/src/sys/sys/syscallvar.h:97 (discriminator 2)
/root/workspace/src/sys/sys/syscallvar.h:97 (discriminator 4)
/root/workspace/src/sys/sys/syscallvar.h:97 (discriminator 6)
/root/workspace/src/sys/sys/syscallvar.h:100
/root/workspace/src/sys/sys/syscallvar.h:100
/root/workspace/src/sys/arch/x86/x86/syscall.c:145
/root/workspace/src/sys/arch/x86/x86/syscall.c:159
/root/workspace/src/sys/arch/x86/x86/syscall.c:166
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/userret.h:81
/root/workspace/src/sys/sys/userret.h:83
/root/workspace/src/sys/sys/userret.h:94
/root/workspace/src/sys/sys/userret.h:97
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:80
/root/workspace/src/sys/sys/userret.h:119
/root/workspace/src/sys/sys/userret.h:120

But then we get more tracing from the kernel, which may be the fault for user memory in the context of our task, or we just got more code before we return to the userspace. I am not 100% sure here, definitely we need to investigate it more.

/root/workspace/src/sys/arch/amd64/amd64/trap.c:264
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:70
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:71
/root/workspace/src/sys/arch/amd64/amd64/trap.c:277
/root/workspace/src/sys/sys/lwp.h:300
/root/workspace/src/sys/sys/lwp.h:300
/root/workspace/src/sys/arch/amd64/amd64/trap.c:286
...
/root/workspace/src/sys/arch/amd64/amd64/trap.c:512
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./x86/cpufunc.h:174
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./x86/cpufunc.h:174
/root/workspace/src/sys/arch/amd64/amd64/trap.c:517
...
/root/workspace/src/sys/arch/amd64/amd64/trap.c:549
/root/workspace/src/sys/uvm/uvm_fault.c:817
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:58
/root/workspace/src/sys/arch/amd64/compile/obj/GENERIC/./machine/cpu.h:59
/root/workspace/src/sys/uvm/uvm_fault.c:861
...

So the example above may explain a little bit why we have too long stack traces, however, there are still some things that we would like to understand, regarding the format and its size that we need to store.

Second part (Coverage format):

We had an internal discussion about the potential size of the buffer to store traces inside the kcov.

One factor why we wanted to make it larger was the trace from sync(2) which was huge and contained a lot of VFS code related to the vnodes operations. That also was the specific case as machine contained a lot of files in FS mainly because it had sources for building the kernel, and VFS code iterated over vnodes.
- The second factor is the fact that Linux based kcov allows the user to allocate memory in order of GB using vma (see code below)

  static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
/*...*/

    switch (cmd) {
    case KCOV_INIT_TRACE:
/*...*/
        /* Size must be at least 2 to hold current position and one PC.
         * Later we allocate size * sizeof(unsigned long) memory,
         * that must not overflow. */
        size = arg;
        if (size < 2 || size > INT_MAX / sizeof(unsigned long))
            return -EINVAL;
        kcov->size = size;

Now let's forget about the Linux implementation for the moment, and focus on the format (which mainly affect the size). Based on the previous discussion I am trying to understand if we should keep the full PC inside our buffer (we may reduce it a little bit if we find out that we have some issue in our code). Also, it is not that difficult to create in parallel AFL specific code can do the bit trick as Quentin did:

        ++area[(kcov->prev_location ^ location) & kcov->mask];
        kcov->prev_location = hash_long(location, BITS_PER_LONG);

Another interesting option would be as you mentioned to store unique PC1->PC2 pairs. That seems attractive from the memory usage and performance perspective.

Inside Filesystem code, we have some operations that loop over some structures or disk blocks, so by design stack traces (or shadow of IP branches execution to be more precisely) might be significant, even if we fix the UVM information.

Here is an example to illustrate the problem: let's assume a simple guessing game implemented as kernel function that read input (i.e., 6 letter+) from UserSpace and compares it with a pattern. Because it is too hard we want to help a user so we will give few chances to guess one byte:

static int
guessing_game_read(dev_t self, struct uio *uio, int flags)
{
    /* Move the string from user to kernel space and store it locally and do more stuff* /
    ...
    uiomove(buffer, len, uio);

    if (printlen > 6) {
        /* We help user 100 times for first letter to get it right */
        for (int i = 0; i < 100; ++i) {
                if (buffer[0]+i != 'L' || buffer[1] != 'o' || buffer[2] != 't' ||
                    buffer[3] != 't' || buffer[4] != 'e' || buffer[5] != 'r' ||
                    buffer[6] != 'y') {
                if (i == 99)
                    printf("#: I'm sorry you lost...\n");
            }
            else {
                printf("#: YOU won the GAME!! \n");
                        }
        }
    }

    return 0;
}

Tracing the game function will trigger a lot of duplicates as we help the user 100 times with the first letter. Now that means Trace will be 100 times longer as it may contain i.e., 99 repetitions and one interesting branch. If we switch from raw IP traces to the branches pairs PC1->PC2 then we may reduce duplicates and focus only on the uniques which may mean in our example 100 times smaller information, but Fuzzer will still have full knowledge that is needed. Is the above statement correct? (sorry for a little bit long divagation).

Also, it is essential for us that we want to be compatible with not just only AFL, but other Fuzzers: namely speaking honggfuzz support is on our map.

dvyukov commented 5 years ago

Re too much coverage, for linux we disable instrumentation of some files that (1) give too much uninteresting coverage or (2) give flaky coverage (mutexes, etc, coverage there they is almost never a function of syscall arguments). But there is an obvious trade-off between blacklisting too much and getting enough coverage. I don't know if there are some obvious opportunities here for netbsd, but something to check. Because that may be a low hanging fruit. Also re UMV (that's page faults, right?) I am thinking if we could pre-fault more memory ahead of time? It should be mostly pre-faulted by writing input arguments, but still there may be more opportunities.

dvyukov commented 5 years ago

Also, it is not that difficult to create in parallel AFL specific code can do the bit trick as Quentin did: ... Another interesting option would be as you mentioned to store unique PC1->PC2 pairs.

I am not completely following. What I meant for unique PC pairs, is exactly the same as the Quentin code does (and what syzkaller does). So that's not another option, that's the same option, right?

That seems attractive from the memory usage and performance perspective.

Also here. This looks orthogonal to memory usage. We may dedup or not PC pairs, and we may equally dedup or not single PC's.

If we switch from raw IP traces to the branches pairs PC1->PC2 then we may reduce duplicates and focus only on the uniques which may mean in our example 100 times smaller information, but Fuzzer will still have full knowledge that is needed. Is the above statement correct?

Doing pairs or single PCs and deduping look like orthogonal things to me.

krytarowski commented 5 years ago

Regarding UVM. I'm not completely sure here with my statement, but as far as I can tell UVM was designed to be as lazy as possible and we fault in the last possible moment, so it tracks UVM trap code in the trace because of this design choice.

gotoco commented 5 years ago

@dvyukov To clarify:

We currently store raw PC. I saw your previous comment

 For main fuzzing syzkaller dedups the trace and is only interested in unique PC pairs at the moment.
 We also have some tools that can dump the trace as is, which is sometimes useful for debugging 
(e.g. you can figure out the exact execution path in the kernel)

So I thought, maybe storing unique PC pairs instead of the large trace by default will be a better idea:

it will follow your format
Less memory, faster code

gotoco commented 5 years ago

I am just trying to understand if storing full trace (list of all sequential branches PCs) have any benefit (except debugging case) over just storing the pairs of incomming unique PCs.

dvyukov commented 5 years ago

Regarding UVM. I'm not completely sure here with my statement, but as far as I can tell UVM was designed to be as lazy as possible and we fault in the last possible moment, so it tracks UVM trap code in the trace because of this design choice.

Yes, but if we write to this page from user-space before the syscall, it should not be faulted during the syscall then? We should write-touch most of the pages already when we write input arguments, so maybe it's not a big deal.

dvyukov commented 5 years ago

I am just trying to understand if storing full trace (list of all sequential branches PCs) have any benefit (except debugging case) over just storing the pairs of incomming unique PCs.

Does have now? Or may have in the future? :) It does not have any benefit at the moment (syzkaller does the same anyway, it does not matter where we do this processing -- in the kernel or in user-space). May it have benefit in the future? It may. We may want to use afl-style counters, or the other options I mentioned (stack trace, caller-callee, stack depth, etc). But there are no immediate plans for that. And what exactly coverage mode is the best for kernel is an open research question. Also as I mentioned, if we do a new KCOV mode that captures what syzkaller does now, it does not force us to use this mode forever. If we will want to try a new coverage mode, we may switch back to tracing mode again. And when we settle with the new coverage mode, we can add yet another KCOV mode. So in this sense we do not burn bridges. The only cost if leaving some unused legacy kernel code behind in the form of KCOV coverage modes.

krytarowski commented 5 years ago

The only cost if leaving some unused legacy kernel code behind in the form of KCOV coverage modes.

We still can gain performance (but hard to tell exact numbers without benchmarks). We can generate trace in the target format directly and we will remove overhead for recoding the format and large buffer transfers.

The concerns about KCOV API design are real and that's why we prompt for help.

As it's rather difficult to make upfront design for all the APIs without technical debt for all the needs and not leave some cruft behind us.... I think we can try to change our aim and instead of mutating KCOV for certain fuzzers, we can try to have an option to insert kernel modules with extensions that will change the default behavior for our needs and keep these modules out of the src/.

vegard commented 5 years ago

The AFL-style hashtable was meant to be implemented from day one, and Quentin even mailed patches, but it was never upstreamed: https://lkml.org/lkml/2016/11/16/668

Yes, this patch implements AFL-style tracing. The crucial bit is:

+       ++area[(t->kcov_prev_location ^ location) & t->kcov_mask];
+       t->kcov_prev_location = hash_long(location, BITS_PER_LONG);

which basically increments a counter associated with the (prev IP, IP) pair. For the equivalent function for regular AFL, see https://github.com/mirrorer/afl/blob/master/docs/technical_details.txt#L30:

  cur_location = <COMPILE_TIME_RANDOM>;
  shared_mem[cur_location ^ prev_location]++; 
  prev_location = cur_location >> 1;

(so it is not in fact 100% the same, but similar).

As far as I remember, the patch linked was very close to what we actually ended up using for the presentation, and I think it should not be too difficult to rebase it on the latest kernel. We ought to try to upstream it again... maybe we just didn't hit the right person.

Cc @casasnovas

krytarowski commented 5 years ago

https://blog.netbsd.org/tnf/entry/write_your_own_fuzzer_for

casasnovas commented 5 years ago

Hi Michael,

Sorry for the late answer, and thanks for sharing your blog post, interesting read!

If that's still of interest, the branch we were using to fuzz the Linux kernel with AFL can be found at :

https://github.com/casasnovas/afl/commit/5bb409ba0bc6f0739beac889c8160c24aa3b20ef

Nice work! Q

On Sun, Jun 30, 2019 at 1:28 AM mgrochowski notifications@github.com wrote:

@vegard https://github.com/vegard @casasnovas https://github.com/casasnovas -> Is the modified by you guys AFL repository available somewhere online?

We were thinking to try our 'new' interface to works in the same way as your Linux work.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/google/honggfuzz/issues/225?email_source=notifications&email_token=AAASTZZFEKAWCJRH2OHLXBDP47VZHA5CNFSM4FQ57GAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODY4B5TQ#issuecomment-506994382, or mute the thread https://github.com/notifications/unsubscribe-auth/AAASTZZXWNNZD32QNBEILU3P47VZHANCNFSM4FQ57GAA .

-- Quentin Casasnovas

krytarowski commented 5 years ago

Fuzzing NetBSD Filesystems via AFL. [Part 2] http://blog.netbsd.org/tnf/entry/fuzzing_netbsd_filesystems_via_afl

I used honggfuzz for rumpkernel fuzzing:

Rumpkernel assisted fuzzing of the NetBSD file system kernel code in userland http://netbsd.org/~kamil/rump/rump_pub_etfs_register_buffer.c

robertswiecki commented 3 years ago

Spring cleaning, please re-open if needed

google / honggfuzz

kcov(4) assisted fuzzing kernel #225

First part long stack traces (or PC shadow trace)

Second part (Coverage format):