google / syzkaller

syzkaller is an unsupervised coverage-guided kernel fuzzer
Apache License 2.0
5.21k stars 1.2k forks source link

pkg/csource: better reproducers #1070

Open dvyukov opened 5 years ago

dvyukov commented 5 years ago

Some suggestions from the mailing list by Eric and Tetsuo:

- Replace bare system calls with proper C library calls.  For example: 

        #include <sys/syscall.h> 

        syscall(__NR_socket, 0xa, 6, 0); 

    becomes: 

        #include <sys/socket.h> 

        socket(AF_INET, SOCK_DCCP, 0); 

- Do the same for structs.  Use the appropriate C header rather than filling in 
  each struct manually.  For example: 

        *(uint16_t*)0x20000000 = 0xa; 
        *(uint16_t*)0x20000002 = htobe16(0x4e20); 
        *(uint32_t*)0x20000004 = 0; 
        *(uint8_t*)0x20000008 = 0; 
        *(uint8_t*)0x20000009 = 0; 
        *(uint8_t*)0x2000000a = 0; 
        *(uint8_t*)0x2000000b = 0; 
        *(uint8_t*)0x2000000c = 0; 
        *(uint8_t*)0x2000000d = 0; 
        *(uint8_t*)0x2000000e = 0; 
        *(uint8_t*)0x2000000f = 0; 
        *(uint8_t*)0x20000010 = 0; 
        *(uint8_t*)0x20000011 = 0; 
        *(uint8_t*)0x20000012 = 0; 
        *(uint8_t*)0x20000013 = 0; 
        *(uint8_t*)0x20000014 = 0; 
        *(uint8_t*)0x20000015 = 0; 
        *(uint8_t*)0x20000016 = 0; 
        *(uint8_t*)0x20000017 = 0; 
        *(uint32_t*)0x20000018 = 0; 

    becomes: 

        struct sockaddr_in6 addr = { .sin6_family = AF_INET6, .sin6_port = htobe16(0x4e20) }; 

- Put arguments on the stack rather than in a mmap'd region, if possible. 

- Simplify any calls to the helper functions that syzkaller emits, e.g. 
  syz_open_dev(), syz_kvm_setup_vcpu(), or the networking setup stuff.  Usually 
  the reproducer needs a small subset of the functionality to work. 

- For multithreaded reproducers, try to incrementally simplify the threading 
  strategy.  For example, reduce the number of threads by combining operations. 
  Also try running the operations in loops.  Also, using fork() can often result 
  in a simpler reproducer than pthreads. 

- Instead of using the 'r[]' array to hold all integer return values, give them 
  appropriate names. 

- Remove duplicate #includes. 

- Considering the actual kernel code and the bug, if possible find a different 
  way to trigger the same bug that's simpler or more reliable.  If the problem 
  is obvious it may be possible to jump right to this step from the beginning. 

Some gotchas: 

- fault-nth injections are fragile, since the number of memory allocations in a 
  particular system call varies by kernel config and kernel version. 
  Incrementing n starting from 1 is more reliable. 

- Some of the perf_event_open() reproducers are fragile because they hardcode a 
  trace event ID, which can change in every kernel version.  Reading the trace 
  event ID from /sys/kernel/debug/tracing/events/ is more reliable. 

- Reproducers using the KVM API sometimes only work on certain processors (e.g. 
  Intel but not AMD) or even depend on the host kernel. 

- Reproducers that access the local filesystem sometimes assume that it's ext4. 
Yes. I'm doing similar things. Other things not listed here are:

  Try to remove syscall() which passes EOF as fd argument, for it should be
  unrelated to the problem unless such call affects subtle timing.

  Try to remove code for testing fuse / tun etc. if the problem seems to be
  unrelated to fuse / tun etc.

syzbot gets pleased with finding one C reproducer, but I wish that syzbot
continues trying to find smaller C reproducers by e.g. eliminating unrelated
calls.

> 
> - Replace bare system calls with proper C library calls.  For example:
> 
>       #include <sys/syscall.h>
> 
>       syscall(__NR_socket, 0xa, 6, 0);
> 
>     becomes:
> 
>       #include <sys/socket.h>
> 
>         socket(AF_INET, SOCK_DCCP, 0); 

Yes. It would be nice if C reproducers are provided using symbols. I run
syzbot provided C reproducers under strace because strace gives me more hints
about symbols and structures.
dvyukov commented 3 years ago

A related issue #197

a-nogikh commented 2 years ago

Some more repro quality complaints:

I'll note that some of the unnecessary garbage, such as setting up and tearing
down a wireguard tunnel, or some other irrelevant cgroup setup, seems to be
templated code.   Why can't it be well-commented code which uses real variables
and structures, as opposed to filling in a char * array and passing a pointer
to an ioctl using a number as opposed to a symbolic name?     Since it's
auto-generated code, surely it wouldn't be that hard to have the template be in
human-readable C as opposed to only-a-step-above-machine-language C code?
a-nogikh commented 2 years ago

It also looks like our minimization does not always reliably remove unnecessary calls.

I have very often removed things such as a wireguard networking tunnel
setup/teardown when the stack trace was in ext4, and the bug turned out to be
not some kind of race or locking deadlock, but just a badly corrupted metadata
block for which the file system wasn't sufficiently paranoid.    Why couldn't
automation have figured this out for me?

Could it be related to the cases when the repro is not 100% reliable? Even if it's 90% reliable, it could already prevent normal minimization from happening.

Maybe repeat each test run during minimization several times and go forward if it's crashed at least once?

dvyukov commented 2 years ago

We remove the network device setup in very coarse-grained way (either remove all of it, or leave all of it). Likely 1 bit of it was needed and we kept all of it. Currently it's quite hard to make it finer-grained (also non-scalable and each new piece worsens the code). I've attempted to rework features support in syzkaller, but never managed to finish: https://github.com/dvyukov/syzkaller/commits/dvyukov-features-prepare https://github.com/dvyukov/syzkaller/commit/fc4b8fb553764bcbaabfbca4fbff806d4c558b12

dvyukov commented 1 year ago

cc @FlorentRevest

dvyukov commented 1 year ago

Other possible improvements for C reproducers:

  1. Move test syscalls to the top of the reproducer (may require some forward declarations). The syscalls is the varying part, these are usually more interesting than large static template parts.

  2. Move large binary blobs out-of-line. Namely, instead of memcpy(0x..., huge binary blob here), do something like:

const char image1[123456];
...
memcpy(0x..., image1, sizeof(image1));
...
// at the bottom of the file:
const char image1[] = "...huge binary blob here...";