testing: minimized crasher is unrelated to original

stevenjohnstone commented 3 years ago

What version of Go are you using (`go version`)?

$ go version
go version devel go1.18-7c648e2acb3 Thu Sep 9 17:28:03 2021 +0000 linux/amd64

Does this issue reproduce with the latest release?

n/a

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/stevie/.cache/go-build"
GOENV="/home/stevie/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/stevie/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/stevie/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/stevie/sdk/gotip"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/stevie/sdk/gotip/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="devel go1.18-7c648e2acb3 Thu Sep 9 17:28:03 2021 +0000"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/stevie/minimize/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2657391484=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Here's the simplest case I can find which reproduces the problem:

package minimize

import (
    "fmt"
    "testing"
)

//go:noinline
func branch1(i int) {
    b := []byte{1, 2, 3}
    fmt.Printf("%d\n", b[4])
}

//go:noinline
func branch2(i int) {
    panic(i)
}

func FuzzMinimizeToWrongCrash(f *testing.F) {
    f.Add(-20000) // this will force the fuzzer to crash in the the first branch
    f.Fuzz(func(t *testing.T, b int) {
        if b < -20000 {
            branch1(b)
            return
        }

        if b > 10000 {
            // the corpus will contain a crasher for this branch
            branch2(b)
            return
        }
    })
}

The test is rigged so it'll crash in branch1. Minimization kicks in and it ends up crashing in branch2:

$ ./minimize.test -test.fuzz=Fuzz -test.fuzzcachedir=./cache
found a crash, minimizing...
fuzzing, elapsed: 0.0s, execs: 85 (7363/sec), workers: 8, interesting: 0
--- FAIL: FuzzMinimizeToWrongCrash (0.01s)
        panic: %!s(int=18446)
        goroutine 57 [running]:
        runtime/debug.Stack()
            /home/stevie/sdk/gotip/src/runtime/debug/stack.go:24 +0x90
        testing.tRunner.func1.2({0x58c480, 0xc000118a68})
            /home/stevie/sdk/gotip/src/testing/testing.go:1288 +0x265
        testing.tRunner.func1()
            /home/stevie/sdk/gotip/src/testing/testing.go:1295 +0x225
        panic({0x58c480, 0xc000118a68})
            /home/stevie/sdk/gotip/src/runtime/panic.go:814 +0x207
        minimize.branch2(0x0)
            /home/stevie/minimize/fuzz_test.go:16 +0x37
        minimize.FuzzMinimizeToWrongCrash.func1(0x0, 0x480e)
            /home/stevie/minimize/fuzz_test.go:29 +0x86
        reflect.Value.call({0x58e720, 0x5c7c78, 0x13}, {0x5ba083, 0x4}, {0xc000116b10, 0x2, 0x2})
            /home/stevie/sdk/gotip/src/reflect/value.go:542 +0x814
        reflect.Value.Call({0x58e720, 0x5c7c78, 0xc000113a00}, {0xc000116b10, 0x2, 0x2})
            /home/stevie/sdk/gotip/src/reflect/value.go:338 +0xc5
        testing.(*F).Fuzz.func1.1(0x0)
            /home/stevie/sdk/gotip/src/testing/fuzz.go:389 +0x1c6
        testing.tRunner(0xc000194680, 0xc00010c980)
            /home/stevie/sdk/gotip/src/testing/testing.go:1342 +0x102
        created by testing.(*F).Fuzz.func1
            /home/stevie/sdk/gotip/src/testing/fuzz.go:378 +0x4e5

        --- FAIL: FuzzMinimizeToWrongCrash (0.00s)

    Crash written to testdata/corpus/FuzzMinimizeToWrongCrash/7219f755b8af33d79397c74214a3126d25f0664bbdac3dcd511325de4cc54063
    To re-run:
    go test minimize -run=FuzzMinimizeToWrongCrash/7219f755b8af33d79397c74214a3126d25f0664bbdac3dcd511325de4cc54063
FAIL
$ cat testdata/corpus/FuzzMinimizeToWrongCrash/7219f755b8af33d79397c74214a3126d25f0664bbdac3dcd511325de4cc54063 
go test fuzz v1
int(18446)

Instrumenting with this bpftrace script

uprobe:./minimize.test:"minimize.branch1" {
  printf("%d: branch 1 taken with value %d\n", tid, reg("ax"));
}
uprobe:./minimize.test:"minimize.branch2" {
  printf("%d: branch 2 taken with value %d\n", tid, reg("ax"));
}

I see that all the worker processes run branch1 first:

81977: branch 1 taken with value -20123
81965: branch 1 taken with value -20052
81959: branch 1 taken with value -20093
81952: branch 1 taken with value -20004
81959: branch 1 taken with value -20093
81959: branch 1 taken with value -20093
81959: branch 2 taken with value -1717988928
81959: branch 2 taken with value -171798893
81959: branch 2 taken with value 1271310299
81959: branch 2 taken with value -1161359159
81959: branch 2 taken with value -1404626105
81959: branch 2 taken with value -140462611
81959: branch 2 taken with value 2133437386
81959: branch 2 taken with value -216152991
81959: branch 2 taken with value 1266874889
81959: branch 2 taken with value 1844674407
81959: branch 2 taken with value 184467440
81959: branch 2 taken with value 18446744
81959: branch 2 taken with value 1844674
81959: branch 2 taken with value 184467
81959: branch 2 taken with value 18446
81968: branch 1 taken with value -20002
81989: branch 1 taken with value -20031
81948: branch 1 taken with value -20049

The tail of the log shows what looks like the negative value -20123 being cast to a large uint and then divided by 10 repeatedly to reach the value 18446 in the corpus. This may be another issue if the intention was to preserve the sign of integers on minimization :shrug:

What did you expect to see?

I expected to see the original crash recorded. The first and second crashes have different coverage and different panic messages/backtraces. There's enough information to label these as different crashes.

In a more complicated example, reducing one type of crash to another and then discarding the original wastes a lot of work. The fuzzer would need to be run again to find the original crash. It'd be really useful if the fuzzer retained the original crasher and provided a minimized version as a feature.

What did you see instead?

The original crasher is lost.

thanm commented 3 years ago

@jayconrod @katiehockman

toothrot commented 3 years ago

Checking in on this issue as it's labeled a release blocker for Go 1.18. Is there any update?

jayconrod commented 3 years ago

Not yet. There are a lot of fuzzing issues. We'll get to them as soon as we can.

katiehockman commented 3 years ago

I'm going to remove the release-blocker label for this issue. I think it's a nice to have, so I'll leave it for Go 1.18, but not strictly necessary for the release. We don't currently support any detection around what kind of crash occurred in order to replicate it during minimization. This is something that we'll need down the line: namely for deduplication of crashing so that we can run in a -fuzzcontinue mode that doesn't stop after the first crash. In the meantime, it is still reporting a crash, even if it isn't the first one found. Once that crash is fixed, then the fuzzer should be able to find the other one as well.

stevenjohnstone commented 2 years ago

@katiehockman I think the problem here is that minimization is considered successful if the new value has at least one coverage bit in common with the original. This can lead to two inputs with little in common e.g for this code

if len(input) >= 1 {
    if input[0] == 'A' {
          foo(input);
    } else {
          bar(input);
    }
}

[]byte{'B'} could be considered a minimization of []byte{'A', 1, 2, 3} because they both make it into the enclosing if body and so have a coverage bit in common.

I think a minimized input should never result in fewer coverage bits than we had for the original input. I experimented with enforcing this and the following patch appears to fix the bug:

diff --git a/src/internal/fuzz/coverage.go b/src/internal/fuzz/coverage.go
index 3dee73b81c..88f98a16b2 100644
--- a/src/internal/fuzz/coverage.go
+++ b/src/internal/fuzz/coverage.go
@@ -66,6 +66,17 @@ func countNewCoverageBits(base, snapshot []byte) int {
        return n
 }

+// isCoverageSubset returns true if all the base coverage bits are set in
+// snapshot
+func isCoverageSubset(base, snapshot []byte) bool {
+       for i, v := range base {
+               if v&snapshot[i] != v {
+                       return false
+               }
+       }
+       return true
+}
+
 // hasCoverageBit returns true if snapshot has at least one bit set that is
 // also set in base.
 func hasCoverageBit(base, snapshot []byte) bool {
diff --git a/src/internal/fuzz/worker.go b/src/internal/fuzz/worker.go
index e7d824bea1..b80e98bde7 100644
--- a/src/internal/fuzz/worker.go
+++ b/src/internal/fuzz/worker.go
@@ -908,7 +908,7 @@ func (ws *workerServer) minimizeInput(ctx context.Context, vals []interface{}, c
                        }
                        return true
                }
-               if keepCoverage != nil && hasCoverageBit(keepCoverage, coverageSnapshot) {
+               if keepCoverage != nil && isCoverageSubset(keepCoverage, coverageSnapshot) {
                        return true
                }

Given recent improvements in the effectiveness of the mutator by @rolandshoemaker I've been able to see that this problem with minimization actually works against the mutator. For example, this problem renders the mutator ineffective for examples which libfuzzer etc deal with easily just using coverage feedback e.g. the fuzzer can't find an input which terminates the following

var leet = []byte{1, 3, 3, 7}

func magic(input []byte) bool {
    if len(input) >= 4 && input[0] == leet[0] && input[1] == leet[1] && input[2] == leet[2] && input[3] == leet[3] {
        input = input[4:]
        return len(input) >= 4 && input[0] == leet[0] && input[1] == leet[1] && input[2] == leet[2] && input[3] == leet[3]
    }
    return false
}

func Fuzz(f *testing.F) {
    f.Fuzz(func(t *testing.T, data []byte) {
        if magic(data) {
            t.Fatalf("magic is %v", data)
        }
    })
}

With the above patch applied, the fuzzer finds a crasher right away. My interpretation is that the mutator is working to increase coverage but minimization can reduce coverage.

katiehockman commented 2 years ago

Thanks @stevenjohnstone for looking into this so deeply. One of us should be able to take a look at this in the next couple weeks as well, and hopefully can improve this before 1.18. If not, it'll happen for 1.19.

gopherbot commented 2 years ago

Change https://go.dev/cl/391454 mentions this issue: internal/fuzz: minimization should not reduce coverage

gopherbot commented 2 years ago

Change https://go.dev/cl/391614 mentions this issue: internal/fuzz: don't use dirty coverage maps during minimization

gopherbot commented 2 years ago

Change https://go.dev/cl/391797 mentions this issue: [release-branch.go1.18] internal/fuzz: don't use dirty coverage maps during minimization

gopherbot commented 2 years ago

Change https://go.dev/cl/391798 mentions this issue: [release-branch.go1.18] internal/fuzz: minimization should not reduce coverage

ianlancetaylor commented 2 years ago

CC @golang/fuzzing

Looks like this didn't make 1.19. Moving to backlog. Please recategorize as appropriate. Thanks.

golang / go