Make it easier to fuzz goroutine orderings

egonelbre commented 5 years ago

Currently finding logical races can be difficult, having something to fuzz different orderings in goroutine code could help.

There is a recent paper https://lifeasageek.github.io/papers/jeong-razzer.pdf to show one of such approaches.

josharian commented 5 years ago

For data races, this seems thoroughly in the domain of the race detector. And when the race detector is enabled, it causes extra randomization in the scheduler to try to flush out ordering bugs.

The race detector doesn't help with logical races (although at a glance, the paper appears to be about data races). However, go-fuzz it very much geared away from concurrent code; the first thing is does is set GOMAXPROCS=1. I suspect that making go-fuzz do this kind of work well would require some significant re-architecting.

On the other hand, it might be nice to have a sloppier mode in which go-fuzz doesn't collect coverage information, doesn't use sonar, doesn't set GOMAXPROCS=1, but does build with -race. Then you could use an existing corpus and an existing fuzz function also with go-fuzz's runner and mutation generation to try to flush out data races. This probably isn't too hard, but it does muddy the waters a bit...it's using go-fuzz more like quickcheck than as a fuzzer.

I'd be curious to hear @dvyukov's thoughts, and perhaps @everestmz as well.

dvyukov commented 5 years ago

-race is useful with GOMAXPROCS=1 as it uses abstract relations to detect races. So as the first step we could make -race supported in go-fuzz-build. Is it just passing the flag around?

As you noted runtime scheduler does some order randomization when race detector is enabled. That should also be useful with GOMAXPROCS=1.

To better provoke races runtime (or race runtime) should do more order randomization. That would be useful with go-fuzz and without go-fuzz. Fortunately for us theory says that we need preemption/rescheduling points only before synchronization actions (channel operations, Mutex methods, atomic operations). If rescheduling at non-synchronization actions makes any difference for the program, then this also should be detectable as a race by race detector. Hooking into synchronization actions should be doable, we already have some race hooks in all of them anyway (even in atomic ops). I only not sure if this should go into runtime (easier to do on the Go side initially) or into race runtime (will allow to reuse the same logic for all other languages as well). This is outside of context of go-fuzz. And should work out of the box.

josharian commented 5 years ago

I’ve added -race to go-fuzz-build. There’s no clear path for building more in go-fuzz, so I’m going to close this. @egonelbre, I know you know your way around the Go core, if you want to investigate doing more scheduler randomization in -race mode or just file an issue about it there.

thepudds commented 5 years ago

Very nice.

@egonelbre One quick comment -- to get alternate orderings at the application level, github.com/mschoch/smat ("State Machine Assisted Testing") is a nice small package that layers on top of go-fuzz to find interesting sequences of operations driven by coverage. You define a set of operations and probabilities, and smat translates from go-fuzz into a sequence of operations that get executed (e.g., see the example in the repo for using smat on bolt DB). This is not an entirely unique approach, but it is there for inspection and use if interested.

I believe you personally are very aware of that general technique, and good chance you are also aware of that particular package as well, but I mention it here in case (a) someone else stumbles across this issue, and (b) given -race support landed for go-fuzz about 7 hours ago, if you have an interesting problem at hand you likely could be the first person in the world to try smat with go-fuzz-build -race to see if it is a useful combination (or not).

In any event, I know that is not exactly what you had suggested here, but wanted to mention in case useful for anyone else reading this...

thepudds commented 4 years ago

Sorry, I know this is a closed issue, but I am curious if anyone has a working example of a Fuzz function that triggers a race detection with -race specified for go-fuzz-build.

I've tried a few simple examples, but have not been able to see it work yet. Might just be user error on my part, though.

Here is an example that does not trigger a race report by go-fuzz-build -race && go-fuzz:

func FuzzRace(data []byte) int {
        var shared int
        go func() { shared++ }()
        go func() { shared++ }()
        return shared
}

On the other hand, if you add a main and run it with go run -race ., it does trigger a race report:

func main() { FuzzRace(nil) }

thepudds commented 4 years ago

This might be related to GOMAXPROCS if go-fuzz is setting GOMAXPROCS=1.

Using that same example from the prior comment with a main func added, this does not trigger a race report, even if run in a bash loop for 100 iterations:

GOMAXPROCS=1 go run -race .

On the other hand, this reports a race immediately:

GOMAXPROCS=2 go run -race .

dvyukov commented 4 years ago

We could detect -race build and avoid setting GOMAXPROCS=1. This sounds reasonable because if one uses -race, they are interested in races, and if we don't detect races with GOMAXPROCS=1 it all becomes pointless. So even if not setting GOMAXPROCS makes some other things worse, it's still useful. However, I am sure some users will always use -race build assuming it's just "better", which is probably suboptimal. We could mention this in some docs.

Another alternative: not mess with GOMAXPROCS if it's already set in env. However, this won't fix -race by itself, so people will assume that are testing for races but they still don't. Maybe we could do both things? But that will penalize users who happen to have GOMAXPROCS exported always for some historical reasons.

thepudds commented 4 years ago

Maybe we could do both things?

That makes sense to me as an approach, or at least something we could try. If it works, then hopefully the documentation could point most people in the right direction, and/or maybe consider also emitting a short message describing what it is doing when running if the user has GOMAXPROCS=1 set after building with -race (to help educate people who might not read the documentation).

Another variation could be to error out if built with -race and then run with GOMAXPROCS=1 in the user's environment.

Would it make sense to re-open this bug, at least for discussion? Or I could open a new bug?

dvyukov / go-fuzz

Make it easier to fuzz goroutine orderings #231