I think it would be easier to evaluate the idea if it were slightly less abstract.
For example:
- FuzzXxx(f *testing.F, data []byte) is the fuzz function signature
- data is the input provided by the fuzzing driver
- failures are reported via the testing.F argument in the usual way
- f.Useful() may be called to indicate useful data, i.e., data that parses correctly
- f.Discard() may be called to indicate that the data should be discarded
- go test -fuzz=. runs the fuzz functions using a regexp like -test and -bench
- go test -fuzz must also rebuild the package in fuzz mode; the result presumably lands under $GOROOT/pkg, but where?
@ianlancetaylor, yes, FuzzXxx(f *testing.F, ...) is what this is about. The exact API is probably TBD.
I think the first step before it's designed completely is to determine whether there's interest.
As a general concept, I'm in favor.
I would expect that there would be an additional required flag (when fuzzing) where you specify the corpus directory.
Can we just cache the corpus somewhere under $GOROOT/pkg? Are there cases where a typical user would be expected to modify the corpus themselves?
I think it's wrong to think of the corpus as strictly a cache. The corpus is the saved state of the fuzzer, and the go-fuzz documentation even recommends committing it to a version control system. The pkg directory, by contrast, is treated strictly as a cache, and it is not uncommon for people to recommend clearing it out, which would unfortunately delete the fuzzer state.
A specified corpus directory is not so much for the user to modify the corpus themselves as for them to specify where the corpus data is persisted.
Could there be some default convention, say a _fuzz/xxx directory (where xxx corresponds to FuzzXxx), and a method on the *testing.F object to load a different corpus from the _fuzz/ directory if necessary? It seems like it should just know where the corpus is.
I'm in favor. Efficient fuzzing usually requires some help from the compiler, so it's better to build this into the standard library. (Compiler instrumentation will be more efficient than go-fuzz's source-code instrumentation. I also want to have cmd/cover built on compiler instrumentation to support branch coverage, but that's off-topic for this issue.)
How about adding some methods to testing.T (or perhaps inventing a new testing.Fuzz to replace testing.T, but see below) in fuzz tests? One of the methods could set the corpus location (we can recommend it be saved under testdata), but we should probably also have a command-line flag to override the test's setting. One compromise is to make both optional: introduce -test.fuzzdir to hold the corpus path for all fuzz tests, defaulting to testdata/fuzz if not provided, and add (*testing.T).FuzzDir("parser") as an optional call to set the corpus directory prefix for this test; if relative, it is resolved relative to the -test.fuzzdir value.
To make the feature more useful, I suggest we still use testing.T so that it's easy to migrate fuzz-found test cases into a (table-driven) regular test. Making fuzz tests take a testing.T will facilitate this.
Quoting @dvyukov
I would appreciate if you drop a line there if you found fuzzing useful and a brief of your success story.
It was very useful for me - found bugs in several lexers.
I use it regularly on a lexer/parser/formatter for Bash (https://github.com/mvdan/sh).
Having it be a first-class citizen would simplify things for me and for contributors.
Found a bug in the C decoder for google/brotli by fuzzing a Go implementation of a Brotli decoder.
Also found some divergences in Go bzip2 decoders from the canonical C decoder (this and #18516). All by fuzzing.
My coworker at DigitalOcean was working on a side project to make fuzzing easier. Check his repo out here: https://github.com/tam7t/cautious-pancake Adding it here as I think it would be a valuable piece of information for this discussion.
The README for go-fuzz lists a number of "Trophies", ( https://github.com/dvyukov/go-fuzz#trophies ) the majority of which are from the standard library, but about 20% of which are external to the Go standard libraries.
A GitHub search for Go source files with the gofuzz build tag gives ~2500 results: https://github.com/search?l=Go&q=gofuzz&type=Code&utf8=%E2%9C%93
My tutorial on fuzzing ( https://medium.com/@dgryski/go-fuzz-github-com-arolek-ase-3c74d5a3150c ) gets 50-60 "reads" per month (according to medium's stats).
A feature that would also be important (at least for me) is the ability to easily turn selected fuzz test cases into permanent tests. The simplest way to do that would be exporting the case data as a Go byte array and calling the FuzzXXX function from a TestXXX function, but if FuzzXXX accepts a *testing.F argument that won't be possible.
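For illustration, a rough sketch of that pattern with today's go-fuzz-style signature; parse, the package name, and the sample input are hypothetical:

package parser

import "testing"

// parse is a hypothetical stand-in for the code under test.
func parse(data []byte) error { return nil }

// Fuzz is a go-fuzz style entry point: plain arguments, no *testing.F.
func Fuzz(data []byte) int {
	if err := parse(data); err != nil {
		return 0
	}
	return 1
}

// crashInput is a fuzz-found case exported as a Go byte slice.
var crashInput = []byte("\x00\x01\xffcorrupted header")

// TestFuzzRegression replays the exported case on every go test run.
// This only works because Fuzz takes a plain []byte; it would not be
// possible if Fuzz required a *testing.F supplied by the test framework.
func TestFuzzRegression(t *testing.T) {
	Fuzz(crashInput)
}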
Yes, we've found fuzzing useful in our projects multiple times. Especially in sensitive code, the fuzzer will frequently find edge cases that we missed: encoding, networking, and generally things that depend on user input.
I will say that most of the benefit is usually seen in the first tiny bit of fuzzing. There's a pretty strong diminishing returns as you continue to fuzz, at least that's what we've found.
As you can understand, I am very supportive of this. Traditional testing is just not enough for modern development speeds. I am ready to dedicate some time to working on parts of this.
Throwing some ideas onto the table:
To flesh out the interface, we don't need to implement coverage or any actual smartness. The interface should work if we just feed in completely random data; it will just be less useful. But I think it's the right first step. We can transparently add smartness later.
It would be nice to have a default location for the corpus, because it will make onboarding easier. The location probably needs to be overridable with a go test flag or a GOFUZZ env var.
I think it's a "must have" that the fuzz function runs during normal testing. If a corpus is present, each input from the corpus is executed once. Plus we can run N random inputs.
Thinking about how we can integrate continuous fuzzing into Go std lib testing (including corpus management) would be useful to ensure that it will also work for end users in their setups.
The go command (or whatever runs the fuzz function) might need some additional modes. For example, execute one given input, useful for crash debugging. Or run all programs from the corpus and dump a code coverage report.
I am ready to give up on f.Useful() and f.Discard() for simplicity (as far as I understand, those come from go-fuzz return values). They were never proven to be useful enough. For Discard, the Fuzz function can just return, and the fuzzer can try to figure out Useful automatically.
In some cases the Fuzz function needs more than just a []byte. For example, the regexp test needs the regular expression and a string to match. Other tests may need some additional ints and bools. It's possible to manually split the []byte into several strings and also take some bits as ints and bools, but it's quite inconvenient and can negatively affect fuzzing efficiency (the fuzzer can do better if it understands more about the input structure); see the sketch below. So we could consider allowing the Fuzz function to accept a set of inputs with some restrictions on types, e.g. FuzzFoo(f *testing.F, s1, s2 string, x int, b bool). But this can be added later as a backwards-compatible extension. Just something to keep in mind.
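To make the inconvenience concrete, here is roughly what the manual splitting looks like today; the bit-for-bool and NUL-separator choices are arbitrary, and fuzzRegexp is a hypothetical function under test:

package regexpfuzz

import "bytes"

// fuzzRegexp is a hypothetical stand-in for the code under test.
func fuzzRegexp(pattern, input string, posix bool) {}

// Fuzz manually decodes several typed inputs out of one []byte.
func Fuzz(data []byte) int {
	if len(data) < 1 {
		return 0
	}
	posix := data[0]&1 == 1 // one bit becomes a bool
	rest := data[1:]

	// Split the remainder into two strings at the first NUL byte.
	pattern, input := string(rest), ""
	if i := bytes.IndexByte(rest, 0); i >= 0 {
		pattern, input = string(rest[:i]), string(rest[i+1:])
	}

	fuzzRegexp(pattern, input, posix)
	return 0
}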
An alternative interface could be along the following lines:
func FuzzFoo(f *testing.F) {
	var data []byte
	f.GetRandomData(&data)
	// use data
}
GetRandomData must be called once and always with the same type.
Since the function now does not accept the additional argument, we can make it a normal test:
func TestFoo(t *testing.T) {
	var data []byte
	testing.GetRandomData(&data)
	// use data
}
This recalls the testing/quick interface considerably, so maybe we could just use testing/quick for this.
The go tool will need to figure out that this is a fuzzing function based on the call to testing.GetRandomData.
I will say that most of the benefit is usually seen in the first tiny bit of fuzzing. There's a pretty strong diminishing returns as you continue to fuzz, at least that's what we've found.
That's true to some degree, but not completely. It depends on (1) the complexity of your code, (2) the rate of change of your code, and (3) the smartness of the fuzzer engine. If your code is simple and doesn't change, then the fuzzer will find everything it can in minutes. However, if your code changes often, you want to run fuzzing continuously as regression testing. If your code is large and complex and the fuzzer is smart enough, it can manage to uncover complex bugs only after significant time. One example is this bug in the OpenSSL bignum asm implementation that we found after several CPU-years of fuzzing: https://github.com/google/fuzzer-test-suite/tree/master/openssl-1.1.0c Another example is our Linux kernel fuzzing, which has uncovered bugs at a roughly constant rate over more than a year (due to the complexity of the code and frequent changes): https://github.com/google/syzkaller/wiki/Found-Bugs
I'm fine with fuzzing, but the problem is that if you vendor in a library that fuzzes, then... you inherit all their corpus. So, I'm not a fan of corpus being checked into the project.
Case in point:
Overall, I think fuzzing is a must have. Glad to see a proposal to make it easier.
To second @dvyukov in https://github.com/golang/go/issues/19109#issuecomment-280315445, it would be really nice to have supported types other than []byte. We found bugs in both the gonum/blas implementation and the OpenBLAS library using fuzzing. It's possible to use go-fuzz, but it's kind of a pain to parse the []byte directly (https://github.com/btracey/blasfuzz/blob/master/onevec/idamax.go).
Suggest it goes under the subfolder testdata. Then any tools that ignore tests will also ignore this dir.
@dvyukov
I think it's a "must have" that the fuzz function runs during normal testing. If a corpus is present, each input from the corpus is executed once. Plus we can run N random inputs.
I have concerns about how much time this is going to add to testing. My experience with fuzzing is that compiling with the fuzz instrumentation takes a significant amount of time. I'm not sure this is something we want to inflict upon every use of go test.
@dsnet To execute the corpus and check that it doesn't fail, instrumentation isn't needed. Instrumentation is needed when you want to expand/improve the corpus.
Should there be a story to make it easy to use external fuzzing engines?
@Kubuxu, I'm comfortable with running the Fuzz functions as a form of test without special instrumentation, but Dmitry's comment suggested running with N random inputs, which implies having the instrumentation involved.
My 2c (I am utterly ignorant about Go, but have some ideas about fuzzing)
There are several major parts in coverage-guided fuzzing as I see it:
Instrumentation is best done in the compiler; that way it's the most efficient and the easiest to use. In LLVM we have these two kinds of instrumentation used for guided fuzzing: https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards (control flow feedback) and https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow (data flow feedback)
The interface must be as simple as possible. For C/C++ our interface (which we use with libFuzzer, AFL, honggfuzz, and a few others) is:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
	DoSomethingInterestingWithMyAPI(Data, Size);
	return 0; // Non-zero return values are reserved for future use.
}
and the only thing I regret is that the return type is not void. IMHO, for the first version of the interface for Go fuzzing, the go-fuzz approach is perfect:
func Fuzz(data []byte) int
(again, not confident about the int return value)
Fuzzing engines and the interface should be independent. It should be possible to plug any fuzzing engine (not necessarily written in Go) into a Go fuzz target. Such a fuzzing engine may need to understand the feedback provided by the instrumentation though. E.g. I'd love to try libFuzzer/AFL on the Go targets. And by fuzzing engine we should understand a wider class of tools, including e.g. concolic execution tools.
And it would be nice for the new fuzzing engine(s) to behave similarly to AFL, libFuzzer, and go-fuzz so that they are easier to integrate with continuous fuzzing services (e.g. oss-fuzz).
Should there be a story to make it easy to use external fuzzing engines?
Absolutely, see above.
it would be really nice to have supported types other than []byte.
Maybe. For really complex data structures our current answer is to use protobufs as the input: https://github.com/google/libprotobuf-mutator There is also a middle ground where you need to fuzz e.g. a pair of strings. But I would rather have a standard adapter from []byte into a pair of strings than complicate the interface.
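One possible shape for such an adapter, with a made-up two-byte length prefix as the convention:

package fuzzutil

import "encoding/binary"

// TwoStrings carves a pair of strings out of raw fuzzer bytes. The first
// two bytes are read as a little-endian length for the first string; the
// remainder becomes the second string. The encoding is arbitrary; the
// point is that the adapter, not each fuzz target, owns the convention.
func TwoStrings(data []byte) (a, b string) {
	if len(data) < 2 {
		return "", ""
	}
	n := int(binary.LittleEndian.Uint16(data))
	data = data[2:]
	if n > len(data) {
		n = len(data)
	}
	return string(data[:n]), string(data[n:])
}

A fuzz target would then start with a, b := fuzzutil.TwoStrings(data) and keep the plain []byte signature.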
Are there cases where a typical user would be expected to modify the corpus themselves?
The corpus is not a constant. It evolves as the code under test changes, as fuzzing techniques evolve, and simply as more CPU hours are spent fuzzing. We typically store a seed corpus in RCS, maintain a larger corpus on the fuzzing service, and periodically merge it back into RCS.
Note: a corpus stored in RCS allows performing regression testing (without fuzzing).
I also want to have cmd/cover built on compiler instrumentation to support branch coverage, but that's off-topic for this issue.
Not too much off-topic. This approach in LLVM allows us to get various kinds of coverage data with the same compiler instrumentation, by just re-implementing the callbacks.
I'm fine with fuzzing, but the problem is that if you vendor in a library that fuzzes, then... you inherit all their corpus. So, I'm not a fan of corpus being checked into the project.
This is a price worth paying since the corpus often turns out to be a great regression test.
I have concerns about how much time this is going to add to testing. My experience with fuzzing is that compiling with the fuzz instrumentation takes a significant amount of time. I'm not sure this is something we want to inflict upon every use of go test.
If you don't enable fuzzing instrumentation (which won't be on by default, I think) you won't pay for it.
A separate topic worth thinking about is fuzzing for equivalence between two implementations of the same protocol.
Imagine your code has func ReferenceFoo(data []byte) SomeType and func ExperimentalOptimizedFoo(data []byte) SomeType.
Then you can fuzz the following target to verify that the two implementations match:
func Fuzz(data []byte) int {
	if ReferenceFoo(data) != ExperimentalOptimizedFoo(data) {
		panic("ouch!")
	}
	return 0
}
This works pretty well when both things are implemented in Go. But imagine you are porting to Go something previously written in C. Here is a write up that describes one possible solution: https://moderncrypto.org/mail-archive/hacs/2017/000001.html (in short: have two processes running in parallel and exchanging data via shared memory or some such)
I love this.
And I think a good solution to the corpus location, like testdata/FuzzXxx/, would work well with the plain testing.T approach.
Projects that don't commit the corpus could use -fuzzcorpus (or similar) when fuzzing, and then copy the test cases they want to run every time into the testdata folder and check them in.
Actual fuzzing could be controlled by -fuzztime (like -benchtime).
Fuzzing is basically coverage-based randomized testing, and we already have a randomized testing package: testing/quick.
I think we shouldn't limit ourselves to just fuzzing a []byte; the fuzz function should take any type supported by testing/quick. I.e., if I'm fuzzing an integer sort routine, I should be able to write:
func FuzzIntSort(f *testing.F, input []int) { // ... }
Also, if the function takes a struct, this opens the door to using field tags (or even methods) to hint at or provide non-trivial data-validity requirements to the fuzz engine, in order to avoid having the engine discover the restrictions itself.
Limiting the input to []byte is more suited to protocol/parser fuzzing, but with proper Go tool support, we can do significantly better. Perhaps we can even merge this "coverage-based" part into testing/quick itself.
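For comparison, typed randomized testing of an integer sort routine with today's testing/quick looks like this (no coverage guidance, but the same typed-input shape):

package sortfuzz

import (
	"sort"
	"testing"
	"testing/quick"
)

// TestIntSortRandom feeds random []int inputs to sort.Ints and checks the
// postcondition, much like the proposed FuzzIntSort but without coverage
// feedback steering the inputs.
func TestIntSortRandom(t *testing.T) {
	sorted := func(input []int) bool {
		s := append([]int(nil), input...)
		sort.Ints(s)
		return sort.IntsAreSorted(s)
	}
	if err := quick.Check(sorted, nil); err != nil {
		t.Error(err)
	}
}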
Just some random (no pun intended) idea to think about.
@dsnet No, I meant just plain random, no instrumentation involved. Consider that you just wrote a Fuzz function (or checked out some code without a corpus); now you can do 'go test' and already get some results from the Fuzz function.
@kcc Re fuzzing for equivalence: testing Go vs C is simple with cgo; there is an example of testing regexp vs re2 in go-fuzz. Testing several Go packages against each other is also trivial, as there are no name clashes; just import both packages. So I don't think we need to do something special for this in the first version.
@minux Yes, this is fittable into testing/quick. I can't make up my mind yet as to whether we need to fit it into testing/quick or not.
On one hand, (1) quick gives a suboptimal interface, (2) we can't support all testing/quick tests (e.g. calling quick.Value twice per test, or with different types, is problematic), (3) we can't have any fuzzing-specific user API since these are just normal tests.
But on the other hand, testing.TB seems to provide everything we need (with t.Skip as "discard this input"). So it would be nice to make them just normal tests with no new APIs.
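A rough sketch of what such a plain test could look like, assuming a testdata/fuzz corpus directory and a hypothetical parse function under test:

package parser

import (
	"os"
	"path/filepath"
	"testing"
)

// parse is a hypothetical stand-in for the code under test.
func parse(data []byte) error { return nil }

// TestFuzzCorpus replays every stored corpus file as an ordinary subtest,
// using t.Skip in place of an explicit "discard this input" API.
func TestFuzzCorpus(t *testing.T) {
	files, err := filepath.Glob(filepath.Join("testdata", "fuzz", "*"))
	if err != nil {
		t.Fatal(err)
	}
	for _, name := range files {
		t.Run(filepath.Base(name), func(t *testing.T) {
			data, err := os.ReadFile(name)
			if err != nil {
				t.Fatal(err)
			}
			if err := parse(data); err != nil {
				t.Skip("input does not parse; discard")
			}
			// Further checks on the parsed result would go here.
		})
	}
}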
Any thoughts?
I like the idea a lot. I caught notice of a presentation by Dmitry (perhaps in the Go Newsletter?) some time ago and watched it and was as amazed and enthusiastic as the audience. I stick pretty close to the standard library though, mostly out of proximity and familiarity and wanting to keep things simple, and only venture out when forced by a clear benefit. All that is to say if fuzzing is as useful as it looked to be and it were in the standard library, I feel it would likely reach many more users of Go.
With the current go-fuzz I wish I could:
So integrating fuzzing in *_test.go sources and giving "go test" the options to control them would be great, at least for 1. and 2. !
Note also that fuzzing is currently mostly used to find edge cases that crash programs. But it could also be used to find worst-case scenarios (speed, memory usage) in algorithms, which could then be used for benchmarking. So far we can share code between testing and benchmarking. Fuzzing support should be added alongside testing and benchmarking to benefit the trio, not just testing.
There is still not much technical detail here about what is being proposed. It's hard to evaluate without concrete details of what is wanted. Especially with all the details about extra directories full of input corpus files, this starts to seem like something heavyweight.
The open questions seem to be:
Note that even a third-party tool could look for func FuzzFoo([]byte) functions in *_test.go files. That would allow someone to prototype all of this outside the standard distribution.
Still waiting for answers to the open questions.
My suggestion for moving forward would be to change go-fuzz to be - as much as possible - exactly what you want the new go test fuzz mode to be. That means that go-fuzz image/png and cd image/png; go-fuzz need to work sensibly, with no arguments or other preparation (currently go-fuzz needs a lot more than that).
The go command already ignores testdata directories, so go-fuzz could assume corpus data in testdata/FuzzName.fuzz or testdata/fuzz/FuzzName.corpus or anything like that. I'm still worried about the corpus data overwhelming the source code repos, but we would find out.
The go command also ignores functions with names other than TestName and BenchmarkName, so the fuzz functions can be written in the test files and will be ignored by the standard tools. I'd suggest that go-fuzz should expect func FuzzFoo(...).
If you want to allow the fuzzed function to take a *testing.F for error reporting in the long term, you could start with using testing.TB instead. Note that you can implement a testing.TB even though it has unexported methods, like this:
type fuzzState struct {
	... my state ...
	testing.TB
}
func (f *fuzzState) Error(args ...interface{}) { ... }
... etc implementing all the exported methods of testing.TB ...
This way, a *fuzzState implements testing.TB and can be passed to the FuzzFoo functions, the FuzzFoo functions don't need to import anything but "testing" to declare their arguments. Neither can get at the unexported methods, so the fact that they panic (because they invoke methods of the nil embedded TB in fuzzState) doesn't matter.
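Spelled out a little more, such a type could look like the following sketch (only a few methods shown; the remaining exported methods of testing.TB would follow the same pattern):

package fuzzer

import (
	"fmt"
	"testing"
)

// fuzzState satisfies testing.TB by embedding it (covering the unexported
// method) while overriding the exported methods it actually supports.
type fuzzState struct {
	testing.TB // always nil; present only to satisfy the interface
	failed     bool
	logs       []string
}

var _ testing.TB = (*fuzzState)(nil)

func (f *fuzzState) Log(args ...interface{}) {
	f.logs = append(f.logs, fmt.Sprintln(args...))
}

func (f *fuzzState) Logf(format string, args ...interface{}) {
	f.logs = append(f.logs, fmt.Sprintf(format, args...))
}

func (f *fuzzState) Error(args ...interface{}) {
	f.Log(args...)
	f.Fail()
}

func (f *fuzzState) Errorf(format string, args ...interface{}) {
	f.Logf(format, args...)
	f.Fail()
}

func (f *fuzzState) Fail()        { f.failed = true }
func (f *fuzzState) Failed() bool { return f.failed }

// Fatal, Fatalf, FailNow, Skip, and the rest would be implemented the same
// way; any exported method left unimplemented falls through to the nil
// embedded TB and panics, as described above.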
Here is a full-fledged proposal: https://docs.google.com/document/u/1/d/1zXR-TFL3BfnceEAWytV8bnzB2Tfp6EPFinWVJ5V4QC8/pub
This is pretty small, but instead of FuzzXxx(f *testing.F, data []byte), just FuzzXxx(f *testing.F) would be nice, and the data could come from a method like (*testing.F).Data() []byte or some such. It would be more consistent with testing and bench, and I also support calling a method to acquire the data, which would allow the fuzzer to be configured before generating data.
If you do it that way it's also really easy to add support for more types later, F.Int() and such.
@omeid @DavidVorick One thing that I wanted to allow is future extension to:
func FuzzRegexp(f *testing.F, re string, data []byte, posix bool) { ... }
and:
func FuzzIntSort(f *testing.F, input []int) { ... }
Another aspect is that the fuzzer needs a stable structure of the input. We can't allow calling f.Data() on one invocation and f.Int() on another, or calling multiple of them depending on some dynamic conditions in the fuzz function.
One alternative that was mentioned in this issue is:
func TestFoo(t *testing.T) {
	var data someStruct
	testing.GetRandomData(&data)
	// use data
}
This can work (provided that the user is required to call GetRandomData once and only once, and always with the same type). But I don't see why this alternative is considerably better than what I described in the proposal. The proposed approach is more concise and intuitive, I think.
One question: if the signature contains []byte, how is its length determined?
Scratch that, I just reminded myself how fuzzing currently works with go-fuzz: it will try longer and longer inputs if it gets a failure for shorter ones.
Is there some reason this can't be prototyped outside the main repo?
It seems like the best way forward would be to migrate go-fuzz to be as close as possible to the intended command line for the standard library. For example, right now you have to do go-fuzz-build and then go-fuzz. It should be go-fuzz [options] [package], just like go test: one step. Also it can add support for non-[]byte fuzzing.
We can even try adding Fuzz functions recognized by go-fuzz into the _test.go files in the main repo, along with testdata/fuzz subdirectories for corpus. Starting to do that will make clear exactly how heavy-weight this might turn out to be. (We don't want to bloat the main repo if this turns out to be a lot of files or bytes.)
I understand we can't do *testing.F this way, but testing.TB is a decent start I think (it gets you logging at least). See my comment above for how to make that work.
@dvyukov I like your proposal. Some points need clarification:
- -fuzzdir value: system filepath (absolute or relative) or package path? dir implies the former.
- -fuzzinput value: filename (relative to the corpus directory) or filepath or raw input?
Also I think that some parameters to limit the global fuzzing duration could be useful when running fuzzing in a continuous integration build: specify to run the fuzzing function at most n times (go test has -count) or for this duration (-fuzzduration?), and giving multiple such limits would make fuzzing stop when the first limit is reached.
We should also consider the exit code returned by go test -fuzz when limits have been reached. What about returning a special exit code if the corpus has grown (as this requires user attention)?
Update: about -fuzzduration, I see that @FiloSottile suggested -fuzztime above.
@dvyukov There is a single point in your proposal with which I disagree. It is in this paragraph:
go test runs fuzz functions as unit tests. Fuzz functions are selected with the -run flag on par with tests (i.e. all by default). Fuzz functions are executed on all inputs from the corpus and on some amount of newly generated inputs (for 1s in normal mode and for 0.1s in short mode). For that matter, the -fuzzdir flag can be specified without the -fuzz flag.
I think that running Fuzz functions during go test using the existing corpus would be a great feature and would be a good reason to integrate fuzzing into go test. But I disagree with the idea of fuzzing when the -fuzz flag is not given. go test should keep its deterministic result.
@rsc go-fuzz-build is currently quite slow, especially because package instrumentation is not cached (there is no go install for go-fuzz). And it is convenient to be able to stop fuzzing and restart later without enduring the build step again.
Here are some results from one $work project:
$ time go build
real 0m1.798s
user 0m2.200s
sys 0m0.140s
$ time go-fuzz-build $(go list)
real 1m13.105s
user 1m20.688s
sys 0m19.376s
So integrating go-fuzz-build with go-fuzz would not yet be a good step forward.
Also, go-fuzz coverage instrumentation currently doesn't work with cgo builds, which happen even when using some functions from the stdlib. This might be solvable by exposing the coverage engine used by go test -cover to go-fuzz.
@Kubuxu Note that the coverage engine of go test -cover is line-based. This is not precise enough for serious fuzzing, as the various branches on a single source line are not counted separately. However, I don't know if go-fuzz instrumentation is different.
@dolmen
-fuzzdir value: system filepath (absolute or relative) or package path? dir implies the former.
It says directory, it's the same phrasing that all other go command flags use. How do you propose to change it?
-fuzzinput value: filename (relative to the corpus directory) or filepath or raw input?
Added "Flag value specifies path to a file with the input".
Also I think that some parameters to limit the global fuzzing duration could be useful when running fuzzing in a continuous integration build: specify to run the fuzzing function at most n times (go test has -count) or for this duration (-fuzzduration?), and giving multiple such limits would make fuzzing stop when the first limit is reached.
What scenario do you have in mind? We did not hit the need for such functionality in any of our setups.
We should also consider the exit code returned by go test -fuzz when limits have been reached. What about returning a special exit code if the corpus has grown (as this requires user attention)?
What is the scenario/workflow? I just don't want to over-engineer it: it's easy to add features later, impossible to remove them. The set of things it may need to communicate can be large (new inputs, new crashes, no coverage at all, other errors). And in most cases the process is killed, so it does not have a chance to communicate an exit status.
@dolmen
I think that running Fuzz functions during go test using the existing corpus would be a great feature and would be a good reason to integrate fuzzing into go test. But I disagree with the idea of fuzzing when the -fuzz flag is not given. go test should keep its deterministic result.
Good point. What do others think? I am open to discussion. The situation I am afraid of is that if it's difficult to enable (and in some cases "difficult" means adding one flag, because e.g. people may not know about it), then it will be underused. Lots of people also explicitly expressed interest in using the corpus as the base of regression tests and running it on every 'go test' invocation. We have the option of running or not running random inputs on top of the corpus, and maybe changing behavior based on the presence of the -short flag.
Filing a proposal on behalf of @kcc and @dvyukov:
They request that cmd/go support fuzzing natively, just like it does tests and benchmarks and race detection today.
https://github.com/dvyukov/go-fuzz exists but it's not as easy as writing tests and benchmarks and running "go test -race" today.
Should we make this easier?