etcd-io / raft

Raft library for maintaining a replicated state machine
Apache License 2.0
666 stars 164 forks source link

fuzzing: add raft fuzzer from cncf-fuzzing #55

Open AdamKorcz opened 1 year ago

AdamKorcz commented 1 year ago

Adds the fuzzer from https://github.com/etcd-io/etcd/pull/14674.

It is the same fuzzer but added here since the raft dir no longer exists in etcd core.

AdamKorcz commented 1 year ago

I read all comments in prior PR, seems like everything is resolved except dependency concerns. Can you please confirm that everything is addressed?

I should say so, since the deps that the previous PR bumped are added in this PR and not bumped. The issue with the deps in the last PR was related to bumping existing libraries.

serathius commented 1 year ago

Can you add a Github workflow for running the fuzzer each PR? Example https://github.com/etcd-io/etcd/blob/b1e14c7d0aa596e0337c5165025bddca344d1a0a/.github/workflows/fuzzing.yaml

AdamKorcz commented 1 year ago

@serathius I have minimized the dependencies. Could you have a look?

serathius commented 1 year ago

cc @tbg @ahrtr @pavelkalinnikov who are more familiar with raft repository.

AdamKorcz commented 1 year ago

The dependencies seem good - but now we're using what looks a lot like a fork that might one day accidentally go away, at which point the module would no longer build unless dependencies were cached. This isn't as much a problem for the repo - we can just rip out the dependency at that point since it's always going to be a testing-only one - but it would be a significant annoyance for developers who are working with an older (and then defunct) version of the library.

It's difficult for me to make a call here. If the code were inlined into the raft repo (licenses permitting), i.e. "just" put into fuzz_test.go or a util package, this problem would go away.

I am currently working on a release in my fork that will get merged upstream with 1-2 weeks. Could we use the upstream repo with a replace directive in raft's go.mod?

tbg commented 1 year ago

I am currently working on a release in my fork that will get merged upstream with 1-2 weeks. Could we use the upstream repo with a replace directive in raft's go.mod?

I don't want to hold up the train, but if it's only for 1-2 weeks, does it make sense to wait? We can pull upstream in directly and merge this PR then.

I also understand the desire to just get it in, and make the upstream adjustment separately. That's fine by me too, but I'd want @serathius or one of the other maintainers to support it as well.

serathius commented 1 year ago

I would wait until the change is properly upstreamed, I don't understand why after almost 3 quarters of fuzzer development (context) we decide to cut corners now.

cc @jeefy

I would even go further and want to avoid any dependency. I don't understand why we need whole library with random dependencies just for one struct that has less than 500 lines (code in consumers.go, half of it is for unrelated tar handling). why we can't just copy the one necessary struct and be done with it.

AdamKorcz commented 1 year ago

I would wait until the change is properly upstreamed, I don't understand why after almost 3 quarters of fuzzer development (context) we decide to cut corners now.

cc @jeefy

I would even go further and want to avoid any dependency. I don't understand why we need whole library with random dependencies just for one struct that has less than 500 lines (code in consumers.go, half of it is for unrelated tar handling). why we can't just copy the one necessary struct and be done with it.

Maintenance and duplication are the main downsides I can see from adding the utilities from go-fuzz-headers directly in the raft repo. Wrt maintenance: etcd will manually have to copy and paste the code over every time there are upstream changes. If the two versions divert over time, this will be tedious to maintain. Wrt duplication: If we should copy the utilities over to raft, then I assume we should do the same for main etcd, and then there are 2 versions to maintain. The duplication issue could naturally be solved if etcd main uses the fuzzing utilities from raft or vice versa.

I think we can circumvent the GenerateStruct api in general by using an Unmarshal routine instead. I have found this to be less efficient, but it eliminates the need for any fuzzing utilities.

The fuzzers from cncf-fuzzing use go-fuzz-headers extensively; it would be good to know what the road forward should be regarding introducing a dependency solely used for fuzzing - which is not a foreign thing to Kubernetes.

AdamKorcz commented 1 year ago

Changed to the upstream go-fuzz-headers. Please review again.

serathius commented 1 year ago

Don't have strong objections about adding a dependency just on github.com/AdaLogics/go-fuzz-headers.

What do you think? @tbg @ahrtr @dims

dims commented 1 year ago

+1 @serathius