golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.31k stars 17.7k forks source link

proposal: cmd/vet: add check for sync.WaitGroup abuse #18022

Open dsnet opened 8 years ago

dsnet commented 8 years ago

The API for WaitGroup is very easy for people to misuse. The documentation for Add says:

calls to Add should execute before the statement creating the goroutine or other event to be waited for.

However, it is very common to see this incorrect pattern:

var wg sync.WaitGroup
defer wg.Wait()

go func() {
    wg.Add(1)
    defer wg.Done()
}()

This usage is fundamentally racy and probably does not do what the user wanted. Worse yet, is that it is not detectable by the race detector.

Since the above pattern is common, I propose that we add a method Go that essentially does the Add(1) and subsequent call to Done in the correct way. That is:

func (wg *WaitGroup) Go(f func()) {
    wg.Add(1)
    go func() {
        defer wg.Done()
        f()
    }()
}
cznic commented 8 years ago

I don't like the idea. It's IMHO perhaps a task for go vet, if not implemented already. Also, the proposed method would add a(nother) closure layer when the go-ed function has parameters.

minux commented 8 years ago

I think this has been proposed before (probably on the mailing lists) and rejected.

dsnet commented 8 years ago

A go vet check seems pretty reasonable. I just tried it right now on the following:

func main() {
    var wg sync.WaitGroup
    defer wg.Wait()
    go func() {
        wg.Add(1)
        defer wg.Done()
    }()
}

and vet doesn't report anything.

In terms of vet's requirements:

@dominikh , staticcheck does report a problem: main.go:10:3: should call wg.Add(1) before starting the goroutine to avoid a race (SA2000) I'm wondering how accurate the check is and whether it is worth adding to vet.

mvdan commented 8 years ago

As far as the proposal goes, I don't like how it limits the func signature to func().

dominikh commented 8 years ago

@dsnet The check in staticcheck has no (known) false positives. It shouldn't have a significant number of false negatives, either. The implementation is a simple pattern-based check, detecting go with function literals where the first statement is a call to wg.Add – this avoids flagging wg.Add calls further down the goroutine, which tend to be valid uses.

I'm -1 on the proposed Go function. I'd prefer not having to read code with an unnecessary level of nesting that looks callback-esque and reminds me of JavaScript.

mvdan commented 8 years ago

@dominikh I don't see any extra level of nesting here, though (assuming any func signature is allowed).

To be nitpicky, another thing that stands out from the proposal is how wg.Go() will create a goroutine even though go is never directly used by the user. I don't know if the standard library does this anywhere else, but I would prefer if it was left explicit.

dominikh commented 8 years ago

@mvdan The extra level of nesting would come from a predicted usage that looks something like this:

wg.Go(func() {
  // do stuff
})

as opposed to

go func() {
  // do stuff
}()

Admittedly the same level of indentation, but syntactically it's one extra level of nesting.

mvdan commented 8 years ago

Ah yes, I was thinking indentation there.

cznic commented 8 years ago

The problematic case is that

wg.Add(1)
go func(i int) { ... }(42)

// becomes

wg.Go(func() {
        go func(i int) { ... }(42)
})
mvdan commented 8 years ago

@cznic if you mean without the extra go, this would be solved if the restriction on the func() signature was removed.

dominikh commented 8 years ago

@mvdan Do you mean by allowing something like the following?

wg.Go(func(x, y int) { ...}, v1, v2)

IMHO that's way too much interface{} and not enough type safety.

mattn commented 8 years ago

panic is recovered?

mvdan commented 8 years ago

@dominikh true; I was simply pointing at the issue without contemplating a solution :)

rsc commented 8 years ago

The API change here has the problems identified above with argument evaluation. Also, in general the libraries do not typically phrase functionality in terms of callbacks. If we're going to start using callbacks broadly, that should be a separate decision (and not one to make today). For both these reasons, it seems like .Go is not a clear win.

It would be nice to have a vet check that we trust (no false positives). Perhaps it is enough to look for two statements

wg.Add(1)
defer wg.Done()

back to back and reject that always. Thoughts about how to make vet catch this reliably?

renannprado commented 7 years ago

I agree that vet is better place for this. The proposed API reminds of JavaScript, which will force us many times to wrap the code or function within a function with no arguments, while you could have just go func().... Still nothing can stop you from creating such a helper methid, even though I don't see the need for it.

rsc commented 7 years ago

It sounds like we are deciding to make go vet check this and not add new API here. Any arguments against that?

dsnet commented 7 years ago

SGTM

rsc commented 4 years ago

I've added this proposal to the proposal process bin, but it's blocked on someone figuring out how to implement a useful check. Is anyone interested in doing that?

dominikh commented 4 years ago

Staticcheck has a fairly trivial check: for a GoStmt of a FuncLit, if the first statement in the FuncLit is a call to (*sync.WaitGroup).Add, we flag it. That has potential for false positives, but none have been reported in all the years that the check has existed.

The check could be trivially hardened by

  1. looking for an immediately following defer of (*sync.WaitGroup).Done and comparing the two receivers.
  2. checking that the argument to Add is 1, not some other number.

Edit: which is pretty much what you have suggested in https://github.com/golang/go/issues/18022#issuecomment-263397395

rsc commented 4 years ago

Thanks for the info @dominikh. Would it be OK with you for vet to do the same thing? Would you want to send the code?

rsc commented 4 years ago

Ping @dominikh. Thanks.

dominikh commented 4 years ago

Sorry @rsc, I missed your earlier comment. Sure, that's fine by me. I can send code if it helps, but I'd have to write it from scratch – the check in staticcheck makes use of staticcheck internals.

rsc commented 4 years ago

@dominikh sure, if you feel like sending code please do. Otherwise anyone else who wants to is welcome. I just want to make sure we're not overstepping versus staticcheck (which I'm a big fan of).

rsc commented 4 years ago

On hold for details of the eventual implemented check.

adonovan commented 1 week ago

I ran staticcheck's SA2000 analyzer across the module mirror corpus and found 57 matches (see attached file). A random 25 are shown here. All (!!) are true positives.

https://go-mod-viewer.appspot.com/github.com/kyma-project/kyma/components/asset-store-controller-manager@v0.0.0-20191203152857-3792b5df17c5/internal/controllers/suite_test.go#L38: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L250: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L320: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/Go-To-Byte/DouSheng/user_center@v0.0.0-20230524130918-ad531c1a3f6a/apps/user/impl/impl.go#L88: should call wait.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/go.dedis.ch/cothority/v3@v3.4.9/eventlog/api.go#L257: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/mholt/caddy-l4@v0.0.0-20241104153248-ec8fae209322/modules/l4proxyprotocol/matcher_test.go#L38: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/mholt/caddy-l4@v0.0.0-20241104153248-ec8fae209322/modules/l4proxyprotocol/handler_test.go#L68: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/mholt/caddy-l4@v0.0.0-20241104153248-ec8fae209322/modules/l4proxyprotocol/matcher_test.go#L64: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L62: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/mholt/caddy-l4@v0.0.0-20241104153248-ec8fae209322/modules/l4proxyprotocol/matcher_test.go#L90: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/jtzjtz/kit@v1.0.2/conn/rabbitmq_pool/rabbitmq_pool.go#L121: should call S.wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/deso-protocol/core@v1.2.9/lib/txindex.go#L158: should call txi.updateWaitGroup.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/portworx/kvdb@v0.0.0-20241107215734-a185a966f535/test/kv.go#L537: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/abiosoft/semaphore@v0.0.0-20240818083615-bc6b5b10c137/semaphore_test.go#L21: should call g.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L266: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/Go-To-Byte/DouSheng/user_center@v0.0.0-20230524130918-ad531c1a3f6a/apps/user/impl/impl.go#L78: should call wait.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/lastbackend/toolkit@v0.0.0-20241020043710-cafa37b95aad/pkg/server/grpc/grpc.go#L258: should call g.wait.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/bluesky-social/indigo@v0.0.0-20241119181532-966c093275b7/cmd/sonar/main.go#L171: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/doc_test.go#L48: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/msales/pkg/v3@v3.24.0/clix/utils_test.go#L46: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L329: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L222: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L70: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/doc_test.go#L56: should call wg.Add(1) before starting the goroutine to avoid a race https://go-mod-viewer.appspot.com/github.com/zhufuyi/sponge@v1.10.3/pkg/ws/server_test.go#L88: should call wg.Add(1) before starting the goroutine to avoid a race

waitgroup.txt

timothy-king commented 1 week ago

https://go-mod-viewer.appspot.com/github.com/kyma-project/kyma/components/asset-store-controller-manager@v0.0.0-20191203152857-3792b5df17c5/internal/controllers/suite_test.go#L38

    33  // StartTestManager adds recFn
    34  func StartTestManager(mgr manager.Manager, g *GomegaWithT) (chan struct{}, *sync.WaitGroup) {
    ...
    37      go func() {
    38          wg.Add(1)
    ...
    41      }()
    42      return stop, wg
    43  }

Do we need to establish that wg.Wait() is called to issue a diagnostic? That would be what this "races" with. We possibly could just always issue a report when the first statement in the function is wg.Add(*). That would have some potential issues with very obscure cases (like wrapping wg.Wait() on its own goroutine and blocking progress through additional channels), but is probably pretty accurate.

adonovan commented 1 week ago

Do we need to establish that wg.Wait() is called to issue a diagnostic?

No. If someone is using WaitGroup without calling Wait, they have bigger problems.