Closed wolf0403 closed 6 months ago
This does seem like it would catch real problems if done well. /cc @adonovan
Actually, note that those captures - at least when they are definite bugs - should usually show up in the race detector. It's not clear vet can contribute more than the race detector does without significant false positives.
Although this bug pattern most often manifests with go
statements (leading to data races), sequential (non-racy) instances are possible too, such as with defer
or the simple example presented in this Issue.
vet already checks for patterns of the form:
for k, v := range seq {
go/defer func() {
... k, v ...
}()
}
The example presented in this Issue is more challenging to analyze because it requires proving that the function is not called within the loop, or at least failing to prove that it is called within the loop. Once the anonymous function has been stored in a data structure or passed to another function, vet can no longer precisely determine when it might be called.
In other words, I think the current vet check is probably as good as we can do without interprocedural analysis.
It seems to me like we're late to add a new vet check in 1.11, given that the tree is frozen.
Is anyone intending to work on this for 1.12?
Does anyone even know how to solve this problem? I don't.
I'm labelling as NeedsInvestigation, just to clarify that we're not sure that a fix is possible.
See also #20733, which would eliminate this requirement by defining the problem away.
Change https://golang.org/cl/164119 mentions this issue: cmd/link: do not close over the loop variable in testDWARF
How about we do this for the simple t.Parallel
case as a start? It happens often (just this week I've spotted two manually during code reviews at work), and I think a simple and conservative algorithm like the one in #18106 could have near-zero false positives.
I'd like to document a possible false positive with the algorithm as described in #18106:
for _, v := range x {
t1.Run("", func(t2 *testing.T) {
t2.Run("", func(t3 *testing.T) {
t3.Parallel()
println(v)
})
})
}
t1.Run will wait for t2 and t2's subtests to complete, so even though t3 is a parallel test and references a range variable, everything works correctly, as t3 only runs parallel to t2, not the loop. This would not be the case if we also called t2.Parallel.
Here's another example that go vet
does not catch. If you have nested loops and use a variable from the outer loop in a closure, go vet does not return the expected loop variable i captured by func literal
error:
https://play.golang.org/p/l3830Ij64qg
It might be nice to have go vet
optionally return an error anytime you define an anonymous func inside of a loop to prevent any closure bugs.
vet could easily be made a little smarter to catch @dmowcomber's example. The checker looks only at the last statement in the loop body because it can't prove that any later statements don't wait, but we could define "last" recursively: if the last statement is an if
statement, we could look at the last statements in both of its branches; or if the last statement is a nested loop, then we could look at the last statement of that loop's body; and so on for select, switch, etc.
Change https://golang.org/cl/184537 mentions this issue: go/analysis/passes/loopclosure: inspect inner loops, if stmts, etc
@alandonovan We have a version of the loopclosure check within Monzo that uses a bunch of heuristics to avoid false positives and has good coverage of most bugs. Across our codebase it has 2 false positives and detected more than 50 bugs that hadn't already been discovered.
We are comfortable that the safety this provides outweighs the very occasional false positive but I don't think it's accurate enough to meet the go vet bar. I'm keen to contribute something to go vet so am looking at exploring a modified version of the check that is targeted at:
t.Run
and t.Parallel
usage(*errgroup.Group).Go
alongside defer
and go
as unsafe uses of function literals at the end of a loopDoes this sound potentially reasonable? If so, I'll draft a change.
@alandonovan We have a version of the loopclosure check within Monzo that uses a bunch of heuristics to avoid false positives and has good coverage of most bugs. Across our codebase it has 2 false positives and detected more than 50 bugs that hadn't already been discovered.
We are comfortable that the safety this provides outweighs the very occasional false positive but I don't think it's accurate enough to meet the go vet bar. I'm keen to contribute something to go vet so am looking at exploring a modified version of the check that is targeted at:
improper
t.Run
andt.Parallel
usageadds
(*errgroup.Group).Go
alongsidedefer
andgo
as unsafe uses of function literals at the end of a loopDoes this sound potentially reasonable? If so, I'll draft a change.
@danielchatfield would you consider open sourcing this as a standalone linter? Possibly something that could be added to golangci-lint?
adds (*errgroup.Group).Go alongside defer and go as unsafe uses of function literals at the end of a loop
Is now supported in loopclosure.
improper t.Run and t.Parallel usage
This is something we have been considering adding support for in vet. This would most likely need to follow roughly the same pattern as loopclosure
does. Someone from the community could contribute this.
We are currently in 1.18 code freeze, but this may be able to make it into 1.19.
Fwiw the specific case that led me to this issue was that my loop contained an if statement as the last statement, and my goroutine(s) we're the last statements in each if branch.
@glenjamin Can you provide or point to a [possibly simplified] example? (This checker is quite sensitive to syntax to keep false positives down.)
@timothy-king sure - The body of the loop here is almost identical to my real code (I deleted a couple of lines of logging from the top of the loop body)
This compiles and passes vet, but is incorrect.
package main
import (
"context"
"sync"
"golang.org/x/sync/errgroup"
)
func main() {
var nodes []interface{}
ctx := context.Background()
critical, ctx := errgroup.WithContext(ctx)
others := sync.WaitGroup{}
for i, node := range nodes {
// i, node := i, node
if IsCritical(node) {
critical.Go(func() error {
return run(ctx, i, node)
})
} else {
others.Add(1)
go func() {
// We don't care about errors in non-critical nodes
_ = run(ctx, i, node)
others.Done()
}()
}
}
}
func IsCritical(node interface{}) bool {
return false
}
func run(ctx context.Context, i int, node interface{}) error {
return nil
}
@glenjamin That specific example is supportable in loopclosure
. This would require extending the logic of looking for the "last" statement within the RangeStmt to include statements that unconditionally lead to the end of the RangeStmt if they are executed. If branches fit into this. Both branches do not need to match. Just one.
IMO the justification is fairly solid. We can rework the above example into something I expect loopclosure
to warn on:
for i, node := range nodes {
// i, node := i, node
if IsCritical(node) {
critical.Go(func() error {
return run(ctx, i, node)
})
continue // minor refactor
}
others.Add(1)
go func() { // now the last statement
// We don't care about errors in non-critical nodes
_ = run(ctx, i, node)
others.Done()
}()
}
Again we welcome community contributions on this (and we are currently in the 1.18 freeze).
Actually, note that those captures - at least when they are definite bugs - should usually show up in the race detector.
Unfortunately @rsc , they don't, in a typical buggy table test case:
func TestParallel(t *testing.T) {
tests := []struct {name string}{{name: "case1"}, {name: "case2"}}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
t.Parallel()
fmt.Println(tt.name)
})
}
}
This buggy output and no warning from race detector:
$ go test -race -v .
=== RUN TestParallel
=== RUN TestParallel/case1
=== PAUSE TestParallel/case1
=== RUN TestParallel/case2
=== PAUSE TestParallel/case2
=== CONT TestParallel/case2
case2
=== CONT TestParallel/case1
case2
--- PASS: TestParallel (0.00s)
--- PASS: TestParallel/case2 (0.00s)
--- PASS: TestParallel/case1 (0.00s)
PASS
@porridge they would, if you run subtests in parallel.
@invidian do they for you?
Adding -test.parallel=3
to that command line does not help on my end.
Ah, I misread and haven't noticed t.Parallel()
. Race detector does not trigger indeed. Sorry for the noise.
We have a version of the loopclosure check within Monzo that uses a bunch of heuristics to avoid false positives and has good coverage of most bugs.
@danielchatfield any updates on making this available to the community?
@invidian, the t.Parallel
issue in general is #35670. (It masks other kinds of races too, not just races on loop variables.)
See #56010 for a different approach.
Change https://go.dev/cl/452155 mentions this issue: go/analysis/passes/loopclosure: recursively check last statements in statements like if, switch, and for
Given the interest in possibly redefining for loop variable semantics in #56010, I thought it might be useful to try to complete the sketch for extending the loopclosure checker that @adonovan suggested above a few years ago in https://github.com/golang/go/issues/16520#issuecomment-507426574.
Rather than the checker examining the last statement in a loop body, the "last" statement examined by the checker becomes defined recursively (e.g., if the last statement in the loop body is a switch statement, then the analyzer examines the last statements in each of the switch cases).
I sent CL https://go.dev/cl/452155, which does that, and now properly flags the example above from @dmowcomber.
One question @timothy-king raised in the Gerrit discussion is whether or not that change would need to go through the proposal process. Are there any quick opinions on that?
I'd be happy to open a proposal if that's the recommendation.
(I will likely open a separate proposal for some possible follow-on approaches, but my more immediate question is if CL 452155 should have a proposal).
Hi @glenjamin, FYI, https://go.dev/cl/452155 also seems to properly flag your example from above as well, and stops complaining if you uncomment the i, node := i, node
line:
func main() {
var nodes []interface{}
ctx := context.Background()
critical, ctx := errgroup.WithContext(ctx)
others := sync.WaitGroup{}
for i, node := range nodes {
// i, node := i, node
if IsCritical(node) {
critical.Go(func() error {
return run(ctx, i, node) // vet now warns i and node captured by func literal
})
} else {
others.Add(1)
go func() {
// We don't care about errors in non-critical nodes
_ = run(ctx, i, node) // vet now warns i and node captured by func literal
others.Done()
}()
}
}
}
func IsCritical(node interface{}) bool { return false }
func run(ctx context.Context, i int, node interface{}) error { return nil }
Given that this is improving the precision of an existing vet check, without introducing any new false positives, I don't believe it needs to go through the proposal process.
Change https://go.dev/cl/455195 mentions this issue: go/analysis/passes/loopclosure: allow certain well-understood statements to trail problematic go and defer statements
Is this no longer relevant with go1.22 / should be closed? @adonovan
The long-standing “for” loop gotcha with accidental sharing of loop variables between iterations is now resolved.
Yep. It is fine to close this with the 1.22 lifetime rules.
Please answer these questions before submitting your issue. Thanks!
go version
)? go version go1.6 darwin/amd64go env
)? GOARCH="amd64" GOBIN="" GOEXE="" GOHOSTARCH="amd64" GOHOSTOS="darwin" GOOS="darwin" GOPATH="/Users/ryangao/go" GORACE="" GOROOT="/Users/ryangao/homebrew/Cellar/go/1.6/libexec" GOTOOLDIR="/Users/ryangao/homebrew/Cellar/go/1.6/libexec/pkg/tool/darwin_amd64" GO15VENDOREXPERIMENT="1" CC="clang" GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fno-common" CXX="clang++" CGO_ENABLED="1"