golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.8k stars 17.51k forks source link

runtime: confusing panic on parallel calls to yield function #68897

Open hauntedness opened 3 weeks ago

hauntedness commented 3 weeks ago

Go version

go1.23.0-windows-amd64

Output of go env in your module/workspace:

set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\hauntedness\AppData\Local\go-build
set GOENV=C:\Users\hauntedness\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=rangefunc
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\hauntedness\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\hauntedness\go
set GOPRIVATE=
set GOPROXY=https://goproxy.cn,direct
set GOROOT=C:\Program Files\Go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLCHAIN=auto
set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.23.0
set GODEBUG=
set GOTELEMETRY=on
set GOTELEMETRYDIR=C:\Users\hauntedness\AppData\Roaming\go\telemetry
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=D:\temp\dot\go.mod
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=C:\Users\HAUNTE~1\AppData\Local\Temp\go-build2719328347=/tmp/go-build -gno-record-gcc-switches

What did you do?

package main

import (
    "fmt"
    "sync"
)

func main() {
    for i := range WaitAll(10) {
        fmt.Println(i)
    }
}

func WaitAll(total int) func(yield func(i int) bool) {
    return func(yield func(i int) bool) {
        wg := &sync.WaitGroup{}
        wg.Add(total)
        for i := range total {
            go func(j int) {
                defer wg.Done()
                if !yield(j) {
                    panic("Shouldn't be broken.")
                }
            }(i)
        }
        wg.Wait()
    }
}

See https://go.dev/play/p/_X0iHMy6lH1

What did you see happen?

================== panic: runtime error: range function continued iteration after loop body panic

goroutine 21 [running]: main.main-range1(0x2) D:/temp/dot/play/iter.go:9 +0xf3 main.main.WaitAll.func1.1(0x2) D:/temp/dot/play/iter.go:21 +0x8c created by main.main.WaitAll.func1 in goroutine 1 D:/temp/dot/play/iter.go:19 +0x8c exit status 2

What did you expect to see?

No panic

hauntedness commented 3 weeks ago

found a weired gowrap1 if run with race detector

go run -race ./play/

WARNING: DATA RACE Read at 0x00c00000a198 by goroutine 8: main.main-range1() D:/temp/dot/play/iter.go:10 +0x44 main.main.WaitAll.func1.1() D:/temp/dot/play/iter.go:24 +0x8b main.main.WaitAll.func1.gowrap1() D:/temp/dot/play/iter.go:27 +0x41

Previous write at 0x00c00000a198 by goroutine 7: main.main-range1() D:/temp/dot/play/iter.go:10 +0x58 main.main.WaitAll.func1.1() D:/temp/dot/play/iter.go:24 +0x8b main.main.WaitAll.func1.gowrap1() D:/temp/dot/play/iter.go:27 +0x41

Goroutine 8 (running) created at: Goroutine 8 (running) created at: main.main.WaitAll.func1() main.main.WaitAll.func1() D:/temp/dot/play/iter.go:22 +0x8b D:/temp/dot/play/iter.go:22 +0x8b main.main() D:/temp/dot/play/iter.go:10 +0x142

Goroutine 7 (running) created at: main.main.WaitAll.func1() D:/temp/dot/play/iter.go:22 +0x8b main.main() D:/temp/dot/play/iter.go:10 +0x142

gabyhelp commented 3 weeks ago

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

ianlancetaylor commented 3 weeks ago

I can recreate this on my laptop, but only if I use go run -race.

It's hard to see how parallel calls to a yield function could work, but we should see if we can produce a better error message for this case.

thediveo commented 3 weeks ago

you are calling yield from a separate go routine, so the panic seems to be appropriate; not least as there are no guarantees when the separate go routine is scheduled, so well after the main go routine quickly finished the loop. The message looks clear and correct to me.

ianlancetaylor commented 3 weeks ago

CC @dr2chase

hauntedness commented 3 weeks ago

you are calling yield from a separate go routine, so the panic seems to be appropriate; not least as there are no guarantees when the separate go routine is scheduled, so well after the main go routine quickly finished the loop. The message looks clear and correct to me.

But there is a (*sync.WaitGroup).Wait(), why would the main goroutine be finished without respecting the waitgroup? And let me add more comments. The same code works well for go 1.22.6 And also it works well for below code:

func main() {
    WaitAll(10)(func(i int) bool {
        fmt.Println(i)
        return true
    })
}
thediveo commented 3 weeks ago

remove everything related to the WaitGroup then it becomes hopefully clear that you are creating Go routines in the range body that then call yield. The wait group doesn't matter; misplaced yield call? Maybe you want to discuss this in the forum, what you intend to achieve here?

hauntedness commented 3 weeks ago

remove everything related to the WaitGroup then it becomes hopefully clear that you are creating Go routines in the range body that then call yield. The wait group doesn't matter; misplaced yield call? Maybe you want to discuss this in the forum, what you intend to achieve here?

I want to simplify the WaitGroup boilerplate. I think the wait group does matter, because I want it to wait all of the yield call.

ianlancetaylor commented 3 weeks ago

@thediveo I agree that some panic would be acceptable for this program, but "range function continued iteration after loop body panic" is just inaccurate. The loop body did not panic.

ianlancetaylor commented 3 weeks ago

@hauntedness Why do you want to call the yield function in parallel in different goroutines?

timothy-king commented 3 weeks ago

The requirement is that a yield function may be called if all calls to the yield have returns true and f has not terminated.

This bans parallel calls. (Not calls from multiple Go routines.) WaitAll is violating this. For parallel calls you can still write:

WaitAll(10)(func(i int) bool {
    fmt.Println(i)
    return true
})

A panic happening here is WAI.

I agree that some panic would be acceptable for this program, but "range function continued iteration after loop body panic" is just inaccurate. The loop body did not panic.

I am about 90% sure this is the state variable for the loop being being set to the state that detects early exits (panics) still being set when a parallel call happens and it is read. A different wording seems appropriate.

hauntedness commented 3 weeks ago

@hauntedness Why do you want to call the yield function in parallel in different goroutines?

I want less boilerplate... And I did this in go 1.22 and that time I ever checked the orginal propose. I remember the spec did say something related to multiple goroutine but it's not forbiden doing that.

hauntedness commented 3 weeks ago

The requirement is that a yield function may be called if all calls to the yield have returns true and f has not terminated.

This bans parallel calls. (Not calls from multiple Go routines.) WaitAll is violating this. For parallel calls you can still write:

WaitAll(10)(func(i int) bool {
  fmt.Println(i)
  return true
})

A panic happening here is WAI.

I agree that some panic would be acceptable for this program, but "range function continued iteration after loop body panic" is just inaccurate. The loop body did not panic.

I am about 90% sure this is the state variable for the loop being being set to the state that detects early exits (panics) still being set when a parallel call happens and it is read. A different wording seems appropriate.

Thanks for those information but it will not always panic. eg. Below doesn't panic

func main() {
    sum := 0
    for i := range WaitAll(10) {
        sum += i
    }
    fmt.Println(sum)
}
hauntedness commented 3 weeks ago

@timothy-king Can you give more illustration? Which point the WaitAll violate? The waitgroup guarantee all the yield call even it panics. The yield call may return false, but here it doesn't. And f ofcourse would not terminate ealier than yield The requirement is that a yield function may be called if all calls to the yield have returns true and f has not terminated.

thediveo commented 3 weeks ago

The requirement is that a yield function may be called if all calls to the yield have returns true and f has not terminated.

There is to the best of my knowledge no specification given as to the temporal behavior of overlapping yields and this probably includes the invisible code inserted behind the scenes to check for out of range yield calls. This might fit in with the race detector getting tripped.

If I understand this correctly then your code happens to work on some systems some times, but might have never been guaranteed by the spec to be supported.

ianlancetaylor commented 3 weeks ago

@hauntedness The rule that @timothy-king stated is "a yield function may be called if all calls to the yield have returns true and f has not terminated."

Your program does not obey that rule. It calls the yield function even though a previous call to the yield function has not yet returned. That is, it makes multiple calls to the yield function in parallel.

The fact that in some cases your program does not panic does not mean that your program is following the rules.

hauntedness commented 3 weeks ago

@ianlancetaylor Ok, you mean the yield calls are not independent, they must be called sequential or at least sequential access some internal shared state right? As it is supper hiden rule here, do we consider to add a vet? I believe lot's of people would do it similar。

ianlancetaylor commented 3 weeks ago

Yes: you can't call yield until the previous call to yield has returned. In the language spec this is written as (https://go.dev/ref/spec#For_statements):

After each successive loop iteration, yield returns true and may be called again to continue the loop.

I think it's too early to say whether we need a vet check for this. We only want vet checks for errors that are frequently made. This strikes me as unlikely to be frequent.

I also don't know how we could write an accurate vet check here.

timothy-king commented 3 weeks ago

Thanks for those information but it will not always panic. eg. Below doesn't panic

It should be able to. You probably are just not getting unlucky. Try increasing the constant to a few hundred thousand and turning on -race.

Which point the WaitAll violate?

For WaitAll(10), after the second iteration there will be two newly go routines. So 3 goroutines: the original, the goroutine created for j==0, and the goroutine created for j==1. Now suspend the original go routine, schedule the j==1 goroutine and execute to the body of the yield (right before fmt.Println), and now schedule j==1 goroutine, and execute until it calls yield. The call the yield from j==0 has not returned true yet. This is a violation.

There is to the best of my knowledge no specification given as to the temporal behavior of overlapping yields and this probably includes the invisible code inserted behind the scenes to check for out of range yield calls.

The implementations definitely believe this is the requirement. The spec hints at this (everything after "If yield is called before f returns," is calling out specific conditions). IMO could/should spell this out more. (IMO the even bigger hole in the spec is that f is required to preserve panicking from yield. That is totally underspecified, and the implementation chooses to panic right now.)

Ok, you mean the yield calls are not independent, they must be called sequential or at least sequential access some internal shared state right?

Correct. The code below sequences yield calls correctly (even if it 'misses the point'):

func WaitAllCorrect(total int) func(yield func(i int) bool) {
    return func(yield func(i int) bool) {
        wg := &sync.WaitGroup{}
        wg.Add(total)
        var mu sync.Mutex // guards yield and done
        var done bool
        for i := range total {
            go func(j int) {
                defer wg.Done()
                mu.Lock()
                defer mu.Unlock()
                if !done && !yield(j) {
                    done = true
                    panic("Shouldn't be broken.")
                }
            }(i)
        }
        wg.Wait()
    }
}

As it is supper hiden rule here, do we consider to add a vet? I believe lot's of people would do it similar。

Maybe. We discussed this before the release. The discussion concluded that we need more evidence of what to check for before creating a proposal/checker.

As for "super hidden", we can advertise this better in the spec or a future blog post.

thediveo commented 3 weeks ago

There is to the best of my knowledge no specification given as to the temporal behavior of overlapping yields and this probably includes the invisible code inserted behind the scenes to check for out of range yield calls.

The implementations definitely believe this is the requirement. The spec hints at this (everything after "If yield is called before f returns," is calling out specific conditions).

I actually meant what you so succinctly put into words, where I utterly failed with mine.

mknyszek commented 3 weeks ago

CC @golang/compiler