vito / progrock

progress ui that rocks
Apache License 2.0
61 stars 6 forks source link

Concurrent calls to WriteStatus and Close can lead to concurrent map access #12

Open GeorgeMac opened 8 months ago

GeorgeMac commented 8 months ago

I noticed this in an application of ours using Dagger. My gut says this seems to be the right level to fix this issue, however, if calling these two methods concurrently is deemed unreasonable in the first place, then I am happy to move the issue up to Dagger itself.

fatal error: concurrent map iteration and map write

goroutine 1 [running]:
github.com/vito/progrock/console.(*textMux).print(0xc000313b00, 0xc000315400)
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/console/printer.go:228 +0x145
github.com/vito/progrock/console.(*Writer).Close(0x6593e541?)
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/console/writer.go:87 +0x25
github.com/vito/progrock.MultiWriter.Close({0xc000372230, 0x1, 0x0?})
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/multi.go:19 +0x90
github.com/vito/progrock.(*Recorder).Close(...)
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/recorder.go:254
github.com/dagger/dagger/engine/client.(*Client).Close(0xc000002600)
    /app/engine/client/client.go:399 +0x6e6
main.withEngineAndTUI({0x13f82c0, 0x1d1f960}, {{0x0, 0x0}, {0x0, 0x0, 0x0}, {0xc000269ad0, 0x24}, {0xc000269b00, ...}, ...}, ...)
    /app/cmd/dagger/engine.go:104 +0x3d7
main.run({0x13f82c0, 0x1d1f960}, {0xc00007dae0, 0x5, 0x5})
    /app/cmd/dagger/run.go:93 +0x22b
main.Run(0xc00036e000?, {0xc00007dae0?, 0x4?, 0x123a778?})
    /app/cmd/dagger/run.go:70 +0x33
github.com/spf13/cobra.(*Command).execute(0x1cde140, {0xc00007da90, 0x5, 0x5})
    /go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:944 +0x863
github.com/spf13/cobra.(*Command).ExecuteC(0x1cdca40)
    /go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5
github.com/spf13/cobra.(*Command).Execute(...)
    /go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
main.main()
    /app/cmd/dagger/main.go:97 +0x2c

goroutine 75 [semacquire]:
sync.runtime_Semacquire(0x30?)
    /usr/local/go/src/runtime/sema.go:62 +0x25
sync.(*WaitGroup).Wait(0xc0002926e0?)
    /usr/local/go/src/sync/waitgroup.go:116 +0x48
google.golang.org/grpc.(*Server).serveStreams(0xc00021dc20, {0x13ff0c0?, 0xc0002926e0})
    /go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:999 +0x1cc
google.golang.org/grpc.(*Server).ServeHTTP(0xc00021dc20, {0x13f6fa0?, 0xc0000bc760?}, 0x37?)
    /go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:1043 +0xe5
golang.org/x/net/http2.(*serverConn).runHandler(0x1d1f960?, 0x0?, 0x0?, 0xc00027f860?)
    /go/pkg/mod/golang.org/x/net@v0.18.0/http2/server.go:2368 +0xbb
created by golang.org/x/net/http2.(*serverConn).scheduleHandler in goroutine 46
    /go/pkg/mod/golang.org/x/net@v0.18.0/http2/server.go:2303 +0x21d

goroutine 78 [runnable]:
internal/reflectlite.Swapper.func4(0xa?, 0x9?)
    /usr/local/go/src/internal/reflectlite/swapper.go:46 +0xa5
sort.insertionSort_func({0xc000151408?, 0xc000568040?}, 0x0, 0xb)
    /usr/local/go/src/sort/zsortfunc.go:13 +0x66
sort.pdqsort_func({0xc000151408?, 0xc000568040?}, 0x0?, 0x1f?, 0xedd25dbc3?)
    /usr/local/go/src/sort/zsortfunc.go:73 +0x31b
sort.pdqsort_func({0xc000151408?, 0xc000568040?}, 0x0?, 0x49?, 0xedd25dbc3?)
    /usr/local/go/src/sort/zsortfunc.go:121 +0x293
sort.pdqsort_func({0xc000151408?, 0xc000568040?}, 0x0?, 0x1a5?, 0xc0001512d0?)
    /usr/local/go/src/sort/zsortfunc.go:121 +0x293
sort.pdqsort_func({0xc000151408?, 0xc000568040?}, 0x7f2024b4a5b8?, 0x18?, 0xc000070800?)
    /usr/local/go/src/sort/zsortfunc.go:121 +0x293
sort.Slice({0xff4740?, 0xc000516480?}, 0x1a5?)
    /usr/local/go/src/sort/slice.go:26 +0xf6
github.com/vito/progrock/console.sortCompleted(0xc000315400, 0xc000151608)
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/console/printer.go:214 +0x1f3
github.com/vito/progrock/console.(*textMux).print(0xc000313b00, 0xc000315400)
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/console/printer.go:247 +0x265
github.com/vito/progrock/console.(*Writer).WriteStatus(0xc0003139e0, 0xc000832820)
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/console/writer.go:80 +0xc5
github.com/vito/progrock.MultiWriter.WriteStatus(...)
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/multi.go:9
github.com/vito/progrock.(*RPCReceiver).WriteUpdates(0xc000517428, {0x13fde10, 0xc0000be4c0})
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/rpc.go:97 +0x53
github.com/vito/progrock._ProgressService_WriteUpdates_Handler({0x109cd80?, 0xc000517428}, {0x13fbdb0?, 0xc0002fc000})
    /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20230913234310-64b4a1cfb007/progress_grpc.pb.go:104 +0x91
github.com/moby/buildkit/util/grpcerrors.StreamServerInterceptor({0x109cd80, 0xc000517428}, {0x13fbdb0?, 0xc0002fc000?}, 0xfde4a0?, 0x12bb988)
    /go/pkg/mod/github.com/moby/buildkit@v0.13.0-beta1.0.20231107172747-c4f191410a41/util/grpcerrors/intercept.go:33 +0x71
google.golang.org/grpc.(*Server).processStreamingRPC(0xc00021dc20, {0x13f8330, 0xc00032c4b0}, {0x13ff0c0, 0xc0002926e0}, 0xc000391440, 0xc0000c2d20, 0x1ccc4e0, 0x0)
    /go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:1644 +0x1327
google.golang.org/grpc.(*Server).handleStream(0xc00021dc20, {0x13ff0c0, 0xc0002926e0}, 0xc000391440)
    /go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:1741 +0xbfb
google.golang.org/grpc.(*Server).serveStreams.func1.1()
    /go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:986 +0x86
created by google.golang.org/grpc.(*Server).serveStreams.func1 in goroutine 75
    /go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:997 +0x145

When closing and writing statuses concurrently, it seems that the trace can be sorting some contents during WriteStatus, while it is being iterated on in the Close call.

vito commented 8 months ago

Interesting, thanks for the report + trace. Will have to think about this more, but I'm inclined to fix this in Progrock since this seems like the kind of thing that could be tricky to synchronize externally.

Interesting that I haven't run into this myself but it seems likely to be unique to the plain console writer, rather than the usual TUI. If that's the case maybe that's where the fix needs to go.

GeorgeMac commented 7 months ago

Just for posterity and debugging purposes, it happened again here https://github.com/flipt-io/flipt/actions/runs/7690639515/job/20954745617?pr=2687