Closed Groxx closed 2 years ago
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
internal/context.go | 13 | 14 | 92.86% | ||
<!-- | Total: | 13 | 14 | 92.86% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
internal/internal_task_pollers.go | 2 | 80.67% | ||
<!-- | Total: | 2 | --> |
Totals | |
---|---|
Change from base Build fc712f9a-e71b-4295-8d09-ed2a531ba0d8: | 0.04% |
Covered Lines: | 12180 |
Relevant Lines: | 19281 |
:\ no, I think this may change execution order, given the order of propagation. gonna try something else.
Were you able to repro the issue in your test?
@demirkayaender not the one I was aiming for, unfortunately. I found yet another instead :|
❯ go test -test.run TestContext_RaceRegression_2 -count 1000 ./internal
fatal error: concurrent map writes
goroutine 648 [running]:
runtime.throw({0x1aa1d1f, 0x1000000017c9779})
/usr/local/Cellar/go/1.17.1/libexec/src/runtime/panic.go:1198 +0x71 fp=0xc000071a58 sp=0xc000071a28 pc=0x1036e91
runtime.mapassign(0xc0005043c0, 0xc00048e230, 0x0)
/usr/local/Cellar/go/1.17.1/libexec/src/runtime/map.go:585 +0x4d6 fp=0xc000071ad8 sp=0xc000071a58 pc=0x10105d6
go.uber.org/cadence/internal.propagateCancel({0x1bd7768, 0xc0005043c0}, {0x1bd0620, 0xc00048e230})
/Users/stevenl/gocode/src/go.uber.org/cadence/internal/context.go:233 +0x11c fp=0xc000071b30 sp=0xc000071ad8 pc=0x169d1dc
go.uber.org/cadence/internal.WithCancel({0x1bd7768, 0xc0005043c0})
/Users/stevenl/gocode/src/go.uber.org/cadence/internal/context.go:194 +0xbc fp=0xc000071b80 sp=0xc000071b30 pc=0x169ce5c
go.uber.org/cadence/internal.TestContext_RaceRegression_2.func1.1.1()
/Users/stevenl/gocode/src/go.uber.org/cadence/internal/context_test.go:76 +0x2a fp=0xc000071bb0 sp=0xc000071b80 pc=0x17c0daa
runtime.deferCallSave(0xc000071c78, 0xc000071ef8)
/usr/local/Cellar/go/1.17.1/libexec/src/runtime/panic.go:950 +0x82 fp=0xc000071bc0 sp=0xc000071bb0 pc=0x1036622
runtime.runOpenDeferFrame(0x100000001006e2d, 0xc00059ee10)
/usr/local/Cellar/go/1.17.1/libexec/src/runtime/panic.go:889 +0x27b fp=0xc000071c40 sp=0xc000071bc0 pc=0x1035fbb
runtime.Goexit()
I haven't yet reproduced their mapiternext
crash.
Regrettable but pretty simple to add. When goroutines are shut down correctly, these locks can (and should) be removed.
.... and then I discovered that Close has synchronous side effects. I'm not super confident that this is stable, I'm not familiar enough with all the code to know for sure :|
Unfortunately it looks like #1124 is not sufficient in practice, there are other races in our users' workflows :\ This is a (nearly) direct mimicry of go's context.go locking in 1.13 (our oldest supported version), but beyond that I have not really thought about the details. It fixes the new test's new race though, at least.