Locally-dispatched activity test flakiness hopefully resolved

uber-go / cadence-client

Framework for authoring workflows and activities running on top of the Cadence orchestration engine.

https://cadenceworkflow.io

MIT License

345 stars 131 forks source link

Locally-dispatched activity test flakiness hopefully resolved #1240

Closed Groxx closed 1 year ago

Groxx commented 1 year ago

At least, I finally ran it more than 5x in a row without failure. yay! We should probably just ban bare atomics.

There are occasionally test races between test-completion and zap logging, but those are sorta un-resolvable unless we reliably stop everything in tests. Which would be excellent, but I think we're fairly far from achieving that at the moment.

Groxx commented 1 year ago

racy tests: if you mean to let the race detector catch things, it's mostly unlikely to catch anything it's not already catching.

if you mean to reduce flakiness: possibly, though I'm not sure if this fully resolves the flakiness in this. it did change it from "almost always fails" to "passed multiple times in a row" though, and these were clearly incorrect, so I think it's a true improvement in any case.

davidporter-id-au commented 1 year ago

racy tests: if you mean to let the race detector catch things, it's mostly unlikely to catch anything it's not already catching.

if you mean to reduce flakiness: possibly, though I'm not sure if this fully resolves the flakiness in this. it did change it from "almost always fails" to "passed multiple times in a row" though, and these were clearly incorrect, so I think it's a true improvement in any case.

oh yes, this is a strict fix, I was just wondering in the general case if we should increase certainty where there was a lot of concurrency involved, but, as you point out, that's what the race-detector does and I just forgot about it

Groxx commented 1 year ago

oh yes, this is a strict fix, I was just wondering in the general case if we should increase certainty where there was a lot of concurrency involved, but, as you point out, that's what the race-detector does and I just forgot about it

yea. and I don't think we have (detectable) data races in tests except for some cases where code does not shut down in time, and it produces a log after the test has completed (we have incorrect shutdowns all over, which is why a lot of tests can't be converted to use zaptest.NewLogger(t)). or at least all the ones I've seen have been just ^ that logger-race-due-to-shutdown.

any existing races that might exist just aren't being exercised in a way that reveals them to the race detector. we need more concurrent tests / more concurrency-using tests 🤷