uber-go / cadence-client

Framework for authoring workflows and activities running on top of the Cadence orchestration engine.
https://cadenceworkflow.io
MIT License
345 stars 131 forks source link

Implement bugfix for child-workflow bug in #1138 #1144

Closed Groxx closed 2 years ago

Groxx commented 2 years ago

The three commits in this PR are relevant for discussion, possibly also merging:

  1. "Demonstrate ExecuteChildWorkflow bug + prepare test for a fix" is #1138, unmodified.
  2. "[incomplete] Implement bugfix for child-workflow bug in #1138" is the "bare" bugfix, which I believe to be a complete fix... which is not backwards compatible because it changes behavior.
  3. "Backported" un-bugfix in case people hit non-determinism errors" is a controllable way to undo the bugfix, and documentation on how to use it.

We can decide whether or not the current and future complexity is worth commit 3, but I think it's... not too bad?
Other options include:

  1. Ship the fix without the backport.
    • Simpler now and in the future.
    • No remediation possible for users that break, except "reset to before bug, and/or terminate".
  2. Use GetVersion internally when the bug might be triggered, with the same logic as the field check here.
    • Simpler and invisible.
    • Not user controllable, so if they were relying on it, they'll break...... but maybe that's fine. It's not always worth fighting Hyrum's Law. If it is, we can also add commit 3's flag.
    • Requires recording that version marker for all future, non-buggy workflows, increasing size slightly.
  3. Start recording client version numbers in decision task histories, so we can access it during replay, like BinaryChecksum. Use "no version" to mean "contains bug" and act appropriately.
    • Honestly this is probably a good idea regardless, as it'd give us (and users) a lever / escape-hatch for dealing with version-specific bugs.
    • I'm not sure what this change will involve. Server + web changes maybe? If it's client-side only, I can build that instead.
    • This would behave the same as GetVersion, but (relatively) "free" from markers that a bug is being addressed, as well as for future bugs.
coveralls commented 2 years ago

Pull Request Test Coverage Report for Build 49c94db2-60ed-4e72-ab26-0a4e029d577a


Changes Missing Coverage Covered Lines Changed/Added Lines %
internal/workflow.go 24 37 64.86%
<!-- Total: 29 42 69.05% -->
Totals Coverage Status
Change from base Build f974e15f-4f41-4413-b0bb-e32f7328b7f4: 0.006%
Covered Lines: 12256
Relevant Lines: 19363

💛 - Coveralls