temporalio / sdk-go

Temporal Go SDK
https://docs.temporal.io/application-development?lang=go
MIT License

Extra replay command when replaying "partial histories" #1670

Open RamyElkest opened 3 weeks ago

RamyElkest commented 3 weeks ago

This is more of a request for information than a bug.

Expected Behavior

Replaying a downloaded workflow history that ends with a workflow task (scheduled/started) should not fail with a [TMPRL1100] nondeterministic workflow: extra replay command

Actual Behavior

Replaying a downloaded workflow history that ends with a workflow task (scheduled/started) fails with a [TMPRL1100] nondeterministic workflow: extra replay command

Steps to Reproduce the Problem

Reproducing test and detailed explanation: https://github.com/RamyElkest/sdk-go/pull/1

Specifications

RamyElkest commented 3 weeks ago

Solution

The proposed solution here is to trim scheduled/started/completed workflow tasks with no follow-up events, which guarantees the workflow history is in a safely replayable state. There are three approaches:

  1. Trim the history in GetWorkflowHistory (to be discussed with upstream)
  2. Trim the history in our code before passing it to the Replayer
  3. Trim the history in the Replayer (to be discussed with upstream)
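Approach 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: `Event` and `EventType` are hypothetical stand-ins so the example compiles on its own. In real code the events would presumably be `*historypb.HistoryEvent` values from `go.temporal.io/api/history/v1`, and the trimmed history would then be handed to the Replayer. The helper name `trimTrailingWorkflowTasks` is also made up for this sketch.

```go
package main

// EventType is a hypothetical stand-in for the SDK's event type enum;
// only the values needed for this sketch are modeled.
type EventType int

const (
	WorkflowExecutionStarted EventType = iota
	WorkflowTaskScheduled
	WorkflowTaskStarted
	WorkflowTaskCompleted
	ActivityTaskScheduled
)

// Event is a hypothetical stand-in for a single history event.
type Event struct {
	ID   int64
	Type EventType
}

// isWorkflowTaskEvent reports whether t is one of the workflow task
// states the proposal suggests trimming (scheduled/started/completed).
func isWorkflowTaskEvent(t EventType) bool {
	switch t {
	case WorkflowTaskScheduled, WorkflowTaskStarted, WorkflowTaskCompleted:
		return true
	}
	return false
}

// trimTrailingWorkflowTasks returns the prefix of events that remains after
// dropping every workflow task event at the tail of the history, i.e. those
// with no follow-up events. Note that back-to-back trailing workflow tasks
// are all removed; whether that is desirable is an open question.
func trimTrailingWorkflowTasks(events []Event) []Event {
	end := len(events)
	for end > 0 && isWorkflowTaskEvent(events[end-1].Type) {
		end--
	}
	return events[:end]
}
```

The function returns a sub-slice of its input rather than a copy, so the original history is left unmodified but shares backing storage with the result.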

Curious if you have any thoughts / preferences here.

cretz commented 2 weeks ago

Thanks for the report! Will confer with the team on replaying of mid-task history captures. While it makes sense to only replay up to the last completed or failed task, we may need to double check that people aren't running replays on the active task without the task failure to replicate failures (e.g. to replicate deadlock detection).

cretz commented 2 weeks ago

Conferred with the team; we consider this a bug. If we are in fact failing a replay of a history that should succeed, we need to fix that. It is likely we should not be performing history matching for non-determinism checks after the last task start (one that doesn't have an end). This issue will be updated when we have a solution.