temporalio / sdk-java

Temporal Java SDK
https://temporal.io
Apache License 2.0

Investigate bad `isReplaying` value on direct query when workflow is not in cache #2016

Closed Quinn-With-Two-Ns closed 1 month ago

Quinn-With-Two-Ns commented 3 months ago

Expected Behavior

isReplaying() is true when replaying and false when not replaying

Actual Behavior

isReplaying() is false when the workflow is replaying for a query

Steps to Reproduce the Problem

1. Start a workflow that immediately waits for a signal
2. Restart the worker running the above workflow
3. Send a query to the workflow

Specifications

Quinn-With-Two-Ns commented 3 months ago

Auditing the replay code

    if (replaying
        && !hasNextEvent
        && (event.getEventType() == EventType.EVENT_TYPE_WORKFLOW_TASK_STARTED
            || WorkflowExecutionUtils.isWorkflowTaskClosedEvent(event))) {
      replaying = false;
    }

https://github.com/temporalio/sdk-java/blob/bc726c9ad4646fc5d14e91bcea11190ff7537ca0/temporal-sdk/src/main/java/io/temporal/internal/statemachines/WorkflowStateMachines.java#L427

This check is flawed: if we receive a query after a workflow task that didn't generate a command, the history ends on a workflow task closed event with no next event, so the Java SDK incorrectly assumes the workflow is no longer replaying.
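
To make the failure mode concrete, here is a minimal standalone sketch of the same condition applied to the history I'd expect from the repro steps. It is a simplified model, not the SDK code: `ReplayFlagSketch`, its trimmed `EventType` enum, `isWorkflowTaskClosed` and `replayingAfter` are all hypothetical names, and the four-event history is my assumption of what the server returns for a workflow whose first workflow task only blocked on a signal.

```java
import java.util.List;

public class ReplayFlagSketch {
  // Hypothetical, trimmed-down stand-in for the SDK's EventType enum.
  enum EventType {
    WORKFLOW_EXECUTION_STARTED,
    WORKFLOW_TASK_SCHEDULED,
    WORKFLOW_TASK_STARTED,
    WORKFLOW_TASK_COMPLETED
  }

  // Stand-in for WorkflowExecutionUtils.isWorkflowTaskClosedEvent.
  // Only the "completed" case is modeled here.
  static boolean isWorkflowTaskClosed(EventType type) {
    return type == EventType.WORKFLOW_TASK_COMPLETED;
  }

  // Applies the check from the issue to each event and returns the final flag.
  static boolean replayingAfter(List<EventType> history) {
    boolean replaying = true;
    for (int i = 0; i < history.size(); i++) {
      EventType type = history.get(i);
      boolean hasNextEvent = i < history.size() - 1;
      if (replaying
          && !hasNextEvent
          && (type == EventType.WORKFLOW_TASK_STARTED || isWorkflowTaskClosed(type))) {
        replaying = false;
      }
    }
    return replaying;
  }

  public static void main(String[] args) {
    // Assumed history: the first workflow task produced no commands
    // because the workflow is just waiting for a signal.
    List<EventType> history =
        List.of(
            EventType.WORKFLOW_EXECUTION_STARTED,
            EventType.WORKFLOW_TASK_SCHEDULED,
            EventType.WORKFLOW_TASK_STARTED,
            EventType.WORKFLOW_TASK_COMPLETED);
    // Everything in this history is being replayed to serve the query,
    // yet the flag flips on the last event.
    System.out.println("replaying after history: " + replayingAfter(history));
  }
}
```

In this model the flag ends up `false` even though every event came from history, which matches the actual behavior reported above for a query against a workflow that is not in the cache.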