uber / cadence-java-client

Java framework for Cadence Workflow Service
https://cadenceworkflow.io

Improve non-deterministic error when a workflow fails while applying signal events #654

Open longquanzheng opened 2 years ago

longquanzheng commented 2 years ago
qlong@~: $cadence  wf query ....
Error: Query workflow failed.
Error Details: QueryFailedError{Message: java.lang.IllegalStateException: Signal received after workflow is closed.
    at com.uber.cadence.internal.replay.ReplayDecider.handleWorkflowExecutionSignaled(ReplayDecider.java:384)
    at com.uber.cadence.internal.replay.ReplayDecider.processEvent(ReplayDecider.java:206)
    at com.uber.cadence.internal.replay.ReplayDecider.decideImpl(ReplayDecider.java:472)
    at com.uber.cadence.internal.replay.ReplayDecider.query(ReplayDecider.java:619)
    at com.uber.cadence.internal.replay.ReplayDecisionTaskHandler.processQuery(ReplayDecisionTaskHandler.java:219)
    at com.uber.cadence.internal.replay.ReplayDecisionTaskHandler.handleDecisionTaskImpl(ReplayDecisionTaskHandler.java:123)
    at com.uber.cadence.internal.replay.ReplayDecisionTaskHandler.handleDecisionTask(ReplayDecisionTaskHandler.java:86)
    at com.uber.cadence.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:213)
    at com.uber.cadence.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:185)
    at com.uber.cadence.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:71)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
}
('export CADENCE_CLI_SHOW_STACKS=1' to see stack traces)
longquanzheng commented 2 years ago

It turns out this is not a bug. The cause was a non-deterministic code change: an activity was added to the workflow without using versioning. During replay, the workflow code fails with an exception, which marks the workflow as closed. The later signal event in the history then cannot be applied, which produces the error above.

However, we should improve this error so that it clearly points to the non-deterministic change as the root cause.
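To illustrate the failure mode described in this comment, here is a minimal, self-contained toy model of replay. It is not the Cadence replay implementation: `ToyContext`, `NonDeterministicException`, the workflow methods, and the activity names are all hypothetical, invented for this sketch. The toy `getVersion` only imitates the idea behind the client's `Workflow.getVersion(changeID, minSupported, maxSupported)`, and the exception message is just one possible wording of a more explicit error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ReplayDeterminismSketch {

    /** Hypothetical error type: replayed code issued a command that differs from history. */
    static class NonDeterministicException extends RuntimeException {
        NonDeterministicException(String message) { super(message); }
    }

    /** Toy stand-in for a workflow context: checks commands against history, then appends. */
    static class ToyContext {
        private final List<String> history;
        private int cursor = 0;

        ToyContext(List<String> recorded) { this.history = new ArrayList<>(recorded); }

        void activity(String name) {
            String cmd = "activity:" + name;
            if (cursor < history.size()) {
                // Replaying: the command must match what was recorded at this position.
                String expected = history.get(cursor);
                if (!expected.equals(cmd)) {
                    throw new NonDeterministicException(
                        "Non-deterministic workflow code: history event #" + cursor + " is '"
                        + expected + "' but replayed code issued '" + cmd
                        + "'. Was the workflow changed without versioning?");
                }
            } else {
                history.add(cmd); // Past the end of history: record a new command.
            }
            cursor++;
        }

        /** Toy version marker, loosely modeled on the idea of Workflow.getVersion. */
        int getVersion(String changeId, int defaultVersion, int maxSupported) {
            String prefix = "version:" + changeId + "=";
            if (cursor < history.size()) {
                String next = history.get(cursor);
                if (next.startsWith(prefix)) {
                    cursor++;
                    return Integer.parseInt(next.substring(prefix.length()));
                }
                return defaultVersion; // Old history has no marker: keep the old code path.
            }
            history.add(prefix + maxSupported); // New execution: record the marker.
            cursor++;
            return maxSupported;
        }

        List<String> history() { return history; }
    }

    // Original workflow code.
    static void workflowV1(ToyContext ctx) {
        ctx.activity("charge");
        ctx.activity("ship");
    }

    // Changed code: adds an activity with no versioning, so old histories no longer match.
    static void workflowV2Unversioned(ToyContext ctx) {
        ctx.activity("validate");
        ctx.activity("charge");
        ctx.activity("ship");
    }

    // Changed code guarded by a version check, so old histories still replay cleanly.
    static void workflowV2Versioned(ToyContext ctx) {
        int version = ctx.getVersion("add-validate", -1, 1);
        if (version >= 1) {
            ctx.activity("validate");
        }
        ctx.activity("charge");
        ctx.activity("ship");
    }

    /** Runs a workflow from scratch and returns the history it produced. */
    static List<String> record(Consumer<ToyContext> workflow) {
        return replay(new ArrayList<>(), workflow);
    }

    /** Replays a workflow against an existing history and returns the resulting history. */
    static List<String> replay(List<String> recorded, Consumer<ToyContext> workflow) {
        ToyContext ctx = new ToyContext(recorded);
        workflow.accept(ctx);
        return ctx.history();
    }

    public static void main(String[] args) {
        List<String> oldHistory = record(ReplayDeterminismSketch::workflowV1);
        try {
            replay(oldHistory, ReplayDeterminismSketch::workflowV2Unversioned);
        } catch (NonDeterministicException e) {
            System.out.println("Unversioned change failed replay: " + e.getMessage());
        }
        replay(oldHistory, ReplayDeterminismSketch::workflowV2Versioned);
        System.out.println("Versioned change replayed the old history without error.");
    }
}
```

In this sketch the replay failure is reported at the first mismatching command, with a message that names non-determinism directly, rather than surfacing later as an unrelated-looking error when a subsequent history event is processed.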