temporalio / sdk-java

Temporal Java SDK
https://temporal.io
Apache License 2.0
220 stars 147 forks source link

Java SDK seems to not play well with Replay tests. Setting child workflow ID using Temporal library seems to be throwing a non-deterministic error #2057

Open gauravojha opened 6 months ago

gauravojha commented 6 months ago

Details attached on the temporal forum support request - https://community.temporal.io/t/workflowreplayer-replayworkflowexecution-throwing-nondeterministicexception-when-reading-workflow-id-from-workflow/12040/6

Copied from the post above

========

Here is a sample repo which replicates the error. The repo also has replay tests and sample workflow history if required under the src/test folder https://github.com/gauravojha/learningTemporal/tree/master

So what we are trying to do is

(Also mentioned here, hope this helps - learningTemporal/src/main/java/dev/ojha/learningtemporal/workflows/HelloWorldWorkflowImpl.java at master · gauravojha/learningTemporal · GitHub)

Quinn-With-Two-Ns commented 6 months ago

How are you replaying history? If you are using WorkflowReplayer.replayWorkflowExecution then when you construct WorkflowExecutionHistory you need to specify the workflow ID

gauravojha commented 6 months ago

How are you replaying history? If you are using WorkflowReplayer.replayWorkflowExecution then when you construct WorkflowExecutionHistory you need to specify the workflow ID

@Quinn-With-Two-Ns sorry, i am not sure i understand.. I am not doing anything complicated in my code if you see it. my workflow implementation is pretty simple, it only triggers a child workflow. I am not doing any special replaying in there. Any replay is being done in the replay test which is written using instructions at https://docs.temporal.io/dev-guide/java/durable-execution#add-replay-test . I don't see any specific instructions on providing a workflow ID here if i am providing a history extracted from temporal

This is the workflow implementation - https://github.com/gauravojha/learningTemporal/blob/652d2ba76e80b7d1a96946fae42b3d1dec0b3b3f/src/main/java/dev/ojha/learningtemporal/workflows/HelloWorldWorkflowImpl.java#L35-L40

This is the only place where any replay happens (in the WorkflowReplayer in a unit test) - https://github.com/gauravojha/learningTemporal/blob/652d2ba76e80b7d1a96946fae42b3d1dec0b3b3f/src/test/java/dev/ojha/learningtemporal/TemporalTesting.java#L45

From the forum question where this is originally posted, it seems to be a bug in the SDK? The repository I have attached is a very small repository which can be used to reproduce the error. The entire history is downloaded from temporal UI after a sample workflow got over from the above repro repo, so I think I am not constructing the history, it is a history which temporal gives me if I am not mistaken?

Quinn-With-Two-Ns commented 6 months ago

Where you are calling the replayer you are not specifying the workflow ID. You need to specificy a workflow ID when using the replayer since it is not always in history.

        assertDoesNotThrow(() -> WorkflowReplayer.replayWorkflowExecution(eventHistoryFile,
                HelloWorldWorkflowImpl.class));

needs to be something like

        assertDoesNotThrow(() -> WorkflowReplayer.replayWorkflowExecution(WorkflowExecutionHistory.WorkflowExecutionHistory(WorkflowHistoryLoader.readHistory(eventHistoryFile), "YOUR WORKFLOW ID"),
                HelloWorldWorkflowImpl.class));
gauravojha commented 6 months ago

Got it.. Let me try @Quinn-With-Two-Ns, and thank you for pointing in the right direction, that worked on the sample repo I shared above :) .. i couldnt find this recommendation in the starting replay test documentation. If this is a critical statement, I feel this should be mentioned in the intial stages in the document, since as a new user I may end up only looking at the first recommendation from the doc?

Just adding some more color to this. So we wrote tests for 10 different workflows, and only one of them failed. Rest 9 passed, and we didn't have to pass in a workflow ID to 9 which passed, as we didn't get to the point that we may have to specify the workflow ID. If the following recommendation by you would have been mentioned in the initial stags of the replay test document, it would have helped us keep an eye out for it. Hope this helps?

You need to specificy a workflow ID when using the replayer since it is not always in history

Quinn-With-Two-Ns commented 6 months ago

Yes I agree we should improve the documentation here. Passing the workflow ID is only important if the workflow uses the workflow ID as part of its logic which most don't.