microsoft / durabletask-java

Java SDK for Durable Functions and the Durable Task Framework
MIT License
13 stars 7 forks source link

Bug in functioning of continueAsNew() function in eternal orchestration #118

Closed kanupriya15025 closed 1 year ago

kanupriya15025 commented 1 year ago

According to the documentation here (https://github.com/microsoft/durabletask-java/blob/6d6dff9bbbe145f561cde05ff7ab065530279073/README.md#eternal-monitoring-orchestration) , if continueAsNew is called, the flow should break and the a new execution of the eternal orchestrator should be called. However, that's not what I am seeing here in my code :

        if (winnerEvent == patchEvent) {
            jsonBody = patchEvent.await();
            if (!ctx.getIsReplaying()) {
                LOGGER.info("Payload received for PATCH call: {}", jsonBody);
                ctx.continueAsNew(jsonBody);
            }
        }

        LOGGER.info("The control passed here.");

In the above code my expectation is that as soon as ctx.continueAsNew() is called, the flow should not be passed to the second log line. But this is what is what I see :

        Payload received for PATCH call: <>
        The control passed here.

My expectation is that if winnerEvent is PatchEvent, I should only see log line "Payload received for PATCH call:" and not "The control passed here"

Is this expectation correct?

davidmrdavid commented 1 year ago

Hi @kanupriya15025:

Given that continueAsNew is not an "awaited" API, we normally cannot stop the function from continuing to execute any lines of code directly below it. However, on the next opportunity where execution control is yielded back to the SDK (i.e an API is "awaited" or the code returns) we'll execute the continueAsNew command.

I say that normally we cannot stop the function from continuing to execute because I'm not aware if the Durable Functions Java SDK design may have something special that allows it to break the control flow immediately when continueAsNew is called. @kaibocai - let's sync about this offline.

cgillum commented 1 year ago

@kanupriya15025 for now, the correct thing to do is to return immediately after your call to continueAsNew().

ChrisRomp commented 1 year ago

FWIW, I'm still getting the exception even after a return statement from a void method. This sample was taken from our docs.

Code:

@FunctionName("Periodic_Cleanup_Loop")
public void periodicCleanupLoop(
        @DurableOrchestrationTrigger(name = "ctx") TaskOrchestrationContext ctx) {

    ctx.createTimer(Duration.ofHours(1)).await();
    ctx.continueAsNew(null);
    return;
}

API call:

POST http://localhost:7071/runtime/webhooks/durabletask/orchestrators/{{functionName}}/{{instanceId}}?code={{systemKey}} HTTP/1.1
Content-Type: application/json

Exception:

[2023-04-21T18:22:31.293Z] Executing 'Functions.Periodic_Cleanup_Loop' (Reason='(null)', Id=cf9a3d8e-983b-4b6b-b3a5-561796a5ed24)
[2023-04-21T18:22:31.464Z] Function "Periodic_Cleanup_Loop" (Id: cf9a3d8e-983b-4b6b-b3a5-561796a5ed24) invoked by Java Worker
[2023-04-21T18:22:31.491Z] Executed 'Functions.Periodic_Cleanup_Loop' (Failed, Id=cf9a3d8e-983b-4b6b-b3a5-561796a5ed24, Duration=238ms)
[2023-04-21T18:22:31.491Z] System.Private.CoreLib: Exception while executing function: Functions.Periodic_Cleanup_Loop. Microsoft.Azure.WebJobs.Extensions.DurableTask: The function invocation resulted in a null response. This means that either the orchestrator function was implemented incorrectly, the Durable Task language SDK was implemented incorrectly, or that the destination language worker is not sending the function result back to the host.
[2023-04-21T18:22:31.504Z] adbf5d2c-d21c-4d55-a25c-1f2defd375d2: Function 'Periodic_Cleanup_Loop (Orchestrator)' failed with an error. Reason: Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.Periodic_Cleanup_Loop
[2023-04-21T18:22:31.504Z]  ---> System.InvalidOperationException: The function invocation resulted in a null response. This means that either the orchestrator function was implemented incorrectly, the Durable Task language SDK was implemented incorrectly, or that the destination language worker is not sending the function result back to the host.
[2023-04-21T18:22:31.504Z]    at Microsoft.Azure.WebJobs.Extensions.DurableTask.OutOfProcMiddleware.<>c__DisplayClass10_0.<<CallOrchestratorAsync>b__0>d.MoveNext() in D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\OutOfProcMiddleware.cs:line 127
kaibocai commented 1 year ago

Hi @ChrisRomp , the V2.11.0 azure function java worker (Host version V4.21.0) is still in the release process, which may take up to 3 weeks to complete. Once it's fully released, I will update you. Thanks.

kaibocai commented 1 year ago

Hi @ChrisRomp , the V2.11.0 azure function java worker (Host version V4.21.0) is still in the release process, which may take up to 3 weeks to complete. Once it's fully released, I will update you. Thanks.

Host team is having a hotfix 4.21.1 due to some regression found in 4.21.0. V4.21.1 should be fully released in next two weeks. Sorry for the inconvenience.

ChrisRomp commented 1 year ago

FYI @kanupriya15025

kanupriya15025 commented 1 year ago

@cgillum @kaibocai Doesn't a return after continueAsNew() stop the execution of the function's next execution? If I want to look at it intuitively, continueAsNew() should internally return and start a new execution. Instead, of the users explicitly specifying a return statement. Because the return statement gives a hint that the function is returning and there's no continuation of the function after that.

kanupriya15025 commented 1 year ago

@cgillum @kaibocai Any updates on this?

kaibocai commented 1 year ago

FWIW, I'm still getting the exception even after a return statement from a void method. This sample was taken from our docs.

Code:

@FunctionName("Periodic_Cleanup_Loop")
public void periodicCleanupLoop(
        @DurableOrchestrationTrigger(name = "ctx") TaskOrchestrationContext ctx) {

    ctx.createTimer(Duration.ofHours(1)).await();
    ctx.continueAsNew(null);
    return;
}

API call:

POST http://localhost:7071/runtime/webhooks/durabletask/orchestrators/{{functionName}}/{{instanceId}}?code={{systemKey}} HTTP/1.1
Content-Type: application/json

Exception:

[2023-04-21T18:22:31.293Z] Executing 'Functions.Periodic_Cleanup_Loop' (Reason='(null)', Id=cf9a3d8e-983b-4b6b-b3a5-561796a5ed24)
[2023-04-21T18:22:31.464Z] Function "Periodic_Cleanup_Loop" (Id: cf9a3d8e-983b-4b6b-b3a5-561796a5ed24) invoked by Java Worker
[2023-04-21T18:22:31.491Z] Executed 'Functions.Periodic_Cleanup_Loop' (Failed, Id=cf9a3d8e-983b-4b6b-b3a5-561796a5ed24, Duration=238ms)
[2023-04-21T18:22:31.491Z] System.Private.CoreLib: Exception while executing function: Functions.Periodic_Cleanup_Loop. Microsoft.Azure.WebJobs.Extensions.DurableTask: The function invocation resulted in a null response. This means that either the orchestrator function was implemented incorrectly, the Durable Task language SDK was implemented incorrectly, or that the destination language worker is not sending the function result back to the host.
[2023-04-21T18:22:31.504Z] adbf5d2c-d21c-4d55-a25c-1f2defd375d2: Function 'Periodic_Cleanup_Loop (Orchestrator)' failed with an error. Reason: Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.Periodic_Cleanup_Loop
[2023-04-21T18:22:31.504Z]  ---> System.InvalidOperationException: The function invocation resulted in a null response. This means that either the orchestrator function was implemented incorrectly, the Durable Task language SDK was implemented incorrectly, or that the destination language worker is not sending the function result back to the host.
[2023-04-21T18:22:31.504Z]    at Microsoft.Azure.WebJobs.Extensions.DurableTask.OutOfProcMiddleware.<>c__DisplayClass10_0.<<CallOrchestratorAsync>b__0>d.MoveNext() in D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\OutOfProcMiddleware.cs:line 127

Hi @ChrisRomp , @kanupriya15025, this issue will be resolved with host verion 4.21 and above plus with this Fix https://github.com/microsoft/durabletask-java/pull/139 on next sdk release. We will prioritize the release. Thanks.

@kanupriya15025 regards your questions on continueAsNew() let me update you soon. Thank you.

kaibocai commented 1 year ago

@cgillum @kaibocai Doesn't a return after continueAsNew() stop the execution of the function's next execution? If I want to look at it intuitively, continueAsNew() should internally return and start a new execution. Instead, of the users explicitly specifying a return statement. Because the return statement gives a hint that the function is returning and there's no continuation of the function after that.

@kanupriya15025 yes, what you said totally make sense. Current we don't have this pattern support in Java, so for now please just directly return after continueAsNew(). The team will discuss on this and try to make this possible for java sdk. Thank you.

kaibocai commented 1 year ago

RP https://github.com/microsoft/durabletask-java/pull/139 for the fix already merged. It will be in next sdk release.

kaibocai commented 1 year ago

reopen the issue, cause right now the fix is waiting for the release. Once the release is done will close it.

lilyjma commented 1 year ago

@kanupriya15025 - this fix has been released in 1.1.1: https://repo.maven.apache.org/maven2/com/microsoft/durabletask-azure-functions/. This requires host 4.21+

kaibocai commented 1 year ago

checkout the reply here https://github.com/microsoft/durabletask-java/issues/138#issuecomment-1607418282

kaibocai commented 1 year ago

Close this as the issue is confirmed to be resolved at https://github.com/microsoft/durabletask-java/issues/138#issuecomment-1611078651

kaibocai commented 12 months ago

Doesn't a return after continueAsNew() stop the execution of the function's next execution? If I want to look at it intuitively, continueAsNew() should internally return and start a new execution. Instead, of the users explicitly specifying a return statement. Because the return statement gives a hint that the function is returning and there's no continuation of the function after that.

@kanupriya15025, we have just release new java SDK v1.2.0 which has improve the code pattern for using continueAsNew, now it doesn't require you to add a return statement right after it. When using continueAsNew, internally return and start a new execution. Please try it out.