microsoft / durabletask-java

Java SDK for Durable Functions and the Durable Task Framework
MIT License

Orchestrator functions failing if not returning anything #126

Closed akshaykumars10 closed 1 year ago

akshaykumars10 commented 1 year ago

We have an eternal orchestrator function in our durable function app. This orchestrator function doesn’t need to return anything. However, with the void return type, the orchestrator function is failing.

Error: System.InvalidOperationException: The function invocation resulted in a null response. This means that either the orchestrator function was implemented incorrectly, the Durable Task language SDK was implemented incorrectly, or that the destination language worker is not sending the function result back to the host.

Function Code:

    @FunctionName("EternalOrchestrator")
    public void startWorkflowSchedule(@DurableOrchestrationTrigger(name = "ctx") TaskOrchestrationContext ctx) {

        LOGGER.info("Hello World");
        ctx.createTimer(Duration.ofSeconds(10)).await();
        ctx.continueAsNew(null);
    }

Function App Name: possmartpolling-java
Instance Id: testeternal123

Please let me know if any other details are required.

cgillum commented 1 year ago

Hi @akshaykumars10. I noticed that your host.json file references a "Preview" build of extension bundles v4.

  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle.Preview",
    "version": "[4.*, 5.0.0)"
  },

Do you encounter this same problem if you switch to using the non-preview (GA) version of the v4 extension bundles?

  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[4.*, 5.0.0)"
  },

akshaykumars10 commented 1 year ago

Hi @cgillum I am getting the same error with the non-preview (GA) version of the v4 extension bundles. Here is the new instance id: testeternal1234

cgillum commented 1 year ago

Thanks @akshaykumars10 for confirming. A couple more questions to try and narrow down the problem further:

akshaykumars10 commented 1 year ago

Hi @cgillum

ChrisRomp commented 1 year ago

I'm seeing this too, even with basic sample code adapted from the docs. I still get a 202 Accepted response with the usual properties in the response body, such as statusQueryGetUri. Calling the statusQueryGetUri shows "runtimeStatus": "Failed".

(Edited to remove java sleep in favor of ctx.createTimer() with same result.)

package com.function;

import com.microsoft.azure.functions.annotation.*;
import com.microsoft.azure.functions.*;

import java.time.Duration;

import com.microsoft.durabletask.*;
import com.microsoft.durabletask.azurefunctions.DurableClientContext;
import com.microsoft.durabletask.azurefunctions.DurableClientInput;
import com.microsoft.durabletask.azurefunctions.DurableOrchestrationTrigger;

public class EternalOrchTest {
    @FunctionName("Trigger_Eternal_Orchestration")
    public HttpResponseMessage triggerEternalOrchestration(
            @HttpTrigger(name = "req", methods = {HttpMethod.GET, HttpMethod.POST}, authLevel = AuthorizationLevel.ANONYMOUS) HttpRequestMessage<?> req,
            @DurableClientInput(name = "durableContext") DurableClientContext durableContext) {

        String instanceID = "StaticID";
        DurableTaskClient client = durableContext.getClient();
        client.scheduleNewOrchestrationInstance("Periodic_Cleanup_Loop", null, instanceID);
        return durableContext.createCheckStatusResponse(req, instanceID);
    }

    @FunctionName("Periodic_Cleanup_Loop")
    public void periodicCleanupLoop(
            @DurableOrchestrationTrigger(name = "ctx") TaskOrchestrationContext ctx) {

        ctx.createTimer(Duration.ofHours(1)).await();
        ctx.continueAsNew(null);
    }
}

GET http://localhost:7071/api/Trigger_Eternal_Orchestration HTTP/1.1
User-Agent: vscode-restclient
accept-encoding: gzip, deflate

HTTP/1.1 202 Accepted
Connection: close
Content-Type: application/json; charset=utf-8
Date: Wed, 05 Apr 2023 21:34:52 GMT
Server: Kestrel
Location: http://localhost:7071/runtime/webhooks/durabletask/instances/StaticID?code={systemKey}
Transfer-Encoding: chunked

{
  "id": "StaticID",
  "purgeHistoryDeleteUri": "http://localhost:7071/runtime/webhooks/durabletask/instances/StaticID?code={systemKey}",
  "sendEventPostUri": "http://localhost:7071/runtime/webhooks/durabletask/instances/StaticID/raiseEvent/{eventName}?code={systemKey}",
  "statusQueryGetUri": "http://localhost:7071/runtime/webhooks/durabletask/instances/StaticID?code={systemKey}",
  "terminatePostUri": "http://localhost:7071/runtime/webhooks/durabletask/instances/StaticID/terminate?reason={text}&code={systemKey}"
}
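
For anyone who wants to check the status from code rather than a REST client, here is a minimal sketch that pulls `runtimeStatus` out of the JSON returned by the `statusQueryGetUri`. The class and method names (`StatusCheck`, `extractRuntimeStatus`, `fetchRuntimeStatus`) are hypothetical and not part of the Durable Task SDK, the regex is a shortcut rather than a real JSON parser, and the fetch helper assumes Java 11+ (`java.net.http`).

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Durable Task SDK.
final class StatusCheck {
    // Matches "runtimeStatus": "<value>" with optional whitespace around the colon.
    private static final Pattern RUNTIME_STATUS =
            Pattern.compile("\"runtimeStatus\"\\s*:\\s*\"([^\"]*)\"");

    private StatusCheck() {}

    // Returns the runtimeStatus value if the field is present in the JSON text.
    static Optional<String> extractRuntimeStatus(String json) {
        Matcher m = RUNTIME_STATUS.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Issues a GET against the status query URI and extracts runtimeStatus from the body.
    static Optional<String> fetchRuntimeStatus(String statusQueryGetUri)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(statusQueryGetUri)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return extractRuntimeStatus(response.body());
    }

    public static void main(String[] args) {
        // Hand-written sample body; only the runtimeStatus field is taken from the report above.
        String sample = "{\"runtimeStatus\": \"Failed\"}";
        System.out.println(extractRuntimeStatus(sample).orElse("unknown")); // prints "Failed"
    }
}
```

With the failure described here, a call such as `StatusCheck.fetchRuntimeStatus(statusQueryGetUri)` would be expected to yield `Failed` shortly after the orchestration is scheduled.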
Executing task: func host start 

Azure Functions Core Tools
Core Tools Version:       4.0.5095 Commit hash: N/A  (64-bit)
Function Runtime Version: 4.16.5.20396

[2023-04-05T21:34:44.184Z] Listening for transport dt_socket at address: 5005

Functions:

        Trigger_Eternal_Orchestration: [GET,POST] http://localhost:7071/api/Trigger_Eternal_Orchestration

        DoCleanup: activityTrigger

        Periodic_Cleanup_Loop: orchestrationTrigger

For detailed output, run func with --verbose flag.
[2023-04-05T21:34:45.204Z] Worker process started and initialized.
[2023-04-05T21:34:49.005Z] Host lock lease acquired by instance ID '0000000000000000000000006A2E027E'.
[2023-04-05T21:34:53.073Z] Executing 'Functions.Trigger_Eternal_Orchestration' (Reason='This function was programmatically called via the host APIs.', Id=6351548a-4fe3-42cf-b996-a0e7f4fc11a2)
[2023-04-05T21:34:53.770Z] Function "Trigger_Eternal_Orchestration" (Id: 6351548a-4fe3-42cf-b996-a0e7f4fc11a2) invoked by Java Worker
[2023-04-05T21:34:53.820Z] Executing 'Functions.Periodic_Cleanup_Loop' (Reason='(null)', Id=70a7b720-247b-43b1-b8fb-b0cfab96df57)
[2023-04-05T21:34:53.839Z] Executed 'Functions.Trigger_Eternal_Orchestration' (Succeeded, Id=6351548a-4fe3-42cf-b996-a0e7f4fc11a2, Duration=784ms)
[2023-04-05T21:34:53.867Z] Function "Periodic_Cleanup_Loop" (Id: 70a7b720-247b-43b1-b8fb-b0cfab96df57) invoked by Java Worker
[2023-04-05T21:34:53.884Z] Executed 'Functions.Periodic_Cleanup_Loop' (Failed, Id=70a7b720-247b-43b1-b8fb-b0cfab96df57, Duration=78ms)
[2023-04-05T21:34:53.884Z] System.Private.CoreLib: Exception while executing function: Functions.Periodic_Cleanup_Loop. Microsoft.Azure.WebJobs.Extensions.DurableTask: The function invocation resulted in a null response. This means that either the orchestrator function was implemented incorrectly, the Durable Task language SDK was implemented incorrectly, or that the destination language worker is not sending the function result back to the host.
[2023-04-05T21:34:53.887Z] StaticID: Function 'Periodic_Cleanup_Loop (Orchestrator)' failed with an error. Reason: Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.Periodic_Cleanup_Loop
[2023-04-05T21:34:53.887Z]  ---> System.InvalidOperationException: The function invocation resulted in a null response. This means that either the orchestrator function was implemented incorrectly, the Durable Task language SDK was implemented incorrectly, or that the destination language worker is not sending the function result back to the host.
[2023-04-05T21:34:53.887Z]    at Microsoft.Azure.WebJobs.Extensions.DurableTask.OutOfProcMiddleware.<>c__DisplayClass10_0.<<CallOrchestratorAsync>b__0>d.MoveNext() in D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\OutOfProcMiddleware.cs:line 127
[2023-04-05T21:34:53.887Z] --- End of stack trace from previous location ---
[2023-04-05T21:34:53.887Z]    at Microsoft.Azure.WebJobs.Host.Executors.TriggeredFunctionExecutor`1.<>c__DisplayClass7_0.<<TryExecuteAsync>b__0>d.MoveNext() in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\TriggeredFunctionExecutor.cs:line 50
[2023-04-05T21:34:53.888Z] --- End of stack trace from previous location ---
[2023-04-05T21:34:53.888Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeWithTimeoutAsync(IFunctionInvoker invoker, ParameterHelper parameterHelper, CancellationTokenSource timeoutTokenSource, CancellationTokenSource functionCancellationTokenSource, Boolean throwOnTimeout, TimeSpan timerInterval, IFunctionInstance instance) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 581
[2023-04-05T21:34:53.888Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, ILogger logger, CancellationTokenSource functionCancellationTokenSource) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 527
[2023-04-05T21:34:53.888Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 306
[2023-04-05T21:34:53.888Z]    --- End of inner exception stack trace ---
[2023-04-05T21:34:53.888Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 352
[2023-04-05T21:34:53.888Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.TryExecuteAsync(IFunctionInstance functionInstance, CancellationToken cancellationToken) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 108. IsReplay: False. State: Failed. HubName: JavaTestHub. AppName: . SlotName: . ExtensionVersion: 2.9.0. SequenceNumber: 4. TaskEventId: -1
cgillum commented 1 year ago

@kaibocai can you help find an owner for this item to help investigate? The problem seems to be that the orchestration middleware in the Java SDK is not returning a value back to the Durable Functions extension. That's basically what this error message means:

System.InvalidOperationException: The function invocation resulted in a null response. This means that either the orchestrator function was implemented incorrectly, the Durable Task language SDK was implemented incorrectly, or that the destination language worker is not sending the function result back to the host.

kaibocai commented 1 year ago

It's an edge case I forgot to consider when adding Durable Functions support to the Azure Functions Java worker. I have created a PR with the fix, https://github.com/Azure/azure-functions-java-worker/pull/711, and will release the Java worker as soon as possible. Thanks.

akshaykumars10 commented 1 year ago

Hi @kaibocai, how do I get this fix in my function apps? Do I need to specify the latest version of azure-functions-java-worker in pom.xml?

kaibocai commented 1 year ago

Hi @akshaykumars10, the Java worker is part of the Azure Functions platform that your functions run on; it's not a dependency that should be included in your function app's pom.xml. We are currently in the process of releasing Azure Functions V4.21.1, which contains the Java worker with the fix for this issue.

You can view the runtime version of your function app in the portal, for example:

(image: runtime version shown in the portal)

Once the runtime version is upgraded to 4.21.1, you shouldn't have this issue anymore. The V4.21.1 release should be completed next week, which means that sometime next week your runtime version will automatically be bumped to 4.21.1. I will keep you updated once it's released to all regions. In the meantime, you can also track the version from the portal, as in the picture above. Thanks.
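
If you would rather compare version strings in a script than by eye, here is a rough sketch. The `RuntimeVersion` class and its methods are hypothetical names for illustration, and it assumes the version is a dotted numeric string like the `4.16.5.20396` shown in the Core Tools output earlier in this thread; how you obtain that string is up to you.

```java
import java.util.Arrays;

// Hypothetical helper for comparing dotted runtime version strings.
final class RuntimeVersion {
    private RuntimeVersion() {}

    // Parses "4.16.5.20396" into [4, 16, 5, 20396]; a leading "V" or "v" is tolerated.
    static int[] parse(String version) {
        String v = version.trim();
        if (v.startsWith("V") || v.startsWith("v")) {
            v = v.substring(1);
        }
        return Arrays.stream(v.split("\\.")).mapToInt(Integer::parseInt).toArray();
    }

    // Returns true if 'actual' is greater than or equal to 'required',
    // treating missing trailing components as 0.
    static boolean isAtLeast(String actual, String required) {
        int[] a = parse(actual);
        int[] r = parse(required);
        int n = Math.max(a.length, r.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < r.length ? r[i] : 0;
            if (x != y) {
                return x > y;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAtLeast("4.16.5.20396", "4.21.1")); // false
        System.out.println(isAtLeast("4.21.1", "4.21.1"));       // true
    }
}
```

Comparing component by component avoids the trap of plain string comparison, where "4.9" would sort after "4.21".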