Azure / durabletask

Durable Task Framework allows users to write long running persistent workflows in C# using the async/await capabilities.
Apache License 2.0

Main orchestration gets stuck even when suborchestration is completed #1084

Closed ankitmicrosoft closed 1 month ago

ankitmicrosoft commented 1 month ago

Hi,

We are using the DTFx library for our orchestration.

We have a main orchestration that calls a sub-orchestration for rollback. The sub-orchestration uses timers to implement delays and timeouts. Please find the sub-orchestration code below.


Please find the `PollStepAsync` method below:

```csharp
public static async Task<LiftrActivityOutput> PollStepAsync(OrchestrationContext context, Type activityType, LiftrActivityInput activityInput, LiftrActivityOutput output, ILogger logger)
{
    CancellationTokenSource cancellationToken = null;

    if (!context.IsReplaying)
    {
        cancellationToken = new CancellationTokenSource();
    }

    try
    {
        int pollInterval = activityInput.PollIntervalInSeconds;
        int maxPollingDuration = activityInput.TimeoutInSeconds;

        var startTime = context.CurrentUtcDateTime;

        // Timers for timeout and initial delay
        var timeoutTask = context.CreateTimer(startTime.AddSeconds(maxPollingDuration), cancellationToken.Token);
        await context.CreateTimer(context.CurrentUtcDateTime.AddSeconds(activityInput.InitialDelayInSeconds), cancellationToken.Token);

        while (true)
        {
            // Check if the timeout task has completed.
            if (timeoutTask.IsCompleted)
            {
                logger.Information($"Timeout occured for step: {activityInput.Context.ToJsonString()}");
                throw new TaskFailedException("Timeout occured for async operation", new TimeoutException($"Timedout while tracking the state {activityInput.ActivityName}"));
            }

            // Call the GetStatusActivity to check the status
            output = await context.ScheduleTask<LiftrActivityOutput>(activityType, activityInput);
            activityInput.Context = output.Context;
            activityInput.PollIntervalInSeconds = output.PollIntervalInSeconds;

            // Check if the status indicates completion.
            if (output.ActivityStatus != ActivityStatus.InProgress.ToString() || output.ActivityStatus == ActivityStatus.Failed.ToString())
            {
                // Cancel the timeout task
                cancellationToken.Cancel();
                break;
            }

            // Wait before the next polling attempt.
            await context.CreateTimer(context.CurrentUtcDateTime.AddSeconds(activityInput.PollIntervalInSeconds), cancellationToken.Token);
        }

        cancellationToken.Cancel();
        return output;
    }
    catch (Exception ex)
    {
        cancellationToken.Cancel();
        logger.Error(ex, $"Error polling step: {activityInput.ActivityName}");
        throw;
    }
    finally
    {
        cancellationToken.Dispose();
    }
}
```

The main orchestrator calls the sub-orchestrator above. Even after the sub-orchestrator completes, the main orchestrator takes a long time, roughly the timeout interval, to run again.

Please help me understand how to make sure that the main orchestrator is picked up quickly after the sub-orchestration completes.

ankitmicrosoft commented 1 month ago

I was using the wrong overridden method.
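
The closing comment does not name the method. One plausible reading, offered here as an assumption rather than a confirmed diagnosis, is the `CreateTimer` call in `PollStepAsync`. As far as I know, DTFx's `OrchestrationContext` exposes `CreateTimer<T>(DateTime fireAt, T state)` and `CreateTimer<T>(DateTime fireAt, T state, CancellationToken cancelToken)`; please check this against the version you use. If so, a two-argument call that passes a token binds it to `state`, and cancelling the token source does not cancel the timer. The sketch below reproduces only the overload resolution, using a hypothetical `FakeContext` stand-in instead of the real library type.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in that mirrors the two CreateTimer overload shapes.
// It is NOT the real DurableTask.Core OrchestrationContext.
public class FakeContext
{
    // Records which overload the compiler selected for the last call.
    public string LastOverload { get; private set; } = "";

    // Two-argument shape: the second argument is opaque timer state.
    public Task<T> CreateTimer<T>(DateTime fireAt, T state)
    {
        LastOverload = "state";
        Console.WriteLine($"state overload, T = {typeof(T).Name}");
        return Task.FromResult(state);
    }

    // Three-argument shape: the third argument is the cancellation token.
    public Task<T> CreateTimer<T>(DateTime fireAt, T state, CancellationToken cancelToken)
    {
        LastOverload = "cancellable";
        Console.WriteLine($"cancellable overload, T = {typeof(T).Name}");
        return Task.FromResult(state);
    }
}

public static class Demo
{
    public static void Main()
    {
        var context = new FakeContext();
        using var cts = new CancellationTokenSource();
        DateTime fireAt = DateTime.UtcNow.AddSeconds(30);

        // Same call shape as in PollStepAsync: T is inferred as
        // CancellationToken, so the token is treated as state.
        context.CreateTimer(fireAt, cts.Token);

        // Supplying an explicit state value selects the overload
        // that actually receives the token for cancellation.
        context.CreateTimer(fireAt, true, cts.Token);
    }
}
```

Under that assumption, the timeout timer in the sub-orchestration would stay pending after `Cancel()` and would only clear once the full timeout elapsed, which would match the delay described above.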