Azure / durabletask

Durable Task Framework allows users to write long running persistent workflows in C# using the async/await capabilities.
Apache License 2.0
1.47k stars 287 forks source link

Allow TaskHubClient's CreateOrchestrationInstanceAsync to schedule orchestrators #1107

Closed davidmrdavid closed 1 month ago

davidmrdavid commented 1 month ago

Related: https://github.com/Azure/azure-functions-durable-extension/pull/2805

In the PR above, we learned if a user wants to schedule an orchestrator to start a particular time, they need to use the CreateScheduledOrchestrationInstanceAsync instead of using the more commonly used CreateOrchestrationInstanceAsync API.

Allowing CreateOrchestrationInstanceAsync to also schedule orchestrators by accepting a new fireAt would simplify things downstream (in durabletask-dotnet, it would simplify the PR linked above^) so this PR adds that "missing" parameter. This does create some repetition (there are now two APIs that allow users to schedule orchestrators) but I think it makes the orchestrator-scheduling feature more discoverable while also helping simplify downstream code.

This PR also include 2 additional minor contributions: 1) It enables nullable analysis in TaskHubClient, and performs the minimal edits needed to get the analysis to pass. 2) It adds a simple TaskHubClient test validating that the class is able to schedule orchestrators to fire at a given time.

cgillum commented 1 month ago

If another method already exists that implements the scheduling behavior, why is this PR needed? In other words, why can’t we have the consuming SDK use the existing method? That wasn’t clear in this PR description and I’m hesitant to agree to adding a redundant method overload.

davidmrdavid commented 1 month ago

Yeah I didn't make this clear. I see two main reasons to make this change, but please do push back if needed:

First - The overloads for CreateOrchestrationInstanceAsync accept a strictly larger number of parameters than CreateScheduledOrchestrationInstanceAsync. Therefore, durabletask-dotnet can't simply call the CreateScheduledOrchestrationInstanceAsync without possibly dropping parameters like dedupStatuses and tags.

For evidence: if we do a CTRL+F for "Task CreateScheduledOrchestrationInstanceAsync" in TaskHubClient you'll see that there's only 3 matches:

(A.1) https://github.com/Azure/durabletask/blob/7a778c3bca8bb4e33e9f19e3ffae2cd003fe0cce/src/DurableTask.Core/TaskHubClient.cs#L83-L86

(A.2) https://github.com/Azure/durabletask/blob/7a778c3bca8bb4e33e9f19e3ffae2cd003fe0cce/src/DurableTask.Core/TaskHubClient.cs#L108-L113

(A.3) https://github.com/Azure/durabletask/blob/7a778c3bca8bb4e33e9f19e3ffae2cd003fe0cce/src/DurableTask.Core/TaskHubClient.cs#L135

Meanwhile, a CTRL+F for Task<OrchestrationInstance> CreateOrchestrationInstanceAsync yields 9 results, too many to list here. At it's largest parameter set, this method accepts up to 7 parameters, compared to the 5 parameters listed in (A.3), see here and notice the missing params relative to (A.3):

https://github.com/Azure/durabletask/blob/7a778c3bca8bb4e33e9f19e3ffae2cd003fe0cce/src/DurableTask.Core/TaskHubClient.cs#L344-L351

So, it seems to me that there's a parameter-parity gap between the two APIs that favors CreateOrchestrationInstanceAsync. That makes me think it would be better, maintenance-wise, for CreateOrchestrationInstanceAsync to fully subsume the behavior of CreateScheduledOrchestrationInstanceAsync by giving it that missing (fireAt parameter). The alternative is giving CreateScheduledOrchestrationInstanceAsync more overloads.

Second - CreateOrchestrationInstanceAsync can actually already schedule orchestrations, see the last parameter of this overload for evidence: https://github.com/Azure/durabletask/blob/7a778c3bca8bb4e33e9f19e3ffae2cd003fe0cce/src/DurableTask.Core/TaskHubClient.cs#L175

So it seems to me that CreateOrchestrationInstanceAsync was already intended to subsume CreateScheduledOrchestrationInstanceAsync, but that we're missing an overload that allows us to set fireAt in addition to all the other parameters (the 7 I mentioned above). This PR adds that missing overload.

Hope that makes sense, sorry for the wall of text. In short - I think the two APIs already have overlapping responsibilities, and I'm just trying to give the API with more parameters another overload to have it fully subsume the other one.

davidmrdavid commented 1 month ago

Ultimately though, I don't feel strongly. Just think this makes sense in the long run.

cgillum commented 1 month ago

@davidmrdavid got it, the two reasons you called out have eased my concerns. I'm good with these changes. Thanks for explaining!