Azure / azure-functions-durable-extension

Durable Task Framework extension for Azure Functions
MIT License
714 stars, 270 forks

OrchestrationTrigger: Run On Startup? #1536

Open bendursley opened 4 years ago

bendursley commented 4 years ago

I'd like to be able to set up some eternal orchestrations without first having to call the function, similar to how the TimerTrigger works.

I'm looking at options for some cleanup jobs that I'd like to run to completion and then wait x time before they run again (similar to the docs example for eternal orchestrations). That pattern seems to work quite well for what I'd like to do.

However, I'd like to avoid having to send a client request to the function first (HTTP request, Service Bus message, etc.). I'd just like it to start up straight away.

olitomlinson commented 4 years ago

@bendursley could you use a TimerTrigger to invoke your clean up task on a desired schedule?

bendursley commented 4 years ago

@olitomlinson Yeah, that's my current option (and what I think I'll implement). The eternal orchestration pattern is neater, but it needs a kick start, which is a shame.

olitomlinson commented 4 years ago

@bendursley Hmmmm! Are you looking at the Eternal Orchestration pattern because your clean-up task needs to maintain some form of context/state/history between invocations?

bendursley commented 4 years ago

@olitomlinson I was intending to run a number of jobs that clean up some data and could take some time to run, so letting a run finish and then waiting y minutes after completion seemed neater than just running every y minutes. I don't necessarily need the state to be carried over, though.

olitomlinson commented 4 years ago

@bendursley Ahh okay I get you! So you would create a timer just before calling continueAsNew and the timer would represent your delay until the next clean-up run?

I'll have a think to see if there are any other patterns that come to mind that might help.

bendursley commented 4 years ago

@olitomlinson Yup, that was the intention, the timer would be the delay. CreateTimer then ContinueAsNew 👍

Thanks!
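For readers following along, here is a minimal sketch of the CreateTimer then ContinueAsNew loop described above, written as a plain Python generator. The method names (`call_activity`, `create_timer`, `continue_as_new`, `current_utc_datetime`) mirror those on the Python SDK's orchestration context as far as I know. The activity name `CleanupActivity`, the 10 minute delay, and `FakeContext` are assumptions added only so the sketch runs without the SDK.

```python
from datetime import datetime, timedelta, timezone


def cleanup_orchestrator(context):
    """Generator-style orchestrator: run cleanup, wait, then restart itself."""
    # "CleanupActivity" is a hypothetical activity function name.
    yield context.call_activity("CleanupActivity", None)

    # The delay is measured from the end of the run, not from a fixed schedule.
    fire_at = context.current_utc_datetime + timedelta(minutes=10)
    yield context.create_timer(fire_at)

    # Restart with a fresh history and no carried-over state.
    context.continue_as_new(None)


class FakeContext:
    """Stand-in for the SDK's orchestration context so the sketch runs locally."""

    def __init__(self, now):
        self.current_utc_datetime = now
        self.calls = []

    def call_activity(self, name, input_=None):
        self.calls.append(("call_activity", name))
        return ("task", name)

    def create_timer(self, fire_at):
        self.calls.append(("create_timer", fire_at))
        return ("timer", fire_at)

    def continue_as_new(self, input_):
        self.calls.append(("continue_as_new", input_))


def run_once(now):
    """Drive the generator to completion and return the recorded calls."""
    ctx = FakeContext(now)
    for _ in cleanup_orchestrator(ctx):
        pass
    return ctx.calls
```

Note that something still has to start the first instance, which is exactly the gap this issue is about.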

cgillum commented 3 years ago

I think this sounds like an interesting feature. One question would be how to handle failures. Should the orchestration automatically restart itself if there is an unhandled exception?

bendursley commented 3 years ago

@cgillum I think it would make sense for it to restart itself in x time, yes. So long as the error was logged.

tmenier commented 3 years ago

Glad this is being discussed. I have the same need - fire off an "eternal" orchestration on startup. In my mind, I'm wondering if maybe there are actually 2 distinct ideas that could bring this (and other scenarios) to fruition:

  1. A "run once at startup" trigger for regular Azure Functions. It would basically work like a TimerTrigger with runOnStartup=true but no actual timer functionality. I have a case where I'd like to send off an alert when a deployment happens and this would work great for that. I understand this should be suggested in a different repo though.

  2. With Durable Functions specifically, sometimes trigger > regular function > orchestrator function just feels like more indirection than should be necessary. Even if it were just an abstraction that makes it appear that an orchestration function is using one of the standard triggers directly, it seems like it would simplify app code in a lot of cases.

sebastianburckhardt commented 3 years ago

One thing to be careful about is the meaning of "on startup" which is a bit fuzzy as the state of durable functions (i.e. the taskhub state) outlives startup and shutdown cycles. For example, would a scale-from-zero situation count as a startup?

Another interpretation is to specify that an orchestrator should be executed once in a fresh taskhub; essentially we could create the start message together with the taskhub.

cgillum commented 3 years ago

Indeed, "on startup" is a little fuzzy in Azure Functions since there is no concept of "startup" at the app level - rather, each individual function app instance has its own startup routine. I think the way to implement this would be to check, each time a worker instance starts up, whether the orchestration instance has already been started (and start it if not). There is a question too, however, about what to do if the orchestration instance is in a failed, completed, or terminated state.

I think the problem with creating a start message together with the task hub is that a startup orchestration could be added to the app sometime after the task hub is initially created. Task hub creation only happens the first time the app is ever started.
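The start-up check described above could be sketched roughly as follows. This is not how the extension implements anything: `Status`, `should_start`, `ensure_started`, and the default restart policy are all hypothetical names and choices. The `get_status` and `start_new` callables stand in for whatever client API the host would use. The sketch also ignores the race where two workers start at the same moment.

```python
from enum import Enum


class Status(Enum):
    NOT_FOUND = "NotFound"
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"
    TERMINATED = "Terminated"


# Hypothetical policy: which end states should trigger a fresh start.
DEFAULT_RESTART_ON = frozenset({Status.FAILED, Status.TERMINATED})


def should_start(status, restart_on=DEFAULT_RESTART_ON):
    """Decide whether a worker that just started should (re)start the instance."""
    if status is Status.NOT_FOUND:
        return True  # never started: the "auto-start" case
    if status in (Status.PENDING, Status.RUNNING):
        return False  # already active: keep it a singleton
    return status in restart_on  # ended: defer to the configured policy


def ensure_started(instance_id, get_status, start_new, restart_on=DEFAULT_RESTART_ON):
    """Run on every worker start-up; get_status and start_new are injected callables."""
    if should_start(get_status(instance_id), restart_on):
        start_new(instance_id)
        return True
    return False
```

Making the restart policy a parameter leaves the open question (what to do for failed, completed, or terminated instances) to the app author rather than baking in one answer.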

olitomlinson commented 3 years ago

@Bartolomeus-649

I think the vast majority of what you've described can be achieved by composing what already exists today in the Durable Functions and Entities API.

stateless

By shaping your workload into the Eternal Orchestration pattern, you can take advantage of ContinueAsNew(null) - no state is maintained between executions.

singleton

The singleton pattern already exists in Durable Functions. However, it is a bit boilerplate-y as you have to check for statuses etc., but recent fixes have addressed some of the edge cases where the singleton guarantee was broken.

Always active

Once activated, an Eternal Orchestration can loop indefinitely via ContinueAsNew. You can programmatically pause/resume the Function using DurableTimers on whatever schedule you can represent in code.

Exception Handling

You are free (and encouraged) to handle your exceptions within the Orchestration logic, and deploy whatever compensatory actions are necessary. Coupling this with DurableTimers, you are able to code all kinds of back-off strategies.
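As one illustration of a back-off strategy, here is a small sketch of capped exponential back-off. The function names and the default base and cap values are assumptions, not part of any Durable Functions API. The returned timestamp is what you would pass to a durable timer after catching a failure.

```python
from datetime import datetime, timedelta, timezone


def backoff_delay(attempt, base_seconds=30, cap_seconds=3600):
    """Exponential back-off: base * 2**attempt, capped at cap_seconds."""
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    return timedelta(seconds=min(cap_seconds, base_seconds * (2 ** attempt)))


def next_fire_at(now, attempt, **kwargs):
    """Timestamp at which a durable timer would be set after a failed attempt."""
    return now + backoff_delay(attempt, **kwargs)
```

Inside an orchestrator, `now` should come from the orchestration context's replay-safe clock rather than the system clock, so that replays stay deterministic.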

Exclusive Resource Dependency Definition

You could enforce this through some logical ownership. I have done something similar using Durable Entities, where each entity acts as a 'Proxy' to a well-known resource (regardless of whether the resource is an ARM resource or an actual API). The proxy owner is determined by First Come First Served semantics, where a potential owner calls a Lock(string owner) method on the proxy Entity, and subsequently releases its lock via an Unlock(string owner) method.
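The Lock/Unlock semantics can be modelled in a few lines. The sketch below is an in-memory stand-in, not a Durable Entity. `ResourceProxy` is a hypothetical name, and treating a repeated lock call by the current owner as a success is my assumption. In a real entity the single-threaded execution of operations is what would make the first-come-first-served check safe.

```python
class ResourceProxy:
    """In-memory model of a proxy entity guarding one well-known resource."""

    def __init__(self):
        self.owner = None  # None means the resource is free

    def lock(self, owner):
        """First come, first served: succeed if free or already held by owner."""
        if self.owner is None or self.owner == owner:
            self.owner = owner
            return True
        return False

    def unlock(self, owner):
        """Only the current owner may release the lock."""
        if self.owner == owner:
            self.owner = None
            return True
        return False
```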

I'm not an expert, but doing this at the actual ARM level would be a huge challenge I think. You might be able to get somewhere near this using a combination of Managed Identities at the Function App level, but I imagine this wouldn't be granular enough for your use case.

A combination of Orchestration Singletons and appropriate ID conventions for the Orchestration Instance ID would give you that 'single concurrency' semantic that you desire. But let's not forget that Durable Entities and Orchestrations have exclusive access (provided by single concurrency guarantees) to state, so you're already 95% there by choosing the DF framework.

Job Progression

This can be covered by updating the customStatus property of an Orchestration and is a common use-case.
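A rough sketch of reporting progress this way, again as a plain Python generator. `set_custom_status` and `call_activity` are the method names I believe the Python SDK's orchestration context uses. The job names, the shape of the status payload, and `RecordingContext` are assumptions added so the sketch runs on its own.

```python
def cleanup_with_progress(context, jobs):
    """Run each job in order and publish progress after it finishes."""
    total = len(jobs)
    for done, job in enumerate(jobs, start=1):
        # Each job name is a hypothetical activity function.
        yield context.call_activity(job, None)
        context.set_custom_status({"completed": done, "total": total, "last": job})


class RecordingContext:
    """Stand-in for the SDK context; records every status it is given."""

    def __init__(self):
        self.statuses = []

    def call_activity(self, name, input_=None):
        return ("task", name)

    def set_custom_status(self, status):
        self.statuses.append(status)


def run_jobs(jobs):
    """Drive the generator to completion and return the published statuses."""
    ctx = RecordingContext()
    for _ in cleanup_with_progress(ctx, jobs):
        pass
    return ctx.statuses
```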

Valid execution time window

Once again, using DurableTimers and Durable Entity Reminders you can execute to a compiled or dynamic schedule - my team did exactly this for a Durable Functions workflow engine that needed to obey timing rules to match our customers' working hours.
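To make the idea concrete, here is a sketch of the kind of helper that decides when work may next run. It is not the engine mentioned above. The Monday to Friday, 09:00 to 17:00 window and the name `next_allowed_time` are assumptions. The result is the timestamp you would hand to a durable timer before doing the work.

```python
from datetime import datetime, time, timedelta


def next_allowed_time(now, start=time(9, 0), end=time(17, 0)):
    """Return now if inside [start, end) on a weekday, else the next window opening."""
    candidate = now
    # Scan forward at most 8 days; a weekday always occurs within that span.
    for _ in range(8):
        is_weekday = candidate.weekday() < 5  # Monday=0 .. Friday=4
        if is_weekday:
            opening = datetime.combine(candidate.date(), start, tzinfo=candidate.tzinfo)
            closing = datetime.combine(candidate.date(), end, tzinfo=candidate.tzinfo)
            if candidate < opening:
                return opening
            if candidate < closing:
                return candidate
        # Past closing or a weekend: move to midnight of the next day.
        candidate = datetime.combine(
            candidate.date() + timedelta(days=1), time(0, 0), tzinfo=candidate.tzinfo
        )
    raise RuntimeError("no allowed window found")
```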

AI-based anomaly detection

Application Insights (and also Azure Stream Analytics) are the places where you would develop these kinds of algorithmic/AI insights from a PaaS perspective. I get the need, but Azure Functions and its runtime simply have no awareness of what "good looks like" for your use-cases.

I would advise getting to a point where your code raises the necessary metrics and health 'signals', which can be tracked and monitored in an appropriate product such as App Insights or ASA or any of the other first-class Azure ML technologies.

Recurrence Rule

Similar to the execution time window, the API surface is already there to do this.

What you need is a helper library that can do the heavy lifting of packing and unpacking a complex schedule, and then integrate this with the Timer and Reminders API of Durable Functions. FYI, we rolled our own scheduling instruction set as we found the Recurrence Rule set to be way too complex, and we could get by with a minimal set of instructions. Of course, your mileage may vary here.

However, I do agree that Durable Entities supporting a Recurrence Rule instruction set for signals would be a neat addition.

olitomlinson commented 3 years ago

@Bartolomeus-649

I’m just saying there are robust APIs and common patterns/concepts in Durable Functions that can achieve many of the things you’re asking for, so it may be worth taking a look at those if you haven’t already done so.

If those APIs are not useful or fall short for your use-cases, then that would be really great feedback!

olitomlinson commented 3 years ago

@Bartolomeus-649

Of course, and as Chris and Seb have both mentioned, there are challenges to this “Start-up” trigger notion.

Until the start-up trigger is available, I’m providing alternatives that may help you.

I’m also providing guidance on how you can accomplish some of those things with what exists today in the API.

If you do not wish to take that guidance that’s absolutely fine.

cgillum commented 3 years ago

The concept of a background job is interesting, but I feel like the requirements mentioned are too specific. As @olitomlinson suggested, I think it makes more sense for this to be implemented as a layer on top of Durable Functions rather than a built-in feature. I think it's better for the Durable Functions team to instead focus on the primitives needed to make building something like background jobs more manageable. More general purpose features like "auto-start" and "auto-restart" seem like good primitives for us to expose, since implementing those yourselves is pretty non-trivial.