microsoft / azure-container-apps

Roadmap and issues for Azure Container Apps

Avoiding parallelism in scheduled Azure Container App Jobs #1271

Open ChristianMorup opened 2 weeks ago

ChristianMorup commented 2 weeks ago

Issue description

We have a scheduled Azure Container App Job that runs every minute. However, it may sometimes take significantly longer to complete. Initially, we expected that setting the following configuration in Bicep:

scheduleTriggerConfig: {
  cronExpression: '0/1 * * * *'
  parallelism: 1
  replicaCompletionCount: 1
}

would ensure that only one replica would be running at any given time. The documentation is somewhat ambiguous. The Bicep documentation states that the parallelism parameter defines the "Number of parallel replicas of a job that can run at a given time." In contrast, the Azure Container App Job documentation states that parallelism is "The number of replicas to run per execution. For most jobs, set the value to 1." Our experience aligns with the latter interpretation.

Steps to reproduce

  1. Create a scheduled Container App Job whose execution takes longer than the interval between cron triggers
  2. Set the parallelism parameter in Bicep to 1

Expected behavior

We expected this to set a maximum number of job replicas that could run at any given time (based largely on the Bicep documentation). I acknowledge that this depends on which documentation you reference. If this behavior is expected, it would be helpful to have a way to ensure that only one replica runs at any given time.

Actual behavior

The parallelism flag sets the maximum number of replicas per execution, not the maximum number of replicas running at any given time.
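
To illustrate the difference, here is a toy model of the observed behavior (not the platform's actual scheduler). It assumes every cron tick starts a new execution regardless of what is still running, and that each execution starts `parallelism` replicas. The function name and the 150-second duration are made up for the example.

```python
def max_overlapping_replicas(interval_s, duration_s, parallelism=1, horizon_s=3600):
    """Peak number of replicas alive at once if every tick starts a new execution."""
    starts = range(0, horizon_s, interval_s)
    peak = 0
    # The peak can only change at the moment a new execution starts.
    for t in starts:
        running = sum(1 for s in starts if s <= t < s + duration_s)
        peak = max(peak, running * parallelism)
    return peak


# A job triggered every 60 s that takes 150 s to finish, with parallelism: 1
print(max_overlapping_replicas(60, 150))  # prints 3
```

Under these assumptions, parallelism: 1 still leaves three replicas running at the same time once the job takes two and a half minutes.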

anthonychu commented 2 weeks ago

@ChristianMorup You're right. parallelism is how many replicas are started per job execution. It's a really advanced setting. It should almost always be set to 1.

Container Apps' scheduled jobs are powered by Kubernetes CronJobs. We currently don't expose the concurrencyPolicy setting, which would allow you to control whether new executions are started when there's an execution still running. We have it on our backlog to add support for this but currently don't have an ETA. We'll see if we can add it to an upcoming API version in a couple of months.

Without that, you'll need to create your own mechanism to ensure only 1 execution runs at a time. One way to do this is to leverage blob leases.
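
A minimal sketch of the blob lease approach in Python, under a few assumptions: the job code gets a `BlobClient` from the `azure-storage-blob` package pointing at a lock blob that already exists, and a failed lease attempt surfaces as an exception carrying `status_code == 409`. The helper name `run_exclusively` is hypothetical and not part of any SDK.

```python
def run_exclusively(blob_client, work, lease_duration=-1):
    """Run work() only if a lease on the lock blob can be acquired.

    Returns True if work() ran, False if another execution holds the lease.
    lease_duration=-1 requests an infinite lease, so a job that runs longer
    than the cron interval keeps the lock until it releases it.
    """
    try:
        lease = blob_client.acquire_lease(lease_duration=lease_duration)
    except Exception as exc:
        # Assumption: a lease conflict is reported as HTTP 409.
        if getattr(exc, "status_code", None) == 409:
            return False  # another execution is still running, skip this one
        raise
    try:
        work()
    finally:
        lease.release()  # always hand the lock back, even if work() fails
    return True


def main():
    # Example wiring; needs `pip install azure-storage-blob`.
    # The environment variable, container name and blob name are hypothetical.
    import os
    from azure.storage.blob import BlobClient

    lock_blob = BlobClient.from_connection_string(
        os.environ["STORAGE_CONNECTION_STRING"],
        container_name="locks",
        blob_name="my-job.lock",
    )
    if not run_exclusively(lock_blob, lambda: print("doing the work")):
        print("previous execution still running, exiting")
```

One trade-off to be aware of: with an infinite lease, a replica that is killed before the `finally` block runs leaves the blob leased until someone breaks the lease. A finite lease (15 to 60 seconds) that the job renews periodically avoids that, at the cost of more code.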

ChristianMorup commented 2 weeks ago

@anthonychu Thank you for the clarification! I agree that having a concurrencyPolicy setting would be incredibly useful, especially for scheduled jobs where ensuring only one execution at a time is often critical. In our use case - and likely for many others - the current parallelism flag isn't as relevant because cron jobs are typically not designed to run multiple instances simultaneously.

Implementing a concurrencyPolicy setting would significantly improve our ability to manage job execution without having to build custom mechanisms like blob leases, thus preventing conflicts caused by overlapping runs.

We look forward to seeing this feature in an upcoming API version, and it would be greatly appreciated if it could be prioritized. Thank you for considering this improvement!

calleo commented 1 day ago

Exposing concurrencyPolicy would be a great addition!