microsoft / azure-container-apps

Roadmap and issues for Azure Container Apps
MIT License

Feature Request: Disable/Reschedule maintenance upgrades which interrupt running Container Apps #1144

Open gdellolio-mariner opened 2 months ago

gdellolio-mariner commented 2 months ago

Is your feature request related to a problem? Please describe.

We currently have a single Container App running in Azure. We noticed that the application would randomly be gracefully stopped and restarted. This application is supposed to run 24/7 without interruption. Due to the nature of the application, we can only have one instance (replica) of this container running.

After more digging, we found that Container Apps occasionally performs behind-the-scenes updates, without notifying users, that automatically restart running containers. For those running a single container, it gracefully stops the container and brings it up on a different node, causing a brief period of downtime. This is problematic when you need your container to run indefinitely.

What makes matters worse: no one knows when these updates occur, so it's impossible to plan around them. In addition to not knowing when the updates will happen (and when your containers will automatically restart on you), these updates seem to happen at any time of day, not just "outside business hours". This makes Container Apps almost useless for critical applications that rely on 100% uptime.

Describe the solution you'd like.

A few solutions:

  1. Users should be able to schedule a maintenance window that works for them. This will specify when it's ok to do updates in the background.
  2. Background updates should be scheduled during "outside business hours" for the region in which it is running.
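
To make option 1 concrete, here is a minimal sketch of how a user-defined maintenance window could be evaluated before the platform starts an upgrade. Everything in it is hypothetical: `MaintenanceWindow`, its fields, and `allows` are illustrative names, not an existing Azure Container Apps setting or API. It uses a fixed UTC offset instead of a named time zone, so it ignores daylight saving time.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone, tzinfo


@dataclass(frozen=True)
class MaintenanceWindow:
    """Hypothetical user-defined window; NOT an existing Azure setting."""
    start: time           # local start time of the window
    end: time             # local end time (exclusive); may be past midnight
    tz: tzinfo            # fixed-offset zone (assumption: DST is ignored)
    weekdays: frozenset   # days on which the window opens, 0 = Monday

    def allows(self, moment: datetime) -> bool:
        """Return True if an upgrade may run at the given aware datetime."""
        local = moment.astimezone(self.tz)
        now = local.time()
        if self.start <= self.end:
            # Window does not cross midnight.
            return local.weekday() in self.weekdays and self.start <= now < self.end
        if now >= self.start:
            # Evening part of a window that crosses midnight.
            return local.weekday() in self.weekdays
        if now < self.end:
            # Early-morning part: the window opened on the previous day.
            return (local.weekday() - 1) % 7 in self.weekdays
        return False


# Example: upgrades allowed from Saturday 22:00 to Sunday 04:00, UTC-5.
window = MaintenanceWindow(
    start=time(22, 0),
    end=time(4, 0),
    tz=timezone(timedelta(hours=-5)),
    weekdays=frozenset({5}),
)
```

With a setting like this, the platform would defer any restart of a single-replica app until `window.allows(now)` returns True.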

Describe alternatives you've considered.

I've considered using a different solution for hosting my container app.

  1. AKS is a good workaround but would add way too much overhead for a simple single container application.
  2. ACI is another option, but apparently this same exact upgrade process happens there too.
  3. Using a VM - This strips away all the benefits of using Azure to host applications.

Additional context.

I understand the importance of regular upgrades to ensure stability and security. Users should, at the very least, be able to select the maintenance window in which these updates occur. If that's not a possibility, updates should be done on off hours based on the region.

sczd commented 2 months ago

I agree with @gdellolio-mariner: we should be able to pick a maintenance window. It's a prerequisite for using this service in production. Also, the underlying infrastructure upgrades should move replicas and revisions one by one across each availability zone to eliminate downtime. I can't find any confirmation of this process in the documentation.

dsczltch commented 1 week ago

We agree with @gdellolio-mariner.