microsoft / azure-container-apps

Roadmap and issues for Azure Container Apps
MIT License

Updating ACA app causes downtime #1166

Open duglin opened 1 month ago

duglin commented 1 month ago


Issue description

When I update my ACA app config to force it to pull the latest container image, I see this error message when hitting my app:

upstream connect error or disconnect/reset before headers. retried and the latest reset reason: remote connection failure, transport failure reason: delayed connect error: 111

I suspect this is because the network is briefly down while traffic switches from the old Revision to the new one. However, one of ACA's selling points should be "zero downtime" for an app during an upgrade.

Steps to reproduce

  1. Create an ACA app
  2. Modify some config to create a new Revision
  3. Hit the app repeatedly and quickly (e.g. keep refreshing the app's URL in the browser); you should eventually see this error
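
Step 3 can be scripted rather than done by hand in a browser. The following is a minimal sketch, not part of the original report: `fetch_status` and `probe` are hypothetical helper names, and it uses only the Python standard library. It assumes the error above reaches the client as an HTTP 503 from the ingress proxy, which is typical for that message but not confirmed in this thread.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or a short error label."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses, e.g. a 503 produced by the ingress proxy.
        return exc.code
    except (urllib.error.URLError, OSError) as exc:
        # DNS failures, refused connections, timeouts, and similar.
        return f"error: {type(exc).__name__}"


def probe(url, attempts=100, delay=0.1, fetch=fetch_status):
    """Hit url `attempts` times and count how often each outcome occurs."""
    outcomes = Counter()
    for _ in range(attempts):
        outcomes[fetch(url)] += 1
        time.sleep(delay)
    return outcomes
```

Start something like `probe("https://<your-app-fqdn>/", attempts=600)` just before creating the new Revision (the URL is a placeholder for your app's ingress address). Any count under a key other than `200` marks requests that failed during the switch.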

Expected behavior

calleo commented 1 month ago

How many replicas are you running?

Have you defined health checks that assert if the application is running? ACA might default to just checking that the container is running, which is not necessarily the same.
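
To illustrate the difference between "the container is running" and "the application is ready", here is a minimal sketch of an application-level health endpoint using Python's standard library. The `/healthz` path, the `ready` flag, and `make_server` are assumptions made for illustration, not anything ACA requires. The idea is that a readiness probe pointed at this path would only see a 200 once the app has finished its own startup work.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical flag: the app calls ready.set() once it can serve traffic.
ready = threading.Event()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            # 200 only when the app itself is ready, 503 otherwise.
            status = 200 if ready.is_set() else 503
        else:
            status = 404
        body = {200: b"ok", 503: b"not ready"}.get(status, b"not found")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the request log


def make_server(port=8080):
    """Build the server; the caller decides how to run serve_forever()."""
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
```

While the container process is up but `ready` is unset, the endpoint answers 503, so a probe that checks this path can tell the two states apart.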

eroomeoj94 commented 2 weeks ago

Hello @calleo,

I'm getting the same issue. I've got one replica running.

It takes about 10-15 minutes before the error message goes away and the container works as expected.

What health checks would you recommend?