microsoft / azure-container-apps

Roadmap and issues for Azure Container Apps
MIT License

Container App stuck in revision (can't create a new one and can't rollback to a previous revision) #1270

Closed vinicius-batista closed 3 months ago

vinicius-batista commented 3 months ago

Issue description

We already filed a support ticket with Azure yesterday, but so far we have received no response.

We released a new version on August 19. Since then, none of the new versions we have tried to release show up in the Portal. The same error is happening with 3 other container apps: cron-gb4-capps713818c5, cron-gb4-cappsffb462db, gb4-capps7c4a36de. We have tried to deploy a new revision directly via the Azure Portal and it still doesn't show up (not even as starting or failed, with no progress or status at all). We have also tried to activate an older inactive revision, but that failed too (although in this case we at least get an error: 500 Internal Server Error). When we try to create a new container app, it gets stuck in the creation process, no revision is created, and we get a timeout after 600 seconds. We haven't made any infrastructure changes in the past few days, only changes to our application code (so newer Docker images). Finally, when we try to deploy, we can see a system log saying that the revision is unhealthy because the readiness probe is unreachable. However, the revision number in that log is not the one being deployed but the one that is already deployed (and that one is working and running), so we think the deployment is referencing the old revision instead of the new one we are trying to deploy.

Screenshots
image

image

rlachic commented 3 months ago

@vinicius-batista I had the same issue, and it was because one of the variables was duplicated in one of the last revisions. This left the Container App totally stuck and tainted.

For me it didn't even show any kind of error; trying to activate a revision just failed... always. It's a shame how this is built. It is still a beta and quite unstable... and it doesn't even have proper logs.

So I would recommend you check the variables and secrets used, or just try to create one from scratch.
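
A check like this could be scripted rather than done by eye. The following is only a minimal sketch in Python, assuming the container app definition is available as JSON with environment variables under `properties.template.containers[].env[]` (the layout returned by `az containerapp show --output json`). The function name `find_duplicate_env_names` and the example data are hypothetical, and the comparison is an exact, case-sensitive match on the variable name.

```python
from collections import Counter


def find_duplicate_env_names(app: dict) -> dict:
    """Map each container name to the env var names it defines more than once.

    Assumes the layout properties.template.containers[].env[].name;
    containers without duplicates are left out of the result.
    """
    containers = (
        app.get("properties", {}).get("template", {}).get("containers") or []
    )
    duplicates = {}
    for container in containers:
        # Count how many times each variable name appears in this container.
        counts = Counter(entry.get("name") for entry in container.get("env") or [])
        repeated = sorted(name for name, n in counts.items() if name and n > 1)
        if repeated:
            duplicates[container.get("name", "<unnamed>")] = repeated
    return duplicates


# Hypothetical definition: DB_HOST is declared twice in the "api" container.
example_app = {
    "properties": {
        "template": {
            "containers": [
                {
                    "name": "api",
                    "env": [
                        {"name": "DB_HOST", "value": "db-1"},
                        {"name": "LOG_LEVEL", "value": "info"},
                        {"name": "DB_HOST", "secretRef": "db-host"},
                    ],
                },
                {"name": "worker", "env": [{"name": "LOG_LEVEL", "value": "debug"}]},
            ]
        }
    }
}

print(find_duplicate_env_names(example_app))
```

To run it against a real app, the saved JSON output of `az containerapp show` could be loaded with `json.load` and passed to the function. An empty result only means no duplicate names were found under the assumed layout, not that the app is otherwise healthy.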

vinicius-batista commented 3 months ago

Last night, after 3 days of debugging, we discovered the same thing... Our duplicate variable had been present in the configuration for almost 11 months, and only after Aug 21 did revisions start to fail.

We discovered it when we tried to deploy a new container by hand (all of our deploys are normally automated with IaC), and after setting the env variables we found out....

The big problem is that this kind of change is not announced, and there is not a single error message that indicates this error.