Closed: sonofhammer closed this issue 1 week ago
We are facing this issue too and the production systems are impacted. Looks like supporting image for container app jobs is rolled back from version 1.39.6 to version 1.0.8. This is a breaking change for us.
We are triggered from Azure Storage Queues and are facing a similar issue. As SG stated, the only difference with 'mcr.microsoft.com/k8se/msi-transition:1.0.8-m' is that we do not see "Created container
Curious why the team would have rolled back from msi-transition:1.39.6-m to msi-transition:1.0.8-m.
To mitigate this production issue, we have switched the ACA jobs to manual triggers and created a temporary queue-watcher app to trigger them.
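For anyone needing the same workaround, a temporary queue watcher can be sketched as a simple polling loop. This is a minimal illustration, not the poster's actual app: `get_queue_length` and `start_job` are hypothetical callables you would wire up yourself (for example, wrapping the azure-storage-queue SDK's approximate message count and shelling out to `az containerapp job start`).

```python
import time

def watch_queue(get_queue_length, start_job, poll_interval=30, max_polls=None):
    """Poll a queue and manually start the job whenever messages are waiting.

    get_queue_length: callable returning the approximate message count
        (e.g. wrapping the Storage Queue SDK's queue properties call).
    start_job: callable that starts one job execution
        (e.g. invoking `az containerapp job start`).
    Returns the number of executions started.
    """
    polls = 0
    started = 0
    while max_polls is None or polls < max_polls:
        if get_queue_length() > 0:
            start_job()          # fire one manual job execution
            started += 1
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_interval)
    return started

# Example with stubbed dependencies: the queue reports messages
# on the first two polls, then is empty.
lengths = iter([3, 1, 0])
executions = []
watch_queue(lambda: next(lengths), lambda: executions.append("run"),
            poll_interval=0, max_polls=3)
print(len(executions))  # 2
```

In a real deployment you would also delete or lease the messages you hand off, and cap how many executions can be started per poll; this sketch only shows the trigger loop.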
@sonofhammer @sg-vintri @ruvintri For msi-transition, the tag change is expected. Here is the detail: previously this sidecar always had the same tag as other system components (e.g. 1.39.6), but this sidecar changes much less frequently than the other system components, so we decided to give it a separate tag, just like other sidecars such as the envoy-sc sidecar, which have their own tags.
From a code base view, even though the tag changed, the code base is exactly the same.
Did you see an issue before? Please send your environment information, along with the exact timestamp when you started seeing the issue, to acasupport at microsoft dot com so we can check the logs to see what could be wrong.
@chinadragon0515 email sent
We confirm there is a regression for jobs with event triggers and managed identity running on a consumption v2 environment. We are working on a fix.
This is the RCA: the issue was caused by a KEDA version upgrade that introduced a behavior change in the latest deployment. All impacted environments have been mitigated by rolling back to the old KEDA version.
If you still see the issue, email us the timestamp of the issue and your environment information and we will check. Thanks.
We've confirmed that it has been fixed for us.
Thank you.
Yeah, late last week everything broke... today it's all working.
Is there any way to "subscribe" to these changes? Breaking my jobs is one thing... not being informed of the platform changes that broke me is another.
adding a link to the regression announcement: https://github.com/microsoft/azure-container-apps/issues/1211
Issue description
A container app job with an event trigger worked on Tuesday the 18th, but is failing since Thursday the 20th.
Manual "Run Now" button in the portal still executes the container app job without issue.
Here's what happens:
The event triggers successfully on the Service Bus queue and creates a pod:
Successfully created pod for Job Execution '<job name redacted>'
followed by
Replica '<replica name redacted>' for Job Execution 'job name redacted' has been scheduled to run on a node.
But then it immediately goes into
Pod - <replica name redacted> has exited with status Failed
with a reason of
PodDeletion
We're not even getting to the image pull log line. It just fails immediately on start.
I do not know if these details matter, but here they are:
Region: eastus
Container registry credentials: admin credentials
Container registry location: in a separate subscription from the container app environment
Expected behavior: the container app job is triggered, the image is pulled, the container starts, and it runs to completion.
Actual behavior: the pod gets deleted before the image even gets a chance to be pulled.