Open dasbdavis opened 4 years ago
@dasbdavis, do you have an app setting WEBSITE_RUN_FROM_PACKAGE or similar in your function app (Run From Package)? If so, would you mind removing it and then trying the CI/CD pipeline again?
I think what's possibly happening is that your function app is typically deployed using Run From Package, which means the site assumes the content is deployed at /home/data/SitePackages. But I don't think the CI/CD webhook deployment puts the artifact there, so your site may end up using a stale deployment.
If the above doesn't work, would you mind sharing your function app name so I can look to see if anything seems fishy?
Sorry for the delay-- I didn't see that you'd responded. I'll try this as soon as I can and let you know.
I've had a similar issue with Python dynamically loading old gRPC protobuf files that are no longer compatible with new code.
With WEBSITE_RUN_FROM_PACKAGE=0, the deployment performs an "in-place sync". Unfortunately this has a tendency to leave old files around, especially locked files or run-time generated files (e.g. *.pyc). In our case the old files are slurped at runtime, causing version mismatch errors on method dispatch.
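One way to check whether an in-place sync left old files behind is to compare the deployed directory against the package that was just released. The following is a minimal sketch, assuming you have both the release zip and a copy of the deployed folder available locally; `find_stale_files` and its parameters are hypothetical names used only for illustration.

```python
import os
import zipfile


def find_stale_files(package_path, deployed_root):
    """Return deployed files that are not present in the deployment package."""
    with zipfile.ZipFile(package_path) as package:
        # Directory entries end with "/"; keep only regular files.
        packaged = {
            os.path.normpath(name)
            for name in package.namelist()
            if not name.endswith("/")
        }

    stale = []
    for folder, _subdirs, files in os.walk(deployed_root):
        for filename in files:
            full_path = os.path.join(folder, filename)
            relative = os.path.normpath(os.path.relpath(full_path, deployed_root))
            if relative not in packaged:
                stale.append(relative)
    return sorted(stale)
```

Anything this reports (leftover generated modules, *.pyc files, and so on) is a candidate for the version mismatches described above. Files the app legitimately creates at runtime will also show up, so the list needs a manual review.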
WEBSITE_RUN_FROM_PACKAGE=1 resolves the issue but means we can't set or rotate the host/functions keys programmatically.
I believe an A/B deployment into a separate directory, rather than in-place sync, would solve this issue.
@ankitkumarr Any insight into whether this was ever fixed? We're having the same issues running under Consumption Plan and Premium Plan. Our CI/CD is pushing the artifact to Azure (WEBSITE_RUN_FROM_PACKAGE set to 1) and after that we're doing an Azure Functions restart as suggested by MSFT support, but that didn't help either. Would deploying to a slot and doing a hot swap help? Please advise. If you need more data I'll be happy to provide it.
@lopezbertoni, would you mind elaborating on your scenario? How are you pushing the artifact to Azure? What's the publishing process, and what issue are you seeing exactly?
@ankitkumarr
Push to an Azure Function using Azure DevOps. Steps in the build pipeline are:
Release the artifact with the following steps in Azure DevOps
The issue is that we deploy the Azure Function, check the logs in Application Insights, and see that log statements that were completely removed from the code are still being executed.
We then stopped/started the Azure Function from the portal and the issue persisted. Eventually we stopped the Azure Function for around 5 mins and then started it again, and the deployed code started executing fine.
This was deployed to a Premium Service Plan.
Please advise on how to fix this or if there's a workaround other than manually stopping/starting each processor every time we deploy.
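The stop, wait, start workaround described above can be scripted so it doesn't have to be done by hand in the portal. This is only a sketch, assuming the Azure CLI (`az`) is installed, on PATH, and already logged in; the function names, the 300 second default, and the app and resource group names in the usage note are illustrative, not something the thread confirms as a fix.

```python
import subprocess
import time


def build_cycle_commands(app_name, resource_group):
    """Return the Azure CLI stop and start commands for one function app."""
    target = ["--name", app_name, "--resource-group", resource_group]
    stop = ["az", "functionapp", "stop"] + target
    start = ["az", "functionapp", "start"] + target
    return stop, start


def cycle_function_app(app_name, resource_group, wait_seconds=300,
                       run=subprocess.check_call, sleep=time.sleep):
    """Stop the app, wait, then start it again."""
    stop, start = build_cycle_commands(app_name, resource_group)
    run(stop)
    # Roughly the 5 minute pause reported in this thread; shorter waits
    # were reported not to help.
    sleep(wait_seconds)
    run(start)
```

Calling `cycle_function_app("my-processor-qa", "my-resource-group")` after each release would mimic the manual steps; `run` and `sleep` are injectable so the sequence can be dry-run without touching Azure.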
@lopezbertoni, thanks for all the info. A couple more questions that'd help me narrow down the cause --
@ankitkumarr
{
"version": "1.0.0.1349",
"commitHash": "f9b666fea571120eb9c09732519acebe9e9b0deb",
"versionDate": "2020.8.5.1",
"branchName": "staging"
}
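If the app can report a payload like the one above (the thread does not say where this JSON comes from, so treat that as an assumption), a release pipeline could compare its `commitHash` with the commit it just deployed and fail loudly when they differ. A minimal sketch, with `running_expected_build` as a hypothetical helper name:

```python
import json


def running_expected_build(version_payload, expected_commit):
    """Return True when the payload's commitHash matches the released commit."""
    info = json.loads(version_payload)
    # A missing commitHash counts as a mismatch rather than raising.
    return info.get("commitHash", "").lower() == expected_commit.lower()
```

A check like this only detects stale code after a deployment; it does not prevent it.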
This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.
@ankitkumarr Any update on this? I experienced the same problem today. Deploying from DevOps with WEBSITE_RUN_FROM_PACKAGE=1, the function seems to be running old code after a successful deployment. There was no recent change in our deployment scripts and everything seemed to be working fine until today, although maybe the problem was there before, just unnoticed.
I believe my issues were due to this Kudu bug: https://github.com/projectkudu/kudu/issues/2972.
We've worked around the issues by moving to container functions. Previously I saw code that triggered "impossible" exceptions, i.e. exceptions on lines of code that didn't exist in that release.
Today we encountered the same problem. No changes came through after redeployments. I restarted the function app and such, but it had no effect (I didn't wait very long, as suggested by @lopezbertoni). We use Azure DevOps, and in the release task there we had our "Deployment method" on "Auto-detect". It worked perfectly fine before, but now that I changed it explicitly to "Zip Deploy", our code change came through. I'm not entirely sure whether this is a fix for the problem or just a coincidence, but I thought I'd share.
@SeppeDev Just to follow up / help. When we did a quick restart it didn't work. When we did a quick stop/start it didn't work. It picked up the new code once we stopped, waited for about 5 mins and started again.
@lopezbertoni, ok thanks. We didn't wait for 5 minutes after stopping it, just a quick restart and a quick stop and start, so what fixed it for us is probably the change in the Release in DevOps. Thanks.
@ankitkumarr This happened again with our Production deployments from last night. Yay for no deploy Fridays 😀. We deployed 3 times and it didn't update. Eventually they did after several restarts. All of these processors were deployed to a premium plan.
@ankitkumarr Any update on this? Any workaround at least? We're running production with 10+ Functions and every deploy we need to stop, wait for 5 mins and start the functions to ensure the latest code is running until we know this is reliable. Would slot deployment help?
@lopezbertoni, yes apologies for the delay. I will take a look at this as soon as I can.
We are seeing the same issue. We deployed from Azure DevOps with the built-in pipeline task. The function was stale after a successful deployment.
We use an EP1 service plan.
As a workaround, a manual stop > start busts the old version.
We experience this issue inconsistently. The deployment is immediately effective most of the time, but there is always a chance the old version will persist. Please keep us updated on the progress. Thank you!
p.s. this is our deployment log
2020-09-16T18:45:21.1920203Z ##[section]Starting: Deploy to UAT slot
2020-09-16T18:45:21.2070301Z ==============================================================================
2020-09-16T18:45:21.2070986Z Task : Azure Functions
2020-09-16T18:45:21.2071377Z Description : Update a function app with .NET, Python, JavaScript, PowerShell, Java based web applications
2020-09-16T18:45:21.2071789Z Version : 1.163.7
2020-09-16T18:45:21.2072038Z Author : Microsoft Corporation
2020-09-16T18:45:21.2072484Z Help : https://aka.ms/azurefunctiontroubleshooting
2020-09-16T18:45:21.2072859Z ==============================================================================
2020-09-16T18:45:23.4939394Z Got service connection details for Azure App Service:'*******************'
2020-09-16T18:45:50.5476983Z Trying to update App Service Application settings. Data: {"WEBSITE_RUN_FROM_PACKAGE":"1"}
2020-09-16T18:45:50.5480950Z Deleting App Service Application settings. Data: ["WEBSITE_RUN_FROM_ZIP"]
2020-09-16T18:45:50.7841948Z App Service Application settings are already present.
2020-09-16T18:45:55.8987426Z Package deployment using ZIP Deploy initiated.
2020-09-16T18:46:07.4602934Z Successfully deployed web package to App Service.
2020-09-16T18:46:07.4608623Z NOTE: Run From Package makes wwwroot read-only, so you will receive an error when writing files to this directory.
2020-09-16T18:46:11.5460159Z Successfully added release annotation to the Application Insight : ************
2020-09-16T18:46:11.7860628Z App Service Application URL: http://**********************.azurewebsites.net
2020-09-16T18:46:11.8404113Z ##[section]Finishing: Deploy to UAT slot
I apologize! This has been slipping down my priority list. I am taking a look at this now, and will have an update this week. I know @lopezbertoni shared a function app and the rough deployment time, and I am sorry for not making time to look at it earlier.
Would @chuanqisun or @lopezbertoni be able to share a recent time window in which the deployment failed, and the name of the function app that was deployed to? I will make sure to look at it right away. In the meanwhile, I will also try to reproduce this error and scour old logs to see if I can find what went wrong.
Adding this to Sprint 85 (current sprint) to track and investigate the issue.
@ankitkumarr Thanks for looking into this. We've been just systematically/manually stopping and starting all azure functions. One of them is assessment-events-processor-qa deployed to a premium plan.
@lopezbertoni, can you share a recent time when you deployed? I will check the logs in case it didn't auto-update and if I find some symptoms of any issue.
@ankitkumarr Latest QA release is from today (9/16 )
Some processor names: assessment-events-processor-qa person-events-processor-qa notification-events-processor-qa
We (@thaishankar and I) took some time to investigate this issue. We looked at @chuanqisun's app as it was in the weird state mentioned in the issue. I wasn't able to look at @lopezbertoni's app as the mitigation is already in place there so it's difficult to tell if the issue still occurs. It seems that there may be a platform issue such that when files are changed, a notification is not generated for the Functions host to restart. There will be a fix going out in the platform to ensure such issues are avoided, but those deployments take time and current ETA would be by end of the year.
This issue should be transient, but if you are seeing this very consistently, please do reach out, as it's likely caused by something else. In the meanwhile, please mitigate by restarting the app after the deployment. Please let me know if there are concerns, and if someone else is facing this issue, do post your app name and the time period when you saw it. I can then verify if it's the same issue.
Thank you all for your patience!
For our reference -- internally tracked to be fixed by @thaishankar in ANT91.
I am moving it out from the Sprint, but I will leave this open and assigned to me for updates.
Happened to us several times in the past months. If I remember correctly, all affected Functions are running on the EP1 plan. We are also deploying using the Azure App Service deploy task in Azure DevOps.
May I suggest removing Bitbucket from the title of the issue.
This just happened to me as well. I deployed my function app using Azure CLI. On the first attempt, syncing triggers seems to have failed. I retried a couple of hours later and the deployment was successful, but the app is still running old code. Running on a premium plan.
@jimanttila Can you please share the app name and issue time?
App name: omnisynk-engine-prod-01
First failed attempt @ Tue Oct 20 2020 05:50:04 GMT+0200 Retried successfully @ Tue Oct 20 2020 07:31:44 GMT+0200
Just had the same. App Name: ffinesse-services-aimee-dev
Had to remove Bitbucket and then perform a manual publish.
Premium Plan as well.
@dustensalinas and All,
We are in the process of rolling out a fix for this which should prevent this issue from occurring. The fix should be deployed to all our scale units by early to mid December.
How is the roll-out progressing? We're currently not experiencing issues, but since it used to be quite random it would be great to get a status update. We're hosting the apps in the North Europe region.
@sebb3 , the fix should be in North Europe already. And it should be deployed globally by the end of this week
@gmlion Is the issue you are reporting for the deployment at 2021-02-10 15:00 UTC? This was the only deployment that I could see for the app KrevNotificationServer2 in the last 3 days.
From the logs, it looks like we did pick up the new zip after deployment and the function app was restarted with the new zip at 2021-02-10 15:00:54 UTC.
It is possible that the issue you are reporting is different from the one that caused the problem earlier. Our earlier fix should still be good to prevent apps from executing old code.
Would you please open a new issue with the details?
It was an error on my side with an old deployment slot out of my radar. Sorry for the noise
This is happening for us. App name: lxrpextranetcollaboration. We've tried the 'stop, wait 5 minutes, spin around 3 times, restart app' but it didn't help.
@RDavis3000 This happened to our app again, about a week ago. We use slots. My previous workaround of stop and restart didn't work. I even tried deleting and recreating the slot, and that didn't work either. I feel there is some magic that recycles the previously deployed slot, so deleting and recreating the slot won't purge the stale instance.
Eventually, I found this workaround: create a new slot with a different name, deploy whatever to it, and then delete the new slot, and create the slot under the old name. This seems to purge the function app from it completely.
This happened to us as well - app with durable functions, premium tier. Lost almost half x two developers getting to the bottom of it... Restart did not help. Only re-deployment of the same artifacts helped.
Details:
Surprised that such an important issue is not getting prioritised by Azure.
Having the same issue today. Any update ?
I am experiencing the same issue (old code version still running despite successful deployment) since June 4th and feel stuck:
The issue is random, but the lack (ongoing) occurrence is tough.
I am happy to get a working workaround and any update on this issue.
I'm also experiencing the same issue. I have a Service Bus triggered function running .NET Core code where some of the invocations have executed old code. It seems that only a few of the executions used the old code. I would really like to have a status update on this issue. The old code that is being executed is from a deployment earlier than May 3rd, so from a really old deployment. Downloading the assembly from the bin folder, everything looks good.
The runtime is v3 and runs using the consumption plan (Windows). The Azure Function App task from an Azure DevOps YAML pipeline is used for deployment with deploymentMethod not specified (auto).
We have the WEBSITE_RUN_FROM_PACKAGE = 1
We have the same problem mentioned by @akakaule . After deployment last Friday it seems like sometimes old code is being executed and sometimes the newly deployed code. Today we tried to deploy logs to find the error on our side, but the logs are only visible in some runs. In these cases the function behaves the same way it did before the deployment. We can also add that it doesn't seem to be related to an old instance which is still running. Additionally, it seems to be related only to updated functions in our function app. New functions we added always work with the new code. The old/updated ones sometimes seem to run the old code. Also only our production environment has the issue, our other environments work with the new code as intended. We also have slots for deployment in place.
After making sure that all our slots ran the same code (by redeploying), it now works again. We assume the traffic is somehow split between both slots, even though just the production slot should have been used. In the image below is our current setup, which doesn't seem to work correctly. Of course this defeats the purpose of using a slot swap.
Hi Folks,
Over the past couple of weeks, I've been seeing a Queue trigger being executed intermittently by a version of my function app that was deployed sometime prior to July 02, despite there being dozens of new versions deployed to the function app since that time.
Trawling through the logs, I've identified that it is a specific host/instance that is running the old code;
HostInstanceId: 49db15dc-0012-41ed-b558-4fe7aced0fdf
Cloud_RoleInstance: 5AF3CA72-637562946012875197
It appears every invocation from this instance (and only this instance) is running an implementation from an old deployment
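The kind of log trawl described above can be automated once log rows are exported. The sketch below assumes each exported record is a dict carrying the `HostInstanceId` seen above plus a `build_version` value; `build_version` is a hypothetical field that the function code would have to log itself, and the helper names are illustrative.

```python
from collections import defaultdict


def versions_by_instance(records):
    """Map each host instance ID to the set of build versions it logged."""
    seen = defaultdict(set)
    for record in records:
        seen[record["HostInstanceId"]].add(record["build_version"])
    return dict(seen)


def stale_instances(records, expected_version):
    """Return sorted instance IDs that logged any version other than the expected one."""
    return sorted(
        instance
        for instance, versions in versions_by_instance(records).items()
        if versions != {expected_version}
    )
```

If a single instance ID keeps showing up in the result, that matches the "one rogue host" pattern reported here, as opposed to a deployment that never took effect on any host.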
So far, the following approaches have been unsuccessful in killing this old rogue instance;
I'm using;
Azure/functions-action@v1
I'm having the same issue. Function app on Linux. Premium plan. Deploy from Azure DevOps. Restart. Old code runs for about 5 more minutes. New code magically starts.
Function app name: medchron-carnotaurus-eus-dev-v1-0
This is a problem. Please advise. Thanks.
Also experiencing the same issue. Function app on Linux, deployed via GitHub Actions. The initial deploy of the code worked fine. After subsequent pushes with updates to the function, the function app continues to run the old code.
I've poked around the files at https://
This is very bizarre behaviour and needs a fix ASAP :)
Also seeing this frequently (and anecdotally if we do stuff out of hours, that might just be perception). Repeatedly deploying seems to eventually solve it, but very frustrating.
Hello, In our project we are also facing it really frequently lately (We discovered that issue because of some incompatibility with our database updates but maybe it was already there before and simply never noticed it). Is there any roadmap on fixing that issue ?
We've got Bitbucket continuous deployment set up for a couple of our v3 Azure Functions (Azure Function App -> Platform Features -> Container settings -> CI/CD (Bitbucket)). No build provider selected, as the Bitbucket option doesn't seem to allow for it. Trigger branch is master. Function app is running from package. All are HTTP trigger functions. Nothing fancy.
Whenever we commit to master, it does indeed trigger a deployment-- which completes successfully. I can see the commit that triggered the deployment and all of that. The problem is, the function is still (usually) running old code after this deployment. I've tried restarting the function app, no success. I've tried completely stopping the app waiting for a bit and restarting, no success. Sometimes even manual deployment from Visual Studio doesn't work.
Once I remove the CI/CD pipe from Bitbucket, however, things go back to normal as far as manual deployments from VS go.
I've been able to reproduce this effect several times. Please let me know if you need any additional information.