Open j-fulbright opened 3 months ago
Hi @j-fulbright sorry to hear this was a tricky one to diagnose.
This provider doesn't validate input combinations itself; it only checks the shape of the schema. Input combinations are validated by the Azure service, because the rules can vary by service version, of which there are many thousands in total. This provider's responsibility is to relay any validation failures back to the user.
It sounds like there is some odd service behaviour happening when enabling the "always on" option. The best way we can help is if we've got a simple set of steps to reproduce the issue.
It sounds like this was only failing when trying to transition a FunctionApp with the setting alwaysOn: true from a dedicated "App Service Plan" to the "Consumption" model, at which point the resource operation was reporting that another operation was in progress. Does that summarise it correctly?
There might be an issue to raise with Microsoft for them to improve the error messages for this scenario. We could also add a specific note about this migration type to the documentation to help others hitting this issue. However it would be good to be able to prove the root cause of the issue first.
Please could you share a simple code snippet summarising the options used for the initial and modified configuration of the FunctionApp and its service plan? Thanks!
Thanks @danielrbradley
Your summarization is pretty spot on, although this was a new function app being spun up on a Consumption plan, where we were previously using an App Service Plan. It was created from a fully torn-down stack, so we weren't trying to migrate it from one plan to another. (Just wanted to make sure that was clear.)
Regular plan:
const appServicePlan = new azureNative.web.AppServicePlan(`asp-${suffix}`,
    {
        resourceGroupName: resourceGroup.name,
        location: resourceGroup.location,
        kind: 'app',
        reserved: true,
        sku: {
            name: stackName === 'prod' ? 'P3V3' : 'B3',
            tier: stackName === 'prod' ? 'PremiumV3' : 'Basic',
            capacity: stackName === 'prod' ? 2 : 1,
        },
    },
    {
        dependsOn: [resourceGroup]
    }
);
Consumption plan:
const functionAppServicePlan = new azureNative.web.AppServicePlan(`asp-func-${suffix}`,
    {
        resourceGroupName: functionResourceGroup.name,
        location: functionResourceGroup.location,
        kind: 'functionapp',
        reserved: true,
        sku: {
            name: 'Y1',
            tier: 'Dynamic',
        },
    },
    {
        dependsOn: [functionResourceGroup]
    }
);
We have a custom resource built to handle miscellaneous things like file blobs and settings per app, since we need to spin up 7 apps in our stack, so I'll paste the relevant settings for the function app:
// Build the app settings array
let appSettingsArray = [
    {
        name: 'DD_ENV',
        value: stackName === 'prod' ? 'production' : stackName,
    },
    {
        name: 'DD_SERVICE',
        value: 'abcdef',
    },
    {
        name: 'WEBSITE_RUN_FROM_PACKAGE',
        value: fileBlob.url,
    },
    {
        name: 'WEBSITE_START_SCM_ON_SITE_CREATION',
        value: pulumi.output('1'),
    },
    {
        name: 'WEBSITE_TIME_ZONE', // NOT supported for Function App on Consumption Plan
        value: pulumi.output('America/Chicago'),
    },
];
if (args.kind === 'functionapp') {
    appSettingsArray = appSettingsArray.concat([
        {
            name: 'FUNCTIONS_EXTENSION_VERSION',
            value: pulumi.output('~4'),
        },
        {
            name: 'FUNCTIONS_WORKER_RUNTIME',
            value: pulumi.output('dotnet-isolated'),
        },
        {
            name: 'AzureWebJobsStorage',
            value: storageAccountConnectionString,
        }
    ]);
}
const app = new azureNative.web.WebApp(
    `as-${name}-${suffix}`,
    {
        resourceGroupName: args.altResourceGroupName || args.resourceGroupName,
        location: args.location,
        serverFarmId: args.serverFarmId,
        name: `${name}-${suffix}`,
        siteConfig: {
            linuxFxVersion: args.kind === 'functionapp' ? 'DOTNET-ISOLATED|8.0' : 'DOTNETCORE|8.0',
            appSettings: appSettingsArray,
            // alwaysOn set to true breaks deploys of function apps on a Consumption plan, as it is not supported
            alwaysOn: args.kind === 'functionapp' ? false : true,
            http20Enabled: true,
        },
        identity: {
            type: azureNative.web.ManagedServiceIdentityType.SystemAssigned,
        },
        kind: args.kind,
        httpsOnly: true,
        clientAffinityEnabled: false,
    },
    { parent: this, dependsOn: [appInsights] },
);
Mainly it is just the alwaysOn value: when set to true, it causes the "operation is in progress" errors. As soon as it was set to false, the app deployed successfully, and it has every time since.
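The snippet above keys alwaysOn off args.kind. A minimal sketch of keying it off the plan SKU instead is shown here, so that a function app on a dedicated plan could still opt in. The PlanSku type and resolveAlwaysOn helper are hypothetical names for illustration and are not part of the Pulumi SDK. It assumes only the 'Dynamic' tier needs special handling, since that is the only case confirmed in this thread.

```typescript
// Hypothetical shape mirroring the sku block of the AppServicePlan above.
interface PlanSku {
    name: string;
    tier: string;
}

// Returns the alwaysOn value to send for a given plan SKU.
// Assumption: only the 'Dynamic' (Consumption) tier is treated as unsupported,
// because that is the only case confirmed in this thread.
function resolveAlwaysOn(sku: PlanSku, requested: boolean): boolean {
    const isConsumption = sku.tier === 'Dynamic';
    return isConsumption ? false : requested;
}

const consumptionSku: PlanSku = { name: 'Y1', tier: 'Dynamic' };
const dedicatedSku: PlanSku = { name: 'B3', tier: 'Basic' };

console.log(resolveAlwaysOn(consumptionSku, true)); // false
console.log(resolveAlwaysOn(dedicatedSku, true));   // true
```

In the custom resource this would replace the args.kind check, provided the SKU is available as a plain value at that point rather than as a Pulumi Output.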
I was able to reproduce this with a super simple app.
const functionResourceGroup = new azureNative.resources.ResourceGroup(`rgf-${suffix}`, {
    resourceGroupName: `rgf-${suffix}`,
    location: 'centralus',
});
const functionAppServicePlan = new azureNative.web.AppServicePlan(`asp-func-${suffix}`,
    {
        resourceGroupName: functionResourceGroup.name,
        location: functionResourceGroup.location,
        kind: 'functionapp',
        reserved: true,
        sku: {
            name: 'Y1',
            tier: 'Dynamic',
        },
    },
    {
        dependsOn: [functionResourceGroup]
    }
);
// Create empty initial webapps
const functionsApp = new azureNative.web.WebApp(
    `as-functions-${suffix}`, {
        resourceGroupName: functionResourceGroup.name,
        location: functionResourceGroup.location,
        serverFarmId: functionAppServicePlan.id,
        name: `functions-${suffix}`,
        kind: 'functionapp,linux',
        siteConfig: {
            linuxFxVersion: 'DOTNET-ISOLATED|8.0',
            alwaysOn: false,
        },
    });
This one works, but as soon as kind or alwaysOn was removed, it would cause the same issue. So it seems like the "already in process" message is just hiding the actual error that Azure is returning, which makes an Azure API/CLI issue more than likely.
Thanks for the feedback @j-fulbright
Deploying your example above and changing it to alwaysOn: true resulted in the error: error: autorest/azure: Service returned an error. Status=<nil> <nil>. There was a conflict. AlwaysOn cannot be set for this site as the plan does not allow it. For more information on pricing and features, please see: https://aka.ms/appservicepricingdetails
Deploying your example with alwaysOn: false removed deploys without error.
Deploying your example without the WebApp kind property fails with the same error you were seeing: error: autorest/azure: Service returned an error. Status=<nil> <nil>. Cannot modify this site because another operation is in progress. Details: Id: c59de21f-8b95-4717-872a-ff1bdf0ec924, OperationName: Create, CreatedTime: 7/11/2024 8:51:44 AM, RequestId: 032bb8fe-959c-4786-b8e3-1bddb35d9b09, EntityType: 3. Removing the kind property from an existing WebApp causes a replacement, so it produces the same error as above because it is just doing a brand new create.
When running with verbose logging (pulumi up --yes --skip-preview --logtostderr --logflow -v=10 2> out.txt) we see that we're actually making two separate PUT requests when attempting the Create operation for the resource. The first PUT request contains the more helpful response message: Consumption pricing tier cannot be used for regular web apps.
Attaching a debugger allowed me to locate the source of the double request. It is triggered by the go-autorest library's retry mechanism, which specifically retries errors that have a "409 Conflict" status. When it retries in quick succession, subsequent requests fail with "409 Conflict" again, but this time with the message "Cannot modify this site because another operation is in progress. Details: Id: 7671ff14-2e7b-4ebd-98d7-6eee53ae2b27, OperationName: Create, CreatedTime: 7/11/2024 10:56:27 AM, RequestId: abdbabfc-49f2-4f6c-bdad-1fcb9c9527b8, EntityType: 3". I think this is because attempting to create a Web App temporarily modifies the associated AppServicePlan internally. The last error message is then returned to the user, hiding the original, more helpful error message.
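To illustrate the masking effect described above, here is a small self-contained simulation. It is not the go-autorest code. FakeResponse, makeFakeService and sendWithRetry are hypothetical names, and the fake service simply replays the two messages observed in the logs (with the per-request details omitted).

```typescript
// Simulated service response; not a real go-autorest or Azure SDK type.
interface FakeResponse {
    status: number;
    message: string;
}

// Hypothetical stand-in for the Azure endpoint: the first PUT returns the
// validation failure, later PUTs report that an operation is in progress.
function makeFakeService(): () => FakeResponse {
    let calls = 0;
    return () => {
        calls += 1;
        return calls === 1
            ? { status: 409, message: 'Consumption pricing tier cannot be used for regular web apps.' }
            : { status: 409, message: 'Cannot modify this site because another operation is in progress.' };
    };
}

// Retries while the status is 409 and records both the first and last response.
function sendWithRetry(
    send: () => FakeResponse,
    maxAttempts: number,
): { first: FakeResponse; last: FakeResponse } {
    const first = send();
    let last = first;
    for (let attempt = 1; attempt < maxAttempts && last.status === 409; attempt++) {
        last = send();
    }
    return { first, last };
}

const retryResult = sendWithRetry(makeFakeService(), 3);
console.log(retryResult.last.message);  // what a caller that keeps only the last error sees
console.log(retryResult.first.message); // the original validation failure
```

A caller that surfaces only the last response loses the first message, which matches the behaviour seen in this issue.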
From these observations I can confirm that this is an Azure API bug: the validation failure should be returned with a non-retryable 4xx status code rather than "409 Conflict", so that it would not be retried and the useful error message would be returned.
I would recommend approaching Azure support with this information and requesting that they fix the status code of the original validation failure error.
Just now seeing this, @danielrbradley! Thank you so much! I copied your command line for future reference so I can look into things like this.
It makes sense that a lot of requests are being made and we're only getting the last one back, so it's definitely an issue with the Azure API, unfortunately.
What happened?
I believe I've seen this issue reported in another place, but it was related to the handling of the AnotherOperationInProgress error codes.
Moving our function app to a Consumption Service plan was causing a lot of issues, with the below error.
After a full day of logging and trying new things, I believe I have determined it is due to the alwaysOn value in siteConfig. Adding the above handling in my code, so it is turned off for my function app only, resolved the issue.
https://github.com/Azure/Azure-Functions/wiki/Enable-Always-On-when-running-on-dedicated-App-Service-Plan
Example
Output of pulumi about
na
Additional context
It seems like this should be handled at the Pulumi level, if possible, by either raising an error or ignoring the setting when we know the plan is a Consumption plan.
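As a rough sketch of what such a check could look like in user code today, the function below flags the two combinations that failed on a Consumption plan in this thread. SiteCheckArgs and validateConsumptionSite are hypothetical names, the check works on plain values rather than Pulumi Outputs, and it is not an existing provider feature.

```typescript
// Hypothetical input shape; planTier corresponds to sku.tier on the AppServicePlan.
interface SiteCheckArgs {
    kind?: string;      // e.g. 'functionapp,linux'
    planTier: string;   // 'Dynamic' for the Consumption plan
    alwaysOn?: boolean;
}

// Returns a list of problems for sites on a Consumption plan.
// Assumption: only the two combinations seen failing in this thread are checked.
function validateConsumptionSite(args: SiteCheckArgs): string[] {
    const problems: string[] = [];
    if (args.planTier !== 'Dynamic') {
        return problems; // other tiers are out of scope for this sketch
    }
    if (args.alwaysOn === true) {
        problems.push('alwaysOn must be false or unset on a Consumption plan.');
    }
    if (!(args.kind ?? '').includes('functionapp')) {
        problems.push("kind must include 'functionapp' on a Consumption plan.");
    }
    return problems;
}

console.log(validateConsumptionSite({ kind: 'functionapp,linux', planTier: 'Dynamic', alwaysOn: false }));
console.log(validateConsumptionSite({ planTier: 'Dynamic', alwaysOn: true }));
```

Calling this before constructing the WebApp and throwing when the list is non-empty would surface a clear message locally instead of the "another operation is in progress" error.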