Open andre-qumulo opened 1 year ago
Thanks @andre-qumulo for raising this issue. May I confirm:
linuxFxVersion
and the value of docker part with me?Part of the challenge is that it's inconsistent. Sometimes the linuxFxVersion looks like "DOCKER|login-server-url/test-repo:latest" and sometimes it's "DOCKER|/test-repo:latest"
I ran with TF_LOG=TRACE until I could reproduce it:
I can see the PUT to
PUT /subscriptions/REDACTED/resourceGroups/REDACTED/providers/Microsoft.Web/sites/test-app-agagne?api-version=2021-03-01 HTTP/1.1
has "linuxFxVersion":"DOCKER|/test-repo:latest"
I see this occurring as well. Both the image_name
randomly having a /
and the issue of linuxFxVersion
randomly not having the registry URL
I am wondering if this is because we're getting the login server url as an ACR data instead of resource so the url isn't known?
I also wonder if part of the problem is that the function is trying to redeploy every time as a result of azure adding an "https://" to the registry name.
I recently changed my terraform to have:
application_stack {
docker {
registry_url = "https://${data.azurerm_container_registry.services_acr.login_server}"
image_name = "test-repo"
image_tag = "latest"
}
}
And haven't seen terraform try to change that section since. I don't think this is a guaranteed workaround but might reduce the frequency of failures.
Looks like someone else came to this same conclusion in this issue https://github.com/hashicorp/terraform-provider-azurerm/issues/20393
This definitely feels like a bug with the azurerm provider. I think if we don't provide the "https://" then azure will add it, the provider sees that and thinks its a difference. I suspect there's a race condition with using a data "azurerm_container_registry"
where terraform doesn't get the login server and thus doesn't update it correctly.
I am seeing in the trace that the GET request for the linux function app also has a linuxFxVersion that's missing the login server url
@andre-qumulo can you use login_server instead of id?
docker {
registry_url = data.azurerm_container_registry.test.login_server
image_name = "dotnetsamples"
image_tag = "aspnetapp"
}
@andre-qumulo can you use login_server instead of id?
docker { registry_url = data.azurerm_container_registry.test.login_server image_name = "dotnetsamples" image_tag = "aspnetapp" }
@xiaxyi This is a problem with the example I gave you but not the problem I'm calling out. I already did the analysis (see the conversation above your reply). Can you point me to the code where this is set and maybe I'll take care of it myself?
@andre-qumulo we are currently considering change the way of how to compose linuxFxVersion
, let me also add your issue into the issue list, will update once there is a progress.
Understood,
Thank you! If you do point me at the code I can make a small pr to fix this issue. Might be quick.
Andre
On Wed, Feb 22, 2023, 5:54 PM Xiaxin @.***> wrote:
@andre-qumulo https://github.com/andre-qumulo we are currently considering change the way of how to compose linuxFxVersion, let me also add your issue into the issue list, will update once there is a progress.
— Reply to this email directly, view it on GitHub https://github.com/hashicorp/terraform-provider-azurerm/issues/20441#issuecomment-1441123588, or unsubscribe https://github.com/notifications/unsubscribe-auth/A4OQ326SCYVWMDYVUMLH6EDWY27M7ANCNFSM6AAAAAAU2TFTIU . You are receiving this because you were mentioned.Message ID: @.***>
@andre-qumulo Thanks a lot for offering the help! There are multiple places relates to the docker settings, not only of the way to compose linuxFxVersion
but we need to update the app_setting
block as well. you may check this function
Besides, the change to docker stack will also include linux web app.
Same issue here with hashicorp/azurerm == 3.45.0
Configuring the Docker as suggested by @andre-qumulo , but it always results in:
"linuxFxVersion": "DOCKER|/image/name:mytag"
With:
docker {
registry_url = "myrepo.azurecr.io"
image_name = "image/name"
image_tag = "mytag"
}
Looks like it's fixed with 3.47.0
.
@mickare Still present with 3.61.0.
Is there any update on this bug or an ETA when it will be fixed/released?
We are facing this issue with version 3.57.0. I wanted to add that the workaround suggested by @andre-qumulo has worked in fixing this for us: It appears that it is a bug in whatever code has been added in azurerm_linux_function_app to auto populate the https:// prefix in cases where users have omitted it. The fact that appears to happens for us exclusively on subsequent releases with new versions to existing resources (as well as having a poke around inside the state files) leads me to believe its an issue with some erroneous editing of the state file post deployment.
As far as I can tell, I believe the issue lies here in the provider code:
if linuxAppStack.Docker != nil && len(linuxAppStack.Docker) == 1 {
dockerConfig := linuxAppStack.Docker[0]
appSettings = updateOrAppendAppSettings(appSettings, "DOCKER_REGISTRY_SERVER_URL", dockerConfig.RegistryURL, false)
appSettings = updateOrAppendAppSettings(appSettings, "DOCKER_REGISTRY_SERVER_USERNAME", dockerConfig.RegistryUsername, false)
appSettings = updateOrAppendAppSettings(appSettings, "DOCKER_REGISTRY_SERVER_PASSWORD", dockerConfig.RegistryPassword, false)
var dockerUrl string
for _, prefix := range urlSchemes {
if strings.HasPrefix(dockerConfig.RegistryURL, prefix) {
dockerUrl = strings.TrimPrefix(dockerConfig.RegistryURL, prefix)
continue
}
}
expanded.LinuxFxVersion = utils.String(fmt.Sprintf("DOCKER|%s/%s:%s", dockerUrl, dockerConfig.ImageName, dockerConfig.ImageTag))
}
If the dockerConfig.RegistryURL does not have a prefix in the urlSchemes range, the dockerUrl variable will remain an empty string and the expanded.LinuxFxVersion will be set to "DOCKER|/:image-name:image-tag".
This is the observed behaviour - unless folks provide a https:// then the dockerUrl is blank.
Assuming this is the issue, a fix could be to set a default before the evaluation takes place so that if no evaluation is required, the string is set correctly:
dockerUrl := dockerConfig.RegistryURL // i.e. Set a default value
for _, prefix := range urlSchemes {
if strings.HasPrefix(dockerConfig.RegistryURL, prefix) {
dockerUrl = strings.TrimPrefix(dockerConfig.RegistryURL, prefix)
break
}
}
expanded.LinuxFxVersion = utils.String(fmt.Sprintf("DOCKER|%s/%s:%s", dockerUrl, dockerConfig.ImageName, dockerConfig.ImageTag))
I have resolved adding the docker_registry_url = "https://index.docker.io"
to the application_stack block. Please, replace the docker_image_name
with your docker image
site_config {
always_on = true
ftps_state = "FtpsOnly"
application_stack {
docker_image_name = "sonarqube:lts-community"
docker_registry_url = "https://index.docker.io"
}
}
I have resolved adding the
docker_registry_url = "https://index.docker.io"
to the application_stack block. Please, replace thedocker_image_name
with your docker imagesite_config { always_on = true ftps_state = "FtpsOnly" application_stack { docker_image_name = "sonarqube:lts-community" docker_registry_url = "https://index.docker.io" } }
Where is the commit for this fix?
I'm seeing this in a non-reproducible way for azurerm_linux_web_app
.
Sometimes the terraform state contains BAD "linux_fx_version": "DOCKER|/myregistry.azurecr.io/myrepo:latest"
, and sometimes GOOD "linux_fx_version": "DOCKER|myregistry.azurecr.io/myrepo:latest"
.
In both cases the azurerm_linux_web_app
has a dependency on azurerm_container_registry.this
in the state.
It happened a few times over the past couple of weeks, then it did not happen again.
Using terraform 1.7.4 and azurerm 3.94.0
locals {
...
primary_docker_name = "${azurerm_container_registry.this.login_server}/myrepo"
primary_docker_tag = "latest"
}
...
resource "azurerm_linux_web_app" "this" {
site_config {
application_stack {
...
docker_image_name = "${local.primary_docker_name}:${local.primary_docker_tag}"
}
}
Not using docker_registry_url
at all.
Same Issue here yesterday and today multiple App Services goes down because cannot get the image. Other App Services with the same config don't have the problem in the same deployment and nothing is showed on the plan or deployment, only you can see in the activity logs that Terraform change the image Url adding the / in the beggining and delete the Docker variables.
Using Terraform 1.7.4-1 and azurerm 3.95.0
This has happened several times to me with app service.
I noticed whenever this has happened i've seen empty values in configuration settings for DOCKER_REGISTRY_SERVER_PASSWORD, DOCKER_REGISTRY_SERVER_USERNAME, DOCKER_REGISTRY_SERVER_URL (possibly related to https://github.com/hashicorp/terraform-provider-azurerm/pull/24424/ )
we don't use this and we dont specify the image in the terraform, we leave that for deploys via ADO. I'm not sure if these are occasionally being added empty by terraform when the error occurs or if when we created the resources initially we used a buggy provider version (its notable that only resources created since around november and before feb had these values empty).
I've removed them all from affected, and will report back if they appear again.
In my case this appears only one time when enable that site config with the lates release of the azurerm 3.95.0 on different subscriptions, regions, days and cloud infra.
2024-03-13T13:29:03.7933364Z ~ site_config { 2024-03-13T13:29:03.7933537Z + ip_restriction_default_action = "Allow" 2024-03-13T13:29:03.7933748Z + scm_ip_restriction_default_action = "Allow"
2024-03-11T09:52:21.7790692Z ~ site_config { 2024-03-11T09:52:21.7790978Z + ip_restriction_default_action = "Allow" 2024-03-11T09:52:21.7791446Z + scm_ip_restriction_default_action = "Allow"
This is the only in the plan, but in background do more than this over some of the resources.
We continue with the investigation and we found that the problem is the site_config[0].linuxFxVersion that is defined by the provider and incorrectly saved to the state file. I review the history of the state files from last year and multiple web apps have the wrong setting, so now that edit some values there, because the rest is managed by ADO with CI/CD Terraform set what they have in the state file without any notice.
I found the issue. In our setup in the terraform we dont mention any images or docker parameters.
We then deploy via Azure Devops AzureRmWebAppDeployment task which sets up the docker parameters and everything works.
That is until we make a change to the app service resource (for example app settings, but anything will do it). This causes docker_image_name to get set in the state to e.g. ourregistry.azurecr.io/image:1.2 docker_custom_image_name gets updated in app settings to the same value and linux_fx_version gets set to /registry.azurecr.io/image:1.2
i found that setting DOCKER_REGISTRY_SERVER_URL to https://ourregistry.azurecr.io during the AzureRmWebAppDeployment task works around that, and when the next terraform infra job is run the state gets updated differently
docker_image_name becomes image:1.2 docker_registry_url gets set to the https://ourregistry.azurecr.io and linux_fx_version gets set correctly without the leading /
so it seems its definitely a bug in the provider.
Is there an existing issue for this?
Community Note
Terraform Version
1.3.8
AzureRM Provider Version
3.43.0
Affected Resource(s)/Data Source(s)
azurerm_linux_function_app
Terraform Configuration Files
Debug Output/Panic Output
Expected Behaviour
The deployment configuration in azure's image field shouldn't have a "/" in the front
Actual Behaviour
The deployment configuration in azure's image field has a "/" in the front
Steps to Reproduce
You can remove the leading "/", restart the app, and it will load. If you re-run "apply" it will add the leading "/" to it and fail to pull again.
Important Factoids
No response
References
No response