kon1180 opened this issue 3 years ago
Seeing exactly the same thing
Is there another resource provider you are using so as to not have this issue? I know using azurerm_resource_group_template_deployment was buggy in the same way as above, so I switched back to the older azurerm_template_deployment, which does the exact same thing but no problems.
As a quick workaround, I ran apply with the configuration I wanted, then set this within the resource blocks, since I can't have production showing changes every single time:
# ignore_changes is here given the bug I reported: https://github.com/terraform-providers/terraform-provider-azurerm/issues/10388
lifecycle {
  ignore_changes = [log, metric]
}
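For context, here is a sketch of where that lifecycle block sits within a full resource. The resource names, target ID, and category below are hypothetical placeholders, not taken from this thread:

```hcl
# Hypothetical example: all names and references are placeholders.
resource "azurerm_monitor_diagnostic_setting" "example" {
  name                       = "example-diagnostics"
  target_resource_id         = azurerm_key_vault.example.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.example.id

  log {
    category = "AuditEvent"
    enabled  = true

    retention_policy {
      days    = 30
      enabled = true
    }
  }

  # Suppresses the perpetual diff described in this issue, at the cost
  # of also ignoring genuine edits to these blocks.
  lifecycle {
    ignore_changes = [log, metric]
  }
}
```

Note the trade-off: with ignore_changes in place, real changes to the log/metric blocks in your configuration will also be ignored until the lifecycle block is removed.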
this looks to be the result of casing mismatch (at least on my end with 2.45.0):
.../Microsoft.OperationalInsights/workspaces/... -> .../microsoft.operationalinsights/workspaces/...
(forces replacement)
Changing to all lowercase didn't work; it still forced replacement on 2.44, just as you saw on 2.45.
The case mismatch is a bug between the Azure API and the provider, and it has affected many other Azure resources. I'm saying that the same casing bug is now impacting this resource.
For example, search for "case" in the provider changelog: https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/CHANGELOG-v2.md
woof. I just realized that this behavior is quite problematic, particularly with Linux web apps: the replacement of the resource triggers the container to be recycled twice, first as Terraform removes the diagnostic setting and again as it puts it back.
ouch
please treat the casing issue as high priority. ongoing destruction and recreation is unfit for production
Is there a way for me to do that?
@kon1180 can you run a plan (without your workaround) and see if the resourceID is being changed (per casing) forcing a replace ?
I'm experiencing what appears to be the same issue as the original poster: Terraform attempts to update the resource due to apparent changes across all log and metric blocks not defined in the config.
This seems unrelated to #10857, as highlighted by @surlypants.
> @kon1180 can you run a plan (without your workaround) and see if the resourceID is being changed (per casing) forcing a replace ?
The log_analytics_workspace_id is identified as unchanged in this case.
As far as I can tell it is a casing issue that makes Terraform believe the ID has been altered. The state file contains something like log_analytics_workspace_id = "/subscriptions/GUID/resourceGroups/rg/providers/Microsoft.OperationalInsights/workspaces/ws" and Terraform wants to alter it to all lowercase.
Also running into this. I've noticed it will also apply a retention policy even if you haven't given one in your configuration.
I am facing the same issue. Every time we deploy a change, Terraform shows an update-in-place. It also happens with Global parameters for the Azure Data Factory resource.
Same here; the bug is unsolved. I am astonished that this has remained unresolved for nine months.
I did a bit of investigation: as soon as one diagnostic setting (regardless of whether it is "log" or "metric") is enabled for a target resource, the API returns all available diagnostic settings for that target resource with the default values (enabled: false and retention: 0):
az rest --method get --url https://management.azure.com/subscriptions/****/resourceGroups/test-for-github-10388/providers/Microsoft.KeyVault/vaults/test-for-github-10388/providers/Microsoft.Insights/diagnosticSettings/github-issue-10388?api-version=2021-05-01-preview
{
  "id": "/subscriptions/****/resourcegroups/test-for-github-10388/providers/microsoft.keyvault/vaults/test-for-github-10388/providers/microsoft.insights/diagnosticSettings/github-issue-10388",
  "identity": null,
  "kind": null,
  "location": null,
  "name": "github-issue-10388",
  "properties": {
    "eventHubAuthorizationRuleId": null,
    "eventHubName": null,
    "logAnalyticsDestinationType": null,
    "logs": [
      {
        "category": "AuditEvent",
        "categoryGroup": null,
        "enabled": false,
        "retentionPolicy": {
          "days": 0,
          "enabled": false
        }
      },
      {
        "category": "AzurePolicyEvaluationDetails",
        "categoryGroup": null,
        "enabled": false,
        "retentionPolicy": {
          "days": 0,
          "enabled": false
        }
      }
    ],
    "metrics": [
      {
        "category": "AllMetrics",
        "enabled": true,
        "retentionPolicy": {
          "days": 0,
          "enabled": false
        }
      }
    ],
    "serviceBusRuleId": null,
    "storageAccountId": null,
    "workspaceId": "/subscriptions/****/resourceGroups/oasis-tst-loganalytics/providers/Microsoft.OperationalInsights/workspaces/oasis-tst-analytics"
  },
  "resourceGroup": "test-for-github-10388",
  "tags": null,
  "type": "Microsoft.Insights/diagnosticSettings"
}
Terraform then interprets this existence as a change from null to false and wants to change it back, which does not work. A simple way of changing the behaviour would be to only check for enabled settings and to treat "null" and "disabled" identically, e.g. by adding
if *v.Enabled {
    results = append(results, output)
}
This has some side effects, though, when explicitly specifying "enabled: false" as is done in the tests (e.g. https://github.com/hashicorp/terraform-provider-azurerm/blob/main/internal/services/monitor/monitor_diagnostic_setting_resource_test.go#L277). The setup would mean either specifying "enabled = true" or not specifying the block at all... not sure if this is desired.
I would like to see the proposed solution implemented. Some Azure Policies and Microsoft Defender for Cloud will automatically enable diagnostic logs, and that operation then changes the state as explained above.
The ignore_changes lifecycle doesn't work in this scenario.
For us, the workaround is to specify the automatically applied policy in the Terraform code as if the audit logs were set up by Terraform. It works, but we would like to keep some diagnostic logs (especially the security ones) outside of Terraform, and we currently are unable to.
The proposed "explicitly specify enabled = true/false to manage" behavior would be perfect for us.
I can propose a PR with the changes if that's OK with you.
Still happening.
As soon as you define anything, Azure automatically sets everything else to 0 if not defined, and Terraform sees that 0 as a new change, since 0 isn't null.
I am seeing a similar problem for AKS diagnostics, where I split different log categories across two types of storage (some configured to go to Log Analytics and some to a Storage Account).
Below is my TF file creating the azurerm_monitor_diagnostic_setting resources:
resource "azurerm_monitor_diagnostic_setting" "aks_audit" {
  lifecycle {
    ignore_changes = [target_resource_id]
  }

  name                       = local.diag_name
  log_analytics_workspace_id = azurerm_log_analytics_workspace.workspace.id
  target_resource_id         = data.azurerm_kubernetes_cluster.aks.id

  log {
    category = "cloud-controller-manager"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }

  log {
    category = "cluster-autoscaler"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }

  log {
    category = "kube-apiserver"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }

  log {
    category = "kube-controller-manager"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }

  log {
    category = "kube-scheduler"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }

  log {
    category = "guard"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }

  metric {
    category = "AllMetrics"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }
}
resource "azurerm_monitor_diagnostic_setting" "aks_audit_sa" {
  lifecycle {
    ignore_changes = [target_resource_id]
  }

  name               = local.sa_diag_name
  storage_account_id = azurerm_storage_account.audit_sa.id
  target_resource_id = data.azurerm_kubernetes_cluster.aks.id

  log {
    category = "kube-audit"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }

  log {
    category = "kube-audit-admin"
    enabled  = true

    retention_policy {
      days    = 90
      enabled = true
    }
  }
}
The plan shows something like
# azurerm_monitor_diagnostic_setting.aks_audit will be updated in-place
~ resource "azurerm_monitor_diagnostic_setting" "aks_audit" {
id = <REDACTED>
name = <REDACTED>
# (2 unchanged attributes hidden)
- log {
- category = "csi-azuredisk-controller" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "csi-azurefile-controller" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "csi-snapshot-controller" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "kube-audit" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "kube-audit-admin" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
# (7 unchanged blocks hidden)
}
# azurerm_monitor_diagnostic_setting.aks_audit_sa will be updated in-place
~ resource "azurerm_monitor_diagnostic_setting" "aks_audit_sa" {
id = <REDACTED>
name = <REDACTED>
# (2 unchanged attributes hidden)
- log {
- category = "cloud-controller-manager" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "cluster-autoscaler" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "csi-azuredisk-controller" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "csi-azurefile-controller" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "csi-snapshot-controller" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "guard" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "kube-apiserver" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "kube-controller-manager" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- log {
- category = "kube-scheduler" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
- metric {
- category = "AllMetrics" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
# (2 unchanged blocks hidden)
}
Plan: 0 to add, 2 to change, 0 to destroy.
This is the plan on every terraform plan execution, without any variable changes. I had to add the lifecycle meta-argument when this happened last time, but the request handling on the Azure API side forces the provider to detect a change every time.
I had to explicitly specify all the categories (log as well as metric) that I don't need, as below, to make Terraform dodge the "undesired" changes. For example:
log {
  category = "CATEGORY"
  enabled  = false

  retention_policy {
    days    = 0
    enabled = false
  }
}
Until this issue is resolved, I need to keep adding such a block whenever a new diagnostic log category is introduced and enforced by Azure.
Hi all,
Below is my workaround, which resulted in no unneeded changes being detected. AzureRM provider version: 2.93.1; Terraform version: 0.13.7.
1) a local list variable containing the categories for which I want to enable logs/metrics
locals {
  example_recovery_vault_diagnostic_categories_enabled = [
    "AzureSiteRecoveryJobs",
    "AzureSiteRecoveryRecoveryPoints"
  ]
}
2) data source to get ALL the available categories
data "azurerm_monitor_diagnostic_categories" "example" {
  resource_id = azurerm_recovery_services_vault.example.id
}
3) a dynamic block that generates all the categories, including the ones I need as well as the ones I want disabled, with their respective settings. As you can see, the condition for enabled = true is whether the category matches my list.
resource "azurerm_monitor_diagnostic_setting" "example" {
  name                           = azurerm_recovery_services_vault.example.name
  target_resource_id             = azurerm_recovery_services_vault.example.id
  log_analytics_workspace_id     = data.azurerm_log_analytics_workspace.example.id
  log_analytics_destination_type = "AzureDiagnostics"

  dynamic "log" {
    iterator = log_category
    for_each = data.azurerm_monitor_diagnostic_categories.example.logs

    content {
      category = log_category.value
      enabled  = contains(local.example_recovery_vault_diagnostic_categories_enabled, log_category.value)

      retention_policy {
        days    = 0
        enabled = contains(local.example_recovery_vault_diagnostic_categories_enabled, log_category.value)
      }
    }
  }

  dynamic "metric" {
    iterator = metric_category
    for_each = data.azurerm_monitor_diagnostic_categories.example.metrics

    content {
      category = metric_category.value
      enabled  = contains(local.example_recovery_vault_diagnostic_categories_enabled, metric_category.value)

      retention_policy {
        days    = 0
        enabled = contains(local.example_recovery_vault_diagnostic_categories_enabled, metric_category.value)
      }
    }
  }
}
@murdibb worked for us - thanks!
It works for us as well. Tested it.
Unfortunately, even this isn't working for me anymore; the data call doesn't seem to be made during plan, which still triggers changes.
Version 3.39.0 of the azurerm provider introduced a new enabled_log block under the azurerm_monitor_diagnostic_setting resource.
In my case, transitioning existing log blocks to enabled_log blocks fixed the issue, and Terraform no longer tries to apply the same changes in every plan/apply run.
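For those making the same transition, a before/after sketch of the migration (the category name here is illustrative, and retention_policy handling is omitted for brevity):

```hcl
# Before (log block): the provider diffs against the disabled
# categories Azure returns, producing a perpetual plan.
log {
  category = "AuditEvent"
  enabled  = true
}

# After (enabled_log block, azurerm >= 3.39.0): declare only the
# categories you want enabled; disabled categories are no longer diffed.
enabled_log {
  category = "AuditEvent"
}
```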
Version 3.39.1 with enabled_log still has the same issue for me
Version 3.39.0 of the azurerm provider introduced the new enabled_log block under the azurerm_monitor_diagnostic_setting resource, which fixed the issue for the log blocks.
But the same issue exists for the metric block, and Terraform tries to apply the same changes over and over in every plan/apply run.
Would having an enabled_metrics block improve the situation, given that some have reported that the new enabled_log attribute resolved their Terraform drift?
For me it keeps happening even when using enabled_log. My workaround is to ignore the changes with a lifecycle ignore_changes block:
resource "azurerm_monitor_diagnostic_setting" "ds" {
  log_analytics_workspace_id = var.log_analytics_workspace_id
  name                       = var.diagnostic_setting_name
  target_resource_id         = var.target_resource_id

  dynamic "enabled_log" {
    for_each = var.logs_categories

    content {
      category = enabled_log.value
    }
  }

  lifecycle {
    ignore_changes = [enabled_log, log, metric] # Workaround for bug https://github.com/terraform-providers/terraform-provider-azurerm/issues/10388
  }
}
The same problem exists for the Synapse workspace.
The diagnostic settings have no metrics, but an AllMetrics category is created automatically, and every run forces Terraform to remove it:
- metric {
- category = "AllMetrics" -> null
- enabled = false -> null
- retention_policy {
- days = 0 -> null
- enabled = false -> null
}
}
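One way to quiet that particular diff, assuming the provider still accepts an explicitly disabled metric block for this resource, is to declare the auto-created category so the configuration matches what the API returns:

```hcl
# Sketch: mirror the AllMetrics category Azure creates automatically,
# with the same defaults the API reports (enabled false, retention 0),
# so the plan no longer shows it as a removal.
metric {
  category = "AllMetrics"
  enabled  = false

  retention_policy {
    days    = 0
    enabled = false
  }
}
```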
Terraform (and AzureRM Provider) Version
Affected Resource(s)
azurerm_monitor_diagnostic_setting when applying to Azure SQL DB and Synapse Pool
Expected Behaviour
After running terraform apply, a subsequent terraform plan/apply should report no changes.
Actual Behaviour
After running terraform plan/apply, Terraform will apply the same changes again and again. Below is a screenshot of what I see every time I re-run terraform plan, i.e. it applies those same changes every time, but Terraform should only update what has actually changed:
Steps to Reproduce
terraform plan
Code
The resource created and the azurerm_monitor_diagnostic_setting used to send resource logs out: Azure SQL DB:
Azure Synapse Pool