slzmruepp opened 1 year ago
I'm having a similar issue when I try to upgrade a cluster from v1.22 to v1.23 without `automatic_channel_upgrade`. My AKS configuration is pretty minimal:
```hcl
resource "azurerm_kubernetes_cluster" "k8s" {
  resource_group_name               = data.azurerm_resource_group.resource_group.name
  name                              = data.azurerm_resource_group.resource_group.name
  dns_prefix                        = local.cname
  location                          = data.azurerm_resource_group.resource_group.location
  kubernetes_version                = var.k8s_version
  role_based_access_control_enabled = true

  default_node_pool {
    name                = "default"
    min_count           = var.k8s_min_agent_count
    max_count           = var.k8s_max_agent_count
    enable_auto_scaling = true
    vm_size             = var.k8s_agent_size
    os_disk_size_gb     = 50
  }

  identity {
    type = "SystemAssigned"
  }

  tags = {
    terraform = true
  }
}
```
As you can see, I don't set any `orchestrator_version` explicitly in the `default_node_pool` configuration. According to the docs: "`orchestrator_version` - (Optional) Version of Kubernetes used for the Agents. If not specified, the default node pool will be created with the version specified by `kubernetes_version`." In my case, it should be v1.23 then.
If I inspect the payload of the PUT request made by terraform to the Azure API with `TF_LOG=trace`, though, I see that `kubernetes_version` and `orchestrator_version` are different (I omitted all the unrelated fields):
```json
{
  "properties": {
    "kubernetesVersion": "1.23",
    "agentPoolProfiles": [
      {
        "currentOrchestratorVersion": "1.22.15",
        "orchestratorVersion": "1.22"
      }
    ]
  }
}
```
And I get the same error from the API:
```
managedclusters.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="NotAllAgentPoolOrchestratorVersionSpecifiedAndUnchanged" Message="Using managed cluster api, all Agentpools' OrchestratorVersion must be all specified or all unspecified. If all specified, they must be stay unchanged or the same with control plane. For agent pool specific change, please use per agent pool operations: https://aka.ms/agent-pool-rest-api"
```
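The only workaround I can think of, and this is just a sketch I have not verified, is to pin the default node pool to the same version variable that drives `kubernetes_version`, so the PUT payload cannot send mismatched versions. This is the `default_node_pool` block from the resource above with one line added:

```hcl
# Unverified workaround sketch: set orchestrator_version explicitly to the same
# variable that feeds kubernetes_version, so control plane and pool are upgraded together.
default_node_pool {
  name                 = "default"
  orchestrator_version = var.k8s_version # same value as kubernetes_version above
  min_count            = var.k8s_min_agent_count
  max_count            = var.k8s_max_agent_count
  enable_auto_scaling  = true
  vm_size              = var.k8s_agent_size
  os_disk_size_gb      = 50
}
```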
I'm wondering if this could be related to #18130.
This issue has also affected us, and the behavior seems to have changed suddenly with the new version of the AzureRM provider released on Friday; that release may have introduced changes or regressions behind what many of you are seeing.
I recommend retesting your configurations with the new provider version to determine whether the issue persists or has been resolved. It would also be helpful if the Azure engineers could comment on this issue and on any related changes in the provider.
Terraform Version
1.3.7
AzureRM Provider Version
3.46.0
Affected Resource(s)/Data Source(s)
azurerm_kubernetes_cluster_node_pool
Terraform Configuration Files
Debug Output/Panic Output
Expected Behaviour
It should be possible to update tags on an AKS cluster and its associated node pool without failure.
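To make it concrete, the change we are trying to apply is of this shape (resource names, sizes and tag values below are illustrative, not our real configuration):

```hcl
# Illustrative tags-only change on the node pool resource; everything here is
# hypothetical except the resource type and attribute names.
resource "azurerm_kubernetes_cluster_node_pool" "user" {
  name                  = "user"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.k8s.id
  vm_size               = "Standard_D4s_v3"
  node_count            = 3

  tags = {
    cost_center = "platform" # only this value changes between applies
  }
}
```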
Actual Behaviour
We run a private AKS cluster with two node pools, user and system, and we have `automatic_channel_upgrade` enabled. There is a flaky Azure bug where, after an upgrade, the Azure GUI shows different versions under node pools: for example, the system node pool is on 1.24.9 while the user node pool still shows 1.24.6, yet when trying to upgrade the user node pool the GUI reports that the "current" version is already 1.24.9. `kubectl get nodes` also shows a consistent picture, with all nodes on 1.24.9. The API that terraform uses, however, obviously returns the wrong version, and in earlier runs we did not have the `orchestrator_version` lifecycle ignore enabled on the node pool, so terraform tried to upgrade the node pool to 1.24.9. This worked only some of the time; funnily, runs between roughly 08:00 and 17:00 CET usually failed, while later runs sometimes succeeded.
We then changed to:
but even the runs with the `orchestrator_version` ignore enabled fail, although the node pool and cluster are only updating tags:
The error above is from a run with the `orchestrator_version` lifecycle ignore enabled.
Steps to Reproduce
1. Create a cluster and node pool with auto channel upgrade set to stable.
2. Enable the Kubernetes version and orchestrator version lifecycle ignore on the cluster and node pool (a sketch of such lifecycle blocks follows this list).
3. Wait for Azure to run a stable version upgrade.
4. Try to reapply terraform while only updating tags.
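For step 2, the lifecycle blocks we mean look roughly like the following sketch (names, sizes and locations are placeholders, not our exact private-cluster configuration):

```hcl
# Illustrative sketch only (resource and variable values are hypothetical):
# ignore the version drift reported by the API so that tag-only applies
# do not try to change kubernetes_version / orchestrator_version.
resource "azurerm_kubernetes_cluster" "k8s" {
  name                = "example-aks"
  location            = "westeurope"
  resource_group_name = "example-rg"
  dns_prefix          = "example"

  default_node_pool {
    name       = "system"
    vm_size    = "Standard_D4s_v3"
    node_count = 3
  }

  identity {
    type = "SystemAssigned"
  }

  lifecycle {
    ignore_changes = [
      kubernetes_version,
      default_node_pool[0].orchestrator_version,
    ]
  }
}

resource "azurerm_kubernetes_cluster_node_pool" "user" {
  name                  = "user"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.k8s.id
  vm_size               = "Standard_D4s_v3"
  node_count            = 3

  lifecycle {
    ignore_changes = [
      orchestrator_version,
    ]
  }
}
```

Even with these ignore rules in place, the tag-only apply still fails with the NotAllAgentPoolOrchestratorVersionSpecifiedAndUnchanged error shown above.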
Important Factoids
No response
References
No response