houssems closed this issue 11 months ago
Hi @houssems, could you add the code you used to create the cluster to make sure that we have accurate repro steps for this issue? Thanks!
Hi @lblackstone, I've updated the config files
Had a similar problem; I used this to add an additional agent pool: https://www.pulumi.com/registry/packages/azure-native/api-docs/containerservice/agentpool/. I think it's an ARM issue and not a Pulumi issue. I think this stems from the time when AKS could only have one agent pool, but I'm not 100% certain.
@lblackstone we have the same issue
If you create an AKS cluster with the ManagedCluster resource and change elements in the AgentPoolProfiles, Pulumi wants to recreate the cluster.
For any change where Azure says you have to recreate the node pool, Pulumi wants to recreate the whole cluster. IMO Pulumi should not recreate the cluster on node pool changes.
Another note: if you add a new node pool, e.g. `newnodepool`, and remove the config for `oldnodepool`, Pulumi will not detect this change correctly. Instead of deleting `oldnodepool` and creating `newnodepool`, it will try to update `oldnodepool` with `newnodepool`'s settings. My guess is that Pulumi diffs the array without taking the node pool name into account; matching by name would be the better behaviour.
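The guessed positional-diff behaviour can be illustrated with a small sketch (plain Python, purely illustrative; this is not Pulumi's actual diff engine, and the pool objects are made-up examples):

```python
def diff_by_index(old, new):
    """Naive positional diff: pairs items up by array index."""
    ops = []
    for i in range(max(len(old), len(new))):
        if i >= len(old):
            ops.append(("create", new[i]["name"]))
        elif i >= len(new):
            ops.append(("delete", old[i]["name"]))
        elif old[i] != new[i]:
            ops.append(("update", old[i]["name"], new[i]["name"]))
    return ops

def diff_by_name(old, new):
    """Keyed diff: matches items by their 'name' identifier."""
    old_by = {p["name"]: p for p in old}
    new_by = {p["name"]: p for p in new}
    ops = [("delete", n) for n in old_by if n not in new_by]
    ops += [("create", n) for n in new_by if n not in old_by]
    ops += [("update", n, n) for n in new_by
            if n in old_by and old_by[n] != new_by[n]]
    return ops

old_pools = [{"name": "oldnodepool", "maxPods": 30}]
new_pools = [{"name": "newnodepool", "maxPods": 110}]

# Positional pairing tries to "update" oldnodepool into newnodepool,
# which is the surprising behaviour described above.
print(diff_by_index(old_pools, new_pools))
# Keyed pairing deletes oldnodepool and creates newnodepool instead.
print(diff_by_name(old_pools, new_pools))
```

Matching by `name` (the second routine) would turn the spurious in-place update into the delete-and-create the user expects.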
Versions: pulumi-cli v3.37.2
```xml
<ItemGroup>
  <PackageReference Include="Pulumi" Version="3.*" />
  <PackageReference Include="Pulumi.Azure" Version="5.14.0" />
  <PackageReference Include="Pulumi.AzureAD" Version="5.26.1" />
  <PackageReference Include="Pulumi.AzureNative" Version="1.*" />
  <PackageReference Include="Pulumi.Kubernetes" Version="3.20.2" />
  <PackageReference Include="Pulumi.Tls" Version="4.6.0" />
</ItemGroup>
```
We faced similar issues with AKS recreation when we refreshed the cluster state. Azure itself reorganizes some values, including the subnets, if you deploy them internally as part of the AKS template, and Pulumi then tries to recreate the whole cluster. This affects some agent pools as well, since Azure Spot pools are forced by Azure's internal system to be tagged in a special way.
Another common issue is adding node pools. When you refresh the base cluster setup, Pulumi makes the wrong correlation between the node pool used during cluster creation and any of the new ones, which leads to a node pool replacement and then a deletion.
My initial reading of this issue thread indicates there might be two separate problems being encountered:

1. `maxPods` within `agentPoolProfiles` is marked as `forceNew: true`, so changing the `maxPods` property forces recreation.
2. `agentPoolProfiles` changes during a refresh because it's not stored as ordered in Azure; it's really a set.

This appears to be behaving as defined by the documentation:
> If you want to change various immutable settings on existing node pools, you can create new node pools to replace them. One example is to add a new node pool with a new maxPods setting and delete the old node pool.
Therefore, as it stands, this appears to be working entirely correctly, since attempting to update this number in place would result in a failed update.
While technically correct, the non-ideal behaviour here is that we don't actually need to recreate the whole cluster; we could just recreate the node pool instead. That would happen if all node pools were modelled as independent resources, but in this instance system node pools cannot be managed that way.
We should ensure that users can:

- Update their `agentPoolProfiles` with their new desired settings.

The simplest way of ensuring this might be to disable the `forceNew: true` on all `agentPoolProfiles` fields.
Provider changes:

- Disable `forceNew: true` on all `ManagedClusterAgentPoolProfile` properties.
- Improve the diff when `agentPoolProfiles` has changed:
Open Questions:

- `agentPoolProfiles` ordering: The specification doesn't appear to give any hint about which field is the unique identifier, so we can only infer that the array should be treated as a set. On `agentPoolProperties` it defines `"x-ms-identifiers": []`; it's missing `name` as the identifier.
Reference: spec for x-ms-identifiers
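To make the discussion concrete, here is a hedged sketch of what honouring `x-ms-identifiers` during array diffing could look like (plain Python, illustrative only; the schema fragment assumes the spec were fixed to list `name`, and the helper names are invented, not the provider's real code):

```python
# Hypothetical schema fragment: as the spec stands today, "x-ms-identifiers"
# is empty for agentPoolProfiles; a fixed spec would list "name".
schema = {"agentPoolProfiles": {"x-ms-identifiers": ["name"]}}

def array_key(prop):
    """Return the identifier fields declared for an array property, if any."""
    ids = schema.get(prop, {}).get("x-ms-identifiers", [])
    return tuple(ids) or None

def match_items(prop, old, new):
    """Pair old/new array items by their identifier instead of by position."""
    key = array_key(prop)
    if key is None:
        # No identifier declared: fall back to positional pairing.
        return list(zip(old, new))
    index = {tuple(item[k] for k in key): item for item in old}
    return [(index.get(tuple(item[k] for k in key)), item) for item in new]

old = [{"name": "systempool", "maxPods": 30}, {"name": "oldnodepool", "maxPods": 30}]
new = [{"name": "oldnodepool", "maxPods": 30}, {"name": "systempool", "maxPods": 30}]

# With the identifier, a pure reordering pairs each pool with itself, so a
# refresh that shuffles the array produces no spurious diff.
pairs = match_items("agentPoolProfiles", old, new)
```

Under this strategy, the set-like refresh behaviour described above would no longer surface as a change, and a removed pool would simply have no match rather than being "updated" into its positional neighbour.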
The clearest route here would be to:

- Support `x-ms-identifiers` in our metadata & diffing strategy.

Since we've disabled the forced recreation of the AKS cluster on `agentPoolProfiles` changes in #2774, the core of this issue is now resolved. I've logged a follow-up issue to also improve how we handle diffing arrays where there's a known key for items:
I've also logged a separate issue to investigate re-enabling sub-property resource recreation correctly:
What happened?
I added scaling for a pool in AKS. When I run `pulumi up`, it recreates the cluster from scratch.
Steps to reproduce
Update the cluster config with anything and the cluster gets recreated. For me it was the scaling of a node pool.
and config file was
to
Expected Behavior
Update only the node pool config in Azure
Actual Behavior
Cluster got deleted and recreated
Versions used
v3.34.1
Additional context
No response
Contributing
Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).