netascode / terraform-aci-nac-aci

Terraform Cisco ACI Nexus-as-Code Module
https://registry.terraform.io/modules/netascode/nac-aci/aci
Apache License 2.0

Maintenance groups #60

Closed jorgenspange closed 1 month ago

jorgenspange commented 4 months ago

Maintenance groups do not seem to work. When I apply them, they show up as failed because no version is specified.

Using them also fails: [image] [image]

Here is my config:

---
apic:
  node_policies:
    update_groups:
      - name: odd
      - name: even
---
apic:
  node_policies:
    nodes:
      - id: 2101
        pod: 1
        role: leaf
        update_group: odd
      - id: 2102
        pod: 1
        role: leaf
        name: LabDrLeaf2102
        update_group: even
      - id: 2901
        pod: 1
        role: spine
        update_group: odd

danischm commented 4 months ago

This is kind of expected: the update itself is not supposed to be triggered from Nexus-as-Code, so it would not make sense to set the target version with Nexus-as-Code either. That is typically done either via the GUI or some other automation. The "overlap" validation error seems to be unrelated. Do you have any other groups preconfigured?

jorgenspange commented 4 months ago

@danischm Agreed, that makes sense. I was only wondering whether that was a symptom of the other problem. These are the only two groups I have defined: [image]

andbyrne commented 4 months ago

We really only need the aci_maintenance_group module. Creation of the objects in that module results in the APIC automatically creating the objects that we also create in the aci_firmware_group module.

I believe in this case, we could deprecate the aci_firmware_group module altogether. You can verify the behaviour I described above by passing the following options and then checking the object store browser to confirm that the firmware objects have also been created:

modules:
  aci_firmware_group: false

Note: This is not the source of the overlap issue. I have seen that also in the past, but could never pin down the root cause.

jorgenspange commented 4 months ago

@andbyrne Thanks! Where am I supposed to specify this?

alexanderdeca commented 4 months ago

> @andbyrne Thanks! Where am I supposed to specify this?

My educated guess would be modules > nac-aci > aci_node_policies.tf: comment out the existing module and add modules: aci_firmware_group: false

Don't mind me, follow the experts :)

danischm commented 4 months ago

It can be anywhere. You can for example create a modules.yaml file with the mentioned content in the data/ directory.
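For orientation, here is a minimal sketch of a root module that would pick up such a file. It assumes the YAML files, including a modules.yaml with the two lines from the earlier comment, live in a data/ directory next to the Terraform code. The module label "aci" and the single manage_node_policies flag are illustrative assumptions; check the inputs against the module version in use.

```hcl
# Sketch only: the label "aci" and the flags enabled here are assumptions.
module "aci" {
  source = "netascode/nac-aci/aci"

  # Every YAML file in data/ is read and merged, so data/modules.yaml
  # (containing "modules:" / "aci_firmware_group: false") is picked up
  # alongside the node policy files shown earlier in this issue.
  yaml_directories = ["data"]

  # Update groups are part of the node policies.
  manage_node_policies = true
}
```

Because all files in the directory are merged, the file name itself should not matter, which matches the "it can be anywhere" remark.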

jorgenspange commented 4 months ago

@andbyrne I tested disabling the firmware group module as you described, and it does look like it's not needed. As you say, the overlap problem is still there.

andbyrne commented 3 months ago

I've found the root cause. When you go through an upgrade process, the APIC will try to recreate the fabricNodeBlk objects with a name in the format blk<node>-<node> whereas this module creates them with a name in the format <node>.
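To make the mismatch concrete, the two name formats can be compared side by side with a small sketch. The node IDs are taken from the configuration posted earlier in this issue; the local and output names are hypothetical and only serve the illustration.

```hcl
locals {
  # Node IDs from the configuration posted earlier in this issue.
  node_ids = [2101, 2102, 2901]

  # Format the module uses today: "<node>"
  module_names = [for id in local.node_ids : tostring(id)]

  # Format the APIC uses when it recreates the blocks: "blk<node>-<node>"
  apic_names = [for id in local.node_ids : "blk${id}-${id}"]
}

# Hypothetical output, present only to inspect both lists.
output "node_block_names" {
  value = {
    module = local.module_names
    apic   = local.apic_names
  }
}
```

For node 2101 this gives "2101" from the module and "blk2101-2101" from the APIC, so the same node ends up described by two differently named blocks.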

The following update to the terraform-aci-maintenance-group module appears to resolve the issue. It doesn't appear that the equivalent fabricNodeBlk objects in the terraform-aci-firmware-group module need to be changed. I'll raise a PR once I have verified the changes.

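# One fabricNodeBlk child per node ID, named "blk<node>-<node>" to match
# the name format the APIC uses when it recreates the blocks during an upgrade.
resource "aci_rest_managed" "fabricNodeBlk" {
  for_each   = toset([for id in var.node_ids : tostring(id)])
  dn         = "${aci_rest_managed.maintMaintGrp.dn}/nodeblk-blk${each.value}-${each.value}"
  class_name = "fabricNodeBlk"
  content = {
    name = "blk${each.value}-${each.value}"
    # Each block covers exactly one node, so the range starts and ends at the same ID.
    from_ = each.value
    to_   = each.value
  }
}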