datacenter / ACI-Pre-Upgrade-Validation-Script

A script to run validations to detect potential issues that may cause an ACI fabric upgrade to fail
https://datacenter.github.io/ACI-Pre-Upgrade-Validation-Script/
Apache License 2.0

NewValidation: CSCwf58763 #164

Open welkin-he opened 1 month ago

welkin-he commented 1 month ago

(Upvote with :thumbsup: to draw attention to this request.)

Validation Type

[x] - Bug

What needs to be validated

A fabricRsDecommissionNode MO whose node ID matches any existing spine switch, when upgrading to 5.2(3e) or later, up to 6.0(3d).
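A minimal sketch of how the target-version condition above could be tested. The function names are hypothetical and not taken from the script, and the upper bound 6.0(3d) is treated as exclusive, which is an assumption: the issue text does not say whether 6.0(3d) itself is affected.

```python
import re

# Matches ACI version strings such as "5.2(3e)" or "6.0(3d)".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_version(text):
    """Turn '5.2(3e)' into a comparable tuple (5, 2, 3, 'e')."""
    m = _VERSION_RE.match(text.strip())
    if m is None:
        raise ValueError("unrecognized ACI version: %r" % text)
    major, minor, maint, patch = m.groups()
    return (int(major), int(minor), int(maint), patch)


def target_in_affected_range(target, first_affected="5.2(3e)", upper_bound="6.0(3d)"):
    """Return True when first_affected <= target < upper_bound.

    ASSUMPTION: upper_bound is exclusive (treated as the first unaffected release).
    """
    return parse_version(first_affected) <= parse_version(target) < parse_version(upper_bound)
```

With these bounds, a target of 5.2(8f) or 6.0(2h) would be checked, while 5.2(3d) and 6.0(3d) would be skipped.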

Why it needs to be validated

The stale MO can cause the spine to be removed from the COOP adjacency list on all leaf switches, while the spine keeps participating in forwarding decisions without up-to-date knowledge of the fabric.

monrog2 commented 1 month ago

#166 asked for a check on fabricRsDecommissionNode in general.

CSCwf58763 specifically says to flag DNs whose node ID exists in the fabric, even when the pod recorded in the stale DN differs from the pod where the node currently resides.

"In this example, Node 401/402 are Spines in Pod 2 but they are stale as if they are in Pod 1.

node-1# moquery fabricRsDecommissionNode | egrep "^dn|^_svc"
_svc                : policydist
dn                  : uni/fabric/outofsvc/rsdecommissionNode-[topology/pod-1/node-402]
_svc                : policydist
dn                  : uni/fabric/outofsvc/rsdecommissionNode-[topology/pod-1/node-401]
_svc                : policymgr
dn                  : uni/fabric/outofsvc/rsdecommissionNode-[topology/pod-1/node-401]
_svc                : policymgr
dn                  : uni/fabric/outofsvc/rsdecommissionNode-[topology/pod-1/node-402]

"
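The matching described above could be sketched roughly as follows. This is not the code from the linked branch: the function name and the input shapes are hypothetical, and it assumes the fabricRsDecommissionNode DNs and the fabricNode objects (with `dn` and `role` attributes) have already been collected, for example via moquery.

```python
import re

# Pod and node ID embedded in a fabricRsDecommissionNode DN.
_DECOM_DN_RE = re.compile(r"rsdecommissionNode-\[topology/pod-(\d+)/node-(\d+)\]")
# Pod and node ID of a fabricNode DN.
_NODE_DN_RE = re.compile(r"^topology/pod-(\d+)/node-(\d+)$")


def find_stale_decommission_nodes(decom_dns, fabric_nodes):
    """Return sorted (node_id, pod_in_stale_dn, current_pod) tuples.

    decom_dns:    iterable of fabricRsDecommissionNode DN strings
    fabric_nodes: iterable of dicts with 'dn' and 'role' keys (assumed shape)
    """
    # Map node ID -> current pod for every existing spine.
    spines = {}
    for node in fabric_nodes:
        m = _NODE_DN_RE.match(node["dn"])
        if m and node.get("role") == "spine":
            spines[m.group(2)] = m.group(1)

    # Match on node ID only; the pod in the stale DN may differ from the
    # spine's current pod. A set drops the policydist/policymgr duplicates.
    flagged = set()
    for dn in decom_dns:
        m = _DECOM_DN_RE.search(dn)
        if m and m.group(2) in spines:
            flagged.add((m.group(2), m.group(1), spines[m.group(2)]))
    return sorted(flagged)
```

Fed the four DNs from the example and spines 401/402 located in pod 2, this would report both nodes once each, with stale pod 1 and current pod 2.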

welkin-he commented 1 month ago

Cool, thanks for putting the two validations together. Here is the initial commit for peer review: https://github.com/datacenter/ACI-Pre-Upgrade-Validation-Script/tree/welkin-issue-164