NearNodeFlash / NearNodeFlash.github.io

View this document https://nearnodeflash.github.io/
Apache License 2.0

StorageProfile webhook: cannot set both combinedMgtMdt and externalMgs #138

Closed by roehrich-hpe 3 months ago

roehrich-hpe commented 4 months ago

Deploying produces the following error, due to our local customization to use an externalMgs on these systems:

Error from server (Forbidden): admission webhook "vnnfstorageprofile.kb.io" denied the request: cannot set both combinedMgtMdt and externalMgs
make: *** [Makefile:276: deploy] Error 1

Exit Error: exit status 2 (2)
nnf-deploy: error: exit status 2

The storage profile contains:

# Dedicated MGS NIDs for rzvernal
lustreStorage:
  capacityMdt: 64GiB
  capacityMgt: 64GiB
  externalMgs: <NID>@kfi
  combinedMgtMdt: false
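
The rule behind the error message can be sketched as follows. This is a minimal illustration, not the actual nnf-sos webhook code: the function name validate_lustre_storage is hypothetical, and it assumes that combinedMgtMdt counts as "set" only when it is true. Under that assumption, the profile shown above passes the check on its own.

```python
def validate_lustre_storage(lustre: dict) -> None:
    """Hypothetical stand-in for the webhook rule; raises ValueError on conflict."""
    # Assumption: a boolean field counts as "set" only when it is true.
    combined = bool(lustre.get("combinedMgtMdt", False))
    # Assumption: an empty or missing externalMgs counts as "not set".
    external = lustre.get("externalMgs", "")
    if combined and external:
        raise ValueError("cannot set both combinedMgtMdt and externalMgs")


# The lustreStorage section from the customized profile above.
profile = {
    "capacityMdt": "64GiB",
    "capacityMgt": "64GiB",
    "externalMgs": "<NID>@kfi",
    "combinedMgtMdt": False,
}

# Does not raise under the assumptions stated above.
validate_lustre_storage(profile)
```

If this model is right, the rejection has to come from something other than the customized profile as written.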

roehrich-hpe commented 4 months ago

From Brian: "I had v0.0.9 deployed with a customized Nnfstorageprofiles/placeholder, I ran nnf-deploy deploy to roll out v0.0.10 and saw the above error. I then removed the customization from Nnfstorageprofiles/placeholder, ran nnf-deploy deploy again cleanly, and manually reapplied the changes to Nnfstorageprofiles/placeholder"

roehrich-hpe commented 3 months ago

kubectl apply merges the live version of the resource with the new one before sending the result to the webhook. So if the live resource has this:

  lustreStorage:
    combinedMgtMdt: false
    exclusiveMdt: false
    externalMgs: 10.0.0.1@kfi

And the new one, from nnf-sos/config/examples/nnf_v1alpha1_nnfstorageprofile.yaml, has only this:

  lustreStorage:
    combinedMgtMdt: true

Then the merged result is this:

  lustreStorage:
    combinedMgtMdt: true
    exclusiveMdt: false
    externalMgs: 10.0.0.1@kfi

The merged resource has combinedMgtMdt: true together with externalMgs, which trips the webhook validation.
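
The merge described above can be modeled roughly like this. It is a simplified sketch, not how kubectl is implemented: the real kubectl apply performs a three-way merge that also consults the last-applied-configuration annotation, and the helper name merge is hypothetical. The sketch only captures the behavior seen here, where fields in the new manifest win and fields present only in the live resource survive.

```python
def merge(live: dict, new: dict) -> dict:
    """Simplified model: keys in `new` win, keys only in `live` are kept."""
    merged = dict(live)
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse into nested sections such as lustreStorage.
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Live resource on the cluster, carrying the local customization.
live = {
    "lustreStorage": {
        "combinedMgtMdt": False,
        "exclusiveMdt": False,
        "externalMgs": "10.0.0.1@kfi",
    }
}

# The example manifest being applied during the upgrade.
new = {"lustreStorage": {"combinedMgtMdt": True}}

merged = merge(live, new)
lustre = merged["lustreStorage"]

# Assumed webhook condition: both fields are effectively set.
conflict = bool(lustre.get("combinedMgtMdt")) and bool(lustre.get("externalMgs"))
print(merged)
print("conflict:", conflict)
```

Neither input conflicts on its own; only the merged resource does, which matches the observation that removing the customization before deploying and reapplying it afterwards avoids the error.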

roehrich-hpe commented 3 months ago

The resource defined in nnf-sos/config/examples/nnf_v1alpha1_nnfstorageprofile.yaml was given the name placeholder because it was not intended to be the actual default profile on a cluster. We muddled that story by setting default: true in that resource.

roehrich-hpe commented 3 months ago

This is resolved by:

The nnf-sos PR: https://github.com/NearNodeFlash/nnf-sos/pull/279
The nnf-deploy PR: https://github.com/NearNodeFlash/nnf-deploy/pull/145
The NearNodeFlash.github.io documentation PR: https://github.com/NearNodeFlash/NearNodeFlash.github.io/pull/140