SILVER - Issues adding MCS-SILVER-APP-21.DMZ to internal DXCAS Hardware management system

wmhutchison commented 4 days ago

Describe the issue

For historical reasons that almost certainly no longer apply, the Platform Operations team did not have any of the HPE servers registered in the HPE OneView management system. A recent directive requires this to change, so the Platform Operations team was asked to re-add all matching servers to HPE OneView.

This was successful for all servers except MCS-SILVER-APP-21.DMZ. An RFC will be entered to drain this node of all workloads before escalating, in case the resolution requires the server to be powered off.

How does this benefit the users of our platform?

Timely resolution ensures that all hardware supporting our OpenShift platforms has full, working hardware monitoring. The back-end tool auto-generates support tickets, which saves the Platform Operations team the time of raising them manually.

Definition of done

[x] Create an RFC and get approval/comms out to drain the node so that workloads are fully protected moving forward. (CHG0060715)
[x] Create a ticket with DXCAS DCM team regarding the issue to see if they can resolve or not. (RITM0185822)
[ ] ~~If DCM cannot resolve, open a vendor case and obtain next steps for resolving this.~~
[x] Confirm resolution by adding the node in question to OneView.

wmhutchison commented 3 days ago

The RFC for draining the node in question was created and approved, and comms were posted for June 26th. A separate ticket was also created for DCM, who stated that they will need to reboot the server. I gave them the details of the RFC and will ping them once MCS-SILVER-APP-21.DMZ has been drained of all workloads.

wmhutchison commented 2 days ago

The RFC for draining the node has been executed. DCM attempted to reset the Administrator password using a local console terminal and succeeded. Platform Ops was then able to access the ILOM again and set up the node in HPE OneView. Closing ticket as resolved.

BCDevOps / developer-experience

SILVER - Issues adding MCS-SILVER-APP-21.DMZ to internal DXCAS Hardware management system #4927