BCDevOps / developer-experience

This repository is used to track all work for the BCGov Platform Services Team (this includes work for: 1. Platform Experience, 2. Developer Experience, 3. Platform Operations/OCP 3)
Apache License 2.0

SILVER - Issues adding MCS-SILVER-APP-21.DMZ to internal DXCAS Hardware management system #4927

Closed: wmhutchison closed this issue 2 days ago

wmhutchison commented 4 days ago

Describe the issue

For historic reasons which almost certainly no longer apply, the Platform Operations team did not have any of the HPE servers registered in the HPE OneView management system. A recent directive is forcing this to change, so the Platform Operations team was asked to re-add all matching servers to HPE OneView.

This was successful for all servers except MCS-SILVER-APP-21.DMZ. An RFC will be entered to drain this node of all workloads before the issue is escalated, in case the fix requires the server to be powered off.

How does this benefit the users of our platform?

Timely resolution ensures that all hardware supporting our OpenShift platforms has full, working hardware monitoring. The back-end tool auto-generates support tickets, which saves the Platform Operations team the time of raising them manually.

wmhutchison commented 3 days ago

The RFC for draining the node in question was created and approved, and comms were posted for June 26th. A separate ticket was also created for DCM, who stated that they will need to reboot the server. I gave them the details of the RFC and will ping them once MCS-SILVER-APP-21.DMZ has been drained of all workloads.

wmhutchison commented 2 days ago

The RFC for draining the node has been executed and is complete. DCM attempted to reset the Administrator password from a local console terminal and succeeded. Platform Ops was then able to access the ILOM again and set up the node in HPE OneView. Closing ticket as resolved.