ironcore-dev / metal-operator

Kubernetes operator for automating bare metal server discovery and provisioning
Apache License 2.0

Retry with `ForcePowerOff` if graceful shutdown times out #129

Closed defo89 closed 2 weeks ago

defo89 commented 2 weeks ago

Proposed Changes

defo89 commented 2 weeks ago

Hmm, I need to revisit this. It worked on Dell, but on Lenovo I get `reset type 'PushPowerButton' is not supported by this service`:

2024-09-17T12:56:20.219583196Z 2024-09-17T12:56:20Z ERROR   Reconciler error    {"controller": "server", "controllerGroup": "metal.ironcore.dev", "controllerKind": "Server", "Server": {"name":"node006-bb00-system-0"}, "namespace": "", "name": "node006-bb00-system-0", "reconcileID": "1e290486-c86f-4df4-90a5-5c55ab438ae1", "error": "failed to ensure server state transition: failed to ensure server power state: failed to power off server: failed to reset system to power on state: reset type 'PushPowerButton' is not supported by this service"}

stefanhipfel commented 2 weeks ago

LGTM as well. We could also use the `metal.ironcore.dev/operation` annotation to force a power off, but maybe it is good that this happens automatically.

defo89 commented 2 weeks ago

> LGTM as well. We could also use the `metal.ironcore.dev/operation` annotation to force a power off, but maybe it is good that this happens automatically.

Good point. I see the `metal.ironcore.dev/operation` annotation as a troubleshooting tool for triggering the operation manually, and as a way to force a power off for users who do not set the `--enforce-power-off` option.