afritzler opened 2 months ago
To kick off some discussion, here is a short sketch, which is more or less a trimmed-down version of the NodeMaintenance proposal.
We introduce the `ServerMaintenance` custom resource to signal maintenances on `Servers`.
```go
// +enum
type ServerMaintenanceStage string

const (
	// Idle announces a maintenance.
	Idle ServerMaintenanceStage = "Idle"
	// InMaintenance shuts down servers, potentially releasing server claims.
	InMaintenance ServerMaintenanceStage = "InMaintenance"
	// Complete makes servers ready again, as long as no other maintenance is applied to them.
	Complete ServerMaintenanceStage = "Complete"
)

type ServerMaintenance struct {
	...
	Spec   ServerMaintenanceSpec
	Status ServerMaintenanceStatus
}

type ServerMaintenanceSpec struct {
	// ServerSelector selects servers for this maintenance.
	ServerSelector *v1.ServerSelector
	// Stage of the maintenance. The order of the stages is Idle -> InMaintenance -> Complete.
	// The default value is Idle.
	Stage ServerMaintenanceStage
	// Reason for the maintenance.
	Reason string
}
```
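To make the selection semantics concrete, here is a minimal, self-contained Go sketch of how a controller might resolve which `Servers` a maintenance applies to. The `ServerSelector` below is a simplified matchLabels-style stand-in, since the real type is not spelled out above; all names in this sketch are assumptions.

```go
package main

import "fmt"

// ServerSelector is a simplified, matchLabels-style stand-in for the real
// (unspecified) selector type from the proposal.
type ServerSelector struct {
	MatchLabels map[string]string
}

// Server is a stripped-down stand-in for the Server CR.
type Server struct {
	Name   string
	Labels map[string]string
}

// selectServers returns the servers whose labels match all entries of the selector.
func selectServers(sel ServerSelector, servers []Server) []Server {
	var selected []Server
	for _, s := range servers {
		matches := true
		for k, v := range sel.MatchLabels {
			if s.Labels[k] != v {
				matches = false
				break
			}
		}
		if matches {
			selected = append(selected, s)
		}
	}
	return selected
}

func main() {
	servers := []Server{
		{Name: "server-a", Labels: map[string]string{"rack": "r01"}},
		{Name: "server-b", Labels: map[string]string{"rack": "r02"}},
	}
	maintenanceSelector := ServerSelector{MatchLabels: map[string]string{"rack": "r01"}}
	for _, s := range selectServers(maintenanceSelector, servers) {
		fmt.Println(s.Name) // prints "server-a"
	}
}
```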
As soon as the `Stage` is set to `InMaintenance`, all selected `Servers` will have their `.spec.power` set to `Off`. The respective `ServerClaims`, if any, can either be re-bound or kept. Likely, such a setting needs to be added to a `ServerClaim`.
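Such a claim-side setting could look like the following hypothetical sketch. The `MaintenancePolicy` field and its values are assumptions for illustration, not part of the proposal:

```go
package main

import "fmt"

// MaintenancePolicy is a hypothetical claim-side setting deciding what
// happens to a ServerClaim when its Server enters maintenance.
type MaintenancePolicy string

const (
	// Rebind releases the claim so it can bind to another Server.
	Rebind MaintenancePolicy = "Rebind"
	// Keep holds the claim on the powered-off Server through the maintenance.
	Keep MaintenancePolicy = "Keep"
)

// ServerClaimSpec is reduced to the hypothetical new field for this sketch.
type ServerClaimSpec struct {
	MaintenancePolicy MaintenancePolicy
}

func main() {
	claim := ServerClaimSpec{MaintenancePolicy: Rebind}
	fmt.Println(claim.MaintenancePolicy) // prints "Rebind"
}
```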
On completion, all selected `Servers` will have `.spec.power` set to `On`, provided no other `ServerMaintenance` with stage `InMaintenance` selects the `Server`.
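The "no other active maintenance" check could look roughly like the following sketch. It assumes maintenances have already been resolved to lists of server names; a real implementation would evaluate the selector instead:

```go
package main

import "fmt"

type ServerMaintenanceStage string

const (
	Idle          ServerMaintenanceStage = "Idle"
	InMaintenance ServerMaintenanceStage = "InMaintenance"
	Complete      ServerMaintenanceStage = "Complete"
)

// ServerMaintenance is simplified: the selector is pre-resolved to names.
type ServerMaintenance struct {
	Stage           ServerMaintenanceStage
	SelectedServers []string
}

// mayPowerOn reports whether a server can be powered back on: only when no
// ServerMaintenance with stage InMaintenance still selects it.
func mayPowerOn(server string, maintenances []ServerMaintenance) bool {
	for _, m := range maintenances {
		if m.Stage != InMaintenance {
			continue
		}
		for _, s := range m.SelectedServers {
			if s == server {
				return false
			}
		}
	}
	return true
}

func main() {
	ms := []ServerMaintenance{
		{Stage: Complete, SelectedServers: []string{"server-a"}},
		{Stage: InMaintenance, SelectedServers: []string{"server-a"}},
	}
	fmt.Println(mayPowerOn("server-a", ms)) // false: another maintenance is still active
	fmt.Println(mayPowerOn("server-b", ms)) // true
}
```

Because each maintenance is its own object, two overlapping maintenances cannot fight over a single power field: the server only comes back on once the last `InMaintenance` object moves on.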
Completed `ServerMaintenances` need to be garbage collected eventually.
The cloud-provider-manager-metal can have a watch on `ServerMaintenances` and forward information about them into its Kubernetes cluster.
Do we need some mechanism to block/deny maintenances on `Servers`?
@Nuckal777 thanks for the input!
I'm pretty sure that we must not power off servers in such an implicit manner. For instance, firmware updates, especially those which require a server reboot, should also be considered maintenance. Hence, it might not be reasonable to force a server power-off.
Apart from that, any of the controllers which work with the `Server` CR would have to loop over `ServerMaintenance` objects to find out whether a server is in maintenance or not. This could be solved by an additional field or condition in the server's status; however, it might cause a data race between controllers as soon as both the spec (to switch power) and the status (to set the maintenance state) have to be updated.
From my perspective, the maintenance state should be represented in the `.spec` of the `Server` CR:
> From my perspective, the maintenance state should be represented in the `.spec` of the `Server` CR
There is an issue with that design: when two controllers have powered off a `Server` and one controller is finished, the finished controller will set the power state to on and the second controller back to off. The controllers start fighting over that field. metal3 uses a similar design for that reason. Further reasoning is in the declarative node API proposal.
> For instance, firmware updates, especially those which require a server reboot, should also be considered maintenance.
Why couldn't the firmware upgrade handling create a `ServerMaintenance` and delete it after finishing?
> Apart from that, any of the controllers which work with the `Server` CR would have to loop over `ServerMaintenance` objects to find out whether a server is in maintenance or not.
I agree. This would be a disadvantage.
I like the idea of having an own resource initiating the `Maintenance` of a `Server`. Regarding the `ServerClaim` binding: power management and other updates will only happen when the `Server` is in a `Reserved` state, meaning once a `Server` is in `Maintenance`, only the `MaintenanceReconciler` should perform operations on the `Server`, e.g. creating a `BootConfiguration` to boot into the BIOS/firmware update mode etc.
> There is an issue with that design: when two controllers have powered off a `Server` and one controller is finished, the finished controller will set the power state to on and the second controller back to off. The controllers start fighting over that field. metal3 uses a similar design for that reason. Further reasoning is in the declarative node API proposal.
As @afritzler mentioned, it should be done in such a way that a particular controller may manage a server's power state only if the server is in a particular state: if the server is in the `Reserved` state, only the `ServerClaim` controller can power it on/off; if the server is in the `Maintenance` state, only the `Maintenance` controller can power it on/off; etc.
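A minimal sketch of that ownership rule could be a simple dispatch on the server's state. The state and controller names here are taken from the discussion, but the function itself is a hypothetical illustration:

```go
package main

import "fmt"

// ServerState is a stand-in for the Server CR's state field.
type ServerState string

const (
	Reserved    ServerState = "Reserved"
	Maintenance ServerState = "Maintenance"
)

// powerOwner returns which controller is allowed to change .spec.power for a
// server in the given state; any other controller must leave the field alone.
func powerOwner(state ServerState) string {
	switch state {
	case Reserved:
		return "ServerClaim controller"
	case Maintenance:
		return "Maintenance controller"
	default:
		return "none"
	}
}

func main() {
	fmt.Println(powerOwner(Reserved))    // prints "ServerClaim controller"
	fmt.Println(powerOwner(Maintenance)) // prints "Maintenance controller"
}
```

With such a rule, the field-fighting scenario above cannot occur, because at any point in time exactly one controller is allowed to write the power field.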
> Why couldn't the firmware upgrade handling create a `ServerMaintenance` and delete it after finishing?
It definitely could, but I was pointing out that it is not necessarily needed to power off the server in the maintenance state.
Summary

Describe a concept on how we can put a `Server` into `Maintenance` mode. Also evaluate what it would mean to move it back into an operational state.

The Kubernetes community has a KEP extending the `Node` API by a declarative approach to handling `Node` maintenance: https://github.com/kubernetes/enhancements/issues/4212. We should see how those concepts can be applied to our `Server` objects.

Implications to consider: how the maintenance state influences the behaviour of controllers sitting on top of the `metal` API, e.g.:

Expected outcome: `metal` types