apache / cloudstack

Apache CloudStack is an open source Infrastructure as a Service (IaaS) cloud computing platform
https://cloudstack.apache.org/
Apache License 2.0

Feature Idea: 'Host/Cluster Waiting For Maintenance' Mode #10019

Open btzq opened 1 day ago

btzq commented 1 day ago
ISSUE TYPE
COMPONENT NAME
Host? Cluster? I'm not sure
CLOUDSTACK VERSION
NA
CONFIGURATION
OS / ENVIRONMENT
SUMMARY

Current Capability

CloudStack currently offers a 'Maintenance' Mode, which facilitates the live migration of all VMs from a host and removes the host from the cluster for maintenance.

Proposed Feature: "Waiting for Maintenance" Mode

The proposed "Waiting for Maintenance" Mode introduces a preparatory state that addresses scenarios where live migration is impractical or impossible. This feature would enable gradual decommissioning or maintenance while avoiding service disruption.

General Idea of How It Might Work:

1. Operator Responsibilities:

2. CloudStack Responsibilities:

This is similar to how AWS handles it: https://aws.amazon.com/maintenance-help/
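
To make the idea concrete, here is a minimal sketch of how such a preparatory state could behave. It is only one reading of the proposal, not CloudStack code: the `HostState` and `Host` names, the state names, and the rules (no new VMs are placed on a waiting host, and the host moves to maintenance once its last VM has stopped) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class HostState(Enum):
    # Hypothetical state names, not CloudStack's actual resource states.
    ENABLED = "Enabled"
    WAITING_FOR_MAINTENANCE = "WaitingForMaintenance"
    MAINTENANCE = "Maintenance"


@dataclass
class Host:
    name: str
    state: HostState = HostState.ENABLED
    vms: set = field(default_factory=set)

    def start_vm(self, vm_id: str) -> None:
        # Assumption: only an enabled host accepts new VM placements.
        if self.state is not HostState.ENABLED:
            raise RuntimeError(f"{self.name} is {self.state.value}; no new VMs")
        self.vms.add(vm_id)

    def request_maintenance(self) -> None:
        # No live migration is attempted; running VMs are left untouched.
        self.state = HostState.WAITING_FOR_MAINTENANCE
        self._enter_maintenance_if_empty()

    def vm_stopped(self, vm_id: str) -> None:
        # Called when a user stops (or stops and restarts elsewhere) a VM.
        self.vms.discard(vm_id)
        self._enter_maintenance_if_empty()

    def _enter_maintenance_if_empty(self) -> None:
        # Assumption: the host drains naturally, then enters maintenance.
        if self.state is HostState.WAITING_FOR_MAINTENANCE and not self.vms:
            self.state = HostState.MAINTENANCE
```

In this sketch a host with running VMs stays in the waiting state after `request_maintenance()`, rejects further `start_vm()` calls, and only switches to `MAINTENANCE` after `vm_stopped()` has been called for every remaining VM.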

Use Cases

Scenario 1: Decommissioning an Old Compute Cluster

Problem:

Scenario 2: Maintenance of GPU Clusters with GPU Passthrough

Problem:

STEPS TO REPRODUCE
NA
EXPECTED RESULTS
Refer to Above
ACTUAL RESULTS
Not able to facilitate smooth decommissioning of compute servers where live migration is not possible.
DaanHoogland commented 18 hours ago

Good idea @btzq. I wonder if we need a new state, as there is already "prepare for maintenance". This might be overloaded, i.e. set manually instead of automatically. Let's investigate.

DaanHoogland commented 18 hours ago

cc @nvazquez any opinion?