fluid-cloudnative / fluid

Fluid, elastic data abstraction and acceleration for BigData/AI applications in cloud. (Project under CNCF)
https://fluid-cloudnative.github.io/
Apache License 2.0

[FEATURES] Support Pod-Specific Scaling Down in Fluid using Advanced StatefulSet #4193

Open cheyang opened 6 days ago

cheyang commented 6 days ago

Background

Fluid provides elastic scaling for distributed caches, which is crucial for on-demand cache usage and cost reduction. Currently, however, scaling down a Fluid cache relies on the native Kubernetes StatefulSet, which cannot remove specific Pods: it always terminates the Pods with the highest ordinals first. OpenKruise's Advanced StatefulSet can provide this capability. The integration should not introduce a hard dependency on OpenKruise, and it should be enabled through a flexible mechanism such as a feature switch:

features:
- AdvancedStatefulSet=true
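
To illustrate how such a switch could be consumed, here is a minimal Go sketch. It assumes the `features` list reaches the controller as `Name=bool` strings. `ParseFeatureGates` and `WorkloadKind` are hypothetical helpers invented for this example and are not part of Fluid today. The `apps.kruise.io/v1beta1` group/version is the one OpenKruise publishes for Advanced StatefulSet, to the best of my knowledge.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ParseFeatureGates turns entries such as "AdvancedStatefulSet=true" into a
// lookup table. Hypothetical helper: Fluid's real option parsing may differ.
func ParseFeatureGates(entries []string) (map[string]bool, error) {
	gates := make(map[string]bool, len(entries))
	for _, entry := range entries {
		name, value, found := strings.Cut(strings.TrimSpace(entry), "=")
		name = strings.TrimSpace(name)
		if !found || name == "" {
			return nil, fmt.Errorf("invalid feature entry %q, expected Name=bool", entry)
		}
		enabled, err := strconv.ParseBool(strings.TrimSpace(value))
		if err != nil {
			return nil, fmt.Errorf("invalid value for feature %q: %w", name, err)
		}
		gates[name] = enabled
	}
	return gates, nil
}

// WorkloadKind picks the worker workload type. A gate that is absent reads as
// false, so the native StatefulSet stays the default and OpenKruise remains an
// optional dependency.
func WorkloadKind(gates map[string]bool) string {
	if gates["AdvancedStatefulSet"] {
		return "apps.kruise.io/v1beta1 StatefulSet"
	}
	return "apps/v1 StatefulSet"
}
```

Because a missing entry behaves like `false`, existing installations that never set the switch would keep using the native StatefulSet.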

Objectives

  1. Enable the capability to scale down specific Pods using a feature switch.
  2. Implement a proof of concept with AlluxioRuntime that defines the entire scale-down process:
    1. Clear the cache on the specified node and take the worker offline.
    2. Scale down the specified Pod.
  3. Ensure backward compatibility for this capability.
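
The "perform the scaling down" step can be sketched as a pure planning function. This is an illustration under assumptions, not the proposed implementation: it relies on the `reserveOrdinals` field that OpenKruise documents for Advanced StatefulSet (ordinals listed there are skipped when Pods are created), and `ScaleDownPlan`, `PlanScaleDown`, and the `demo-worker-N` Pod names are hypothetical. Clearing the cache and taking the worker offline would have to finish before the plan is applied; that part is runtime-specific and is not modeled here.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// ScaleDownPlan is the desired workload state after removing specific Pods.
// Hypothetical type for illustration only.
type ScaleDownPlan struct {
	Replicas        int   // new value for spec.replicas
	ReserveOrdinals []int // new value for spec.reserveOrdinals (sorted)
}

// ordinalOf extracts the trailing ordinal from a Pod name such as "demo-worker-2".
func ordinalOf(podName string) (int, error) {
	idx := strings.LastIndex(podName, "-")
	if idx < 0 || idx == len(podName)-1 {
		return 0, fmt.Errorf("pod name %q has no ordinal suffix", podName)
	}
	n, err := strconv.Atoi(podName[idx+1:])
	if err != nil {
		return 0, fmt.Errorf("pod name %q has a non-numeric ordinal: %w", podName, err)
	}
	return n, nil
}

// PlanScaleDown reserves the ordinal of every Pod to remove and lowers the
// replica count by the same amount, so only the named Pods disappear.
func PlanScaleDown(replicas int, reserved []int, podsToRemove []string) (ScaleDownPlan, error) {
	skip := make(map[int]bool, len(reserved))
	for _, o := range reserved {
		skip[o] = true
	}
	// Active ordinals are the first `replicas` ordinals that are not reserved.
	active := make(map[int]bool)
	for o := 0; len(active) < replicas; o++ {
		if !skip[o] {
			active[o] = true
		}
	}
	for _, name := range podsToRemove {
		o, err := ordinalOf(name)
		if err != nil {
			return ScaleDownPlan{}, err
		}
		if !active[o] {
			return ScaleDownPlan{}, fmt.Errorf("pod %q is not an active replica", name)
		}
		delete(active, o)
		skip[o] = true
	}
	plan := ScaleDownPlan{Replicas: len(active)}
	for o := range skip {
		plan.ReserveOrdinals = append(plan.ReserveOrdinals, o)
	}
	sort.Ints(plan.ReserveOrdinals)
	return plan, nil
}
```

For example, with 4 replicas and nothing reserved, removing `demo-worker-1` yields 3 replicas with ordinal 1 reserved, which leaves Pods 0, 2, and 3 running. A native StatefulSet could only have removed Pod 3.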

Reference Materials