apache / celeborn

Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
https://celeborn.apache.org/
Apache License 2.0

[CELEBORN-1528][HELM] Use volumeClaimTemplates to support various storage backend #2650

Open ChenYi015 opened 1 month ago

ChenYi015 commented 1 month ago

What changes were proposed in this pull request?

Make the Helm chart more customizable by using volumeClaimTemplates in the StatefulSets.

Why are the changes needed?

By using volumeClaimTemplates in the StatefulSets, we can support various types of storage backends.

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

Tested locally.

ChenYi015 commented 1 month ago

@pan3793 @RexXiong PTAL, thanks.

ChenYi015 commented 1 month ago

To make this easier to review and merge, I have created a new PR #2654 that does some values renaming. After it has been merged, I will rebase this PR.

lianneli commented 1 month ago

In my opinion, volumeClaimTemplates is needed, but hostPath and emptyDir are more general for most situations. It would be better to add it as an option rather than the only choice.

ChenYi015 commented 1 month ago

@lianneli Thanks for the advice. However, volumes of type hostPath and emptyDir are still supported through [master|worker].volumes. Previously, the configuration looked like this:

volumes:
  master:
  - mountPath: /mnt/celeborn_ratis
    hostPath: /mnt/celeborn_ratis
    type: hostPath
    capacity: 100Gi
  worker:
  - mountPath: /mnt/disk1
    hostPath: /mnt/disk1
    type: hostPath
    diskType: SSD
    capacity: 100Gi

After this change, it will look like this:

master:
  volumes:
  - name: celeborn-ratis
    hostPath:
      path: /mnt/celeborn_ratis
  volumeMounts:
  - name: celeborn-ratis
    mountPath: /mnt/celeborn_ratis

worker:
  volumes:
  - name: disk1
    hostPath:
      path: /mnt/disk1
  volumeMounts:
  - name: disk1
    mountPath: /mnt/disk1

celeborn:
  celeborn.worker.storage.dirs: /mnt/disk1:disktype=SSD:capacity=100Gi

The difference is that we need to configure celeborn.worker.storage.dirs manually if we use hostPath or emptyDir.
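
To illustrate the manual step, here is a minimal Python sketch of how old-style worker entries could be mapped to the new layout. It is not part of the chart: the function name `migrate_worker_volumes`, the rule for deriving volume names from the mount path, and joining multiple directories with commas are all assumptions made for this example.

```python
import posixpath


def migrate_worker_volumes(old_entries):
    """Map old-style `volumes.worker` entries to the new values layout.

    Hypothetical helper for illustration only; it is not part of the chart.
    """
    volumes, mounts, dirs = [], [], []
    for entry in old_entries:
        # Assumption: derive a DNS-friendly volume name from the mount path,
        # e.g. /mnt/celeborn_ratis -> celeborn-ratis.
        base = posixpath.basename(entry["mountPath"].rstrip("/"))
        name = base.replace("_", "-").lower()

        if entry["type"] == "hostPath":
            volumes.append({"name": name, "hostPath": {"path": entry["hostPath"]}})
        elif entry["type"] == "emptyDir":
            volumes.append({"name": name, "emptyDir": {}})
        else:
            raise ValueError(f"unsupported volume type: {entry['type']}")
        mounts.append({"name": name, "mountPath": entry["mountPath"]})

        # diskType and capacity no longer live on the volume entry, so they
        # are carried over into celeborn.worker.storage.dirs instead.
        parts = [entry["mountPath"]]
        if "diskType" in entry:
            parts.append(f"disktype={entry['diskType']}")
        if "capacity" in entry:
            parts.append(f"capacity={entry['capacity']}")
        dirs.append(":".join(parts))

    return {
        "worker": {"volumes": volumes, "volumeMounts": mounts},
        # Assumption: multiple directories are separated by commas.
        "celeborn": {"celeborn.worker.storage.dirs": ",".join(dirs)},
    }
```

Feeding it the old worker entry shown above (mountPath /mnt/disk1, type hostPath, diskType SSD, capacity 100Gi) yields the same volumes, volumeMounts, and storage directory string as the new configuration.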

ChenYi015 commented 1 month ago

To use volumeClaimTemplates, suppose ssd-storage-class is a valid storage class in your Kubernetes cluster, either provided by a cloud provider or defined by yourself, that can dynamically provision SSD cloud disks. The configuration would then look as follows:

master:
  volumeClaimTemplates:
  - metadata:
      name: celeborn-ratis
      annotations:
        celeborn.apache.org/disk-type: SSD
        celeborn.apache.org/disk-capacity: 100Gi
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
        limits:
          storage: 100Gi
      storageClassName: ssd-storage-class
  volumeMounts:
  - name: celeborn-ratis
    mountPath: /mnt/celeborn_ratis

worker:
  volumeClaimTemplates:
  - metadata:
      name: disk1
      annotations:
        celeborn.apache.org/disk-type: SSD
        celeborn.apache.org/disk-capacity: 100Gi
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
        limits:
          storage: 100Gi
      storageClassName: ssd-storage-class
  volumeMounts:
  - name: disk1
    mountPath: /mnt/disk1

The annotations celeborn.apache.org/disk-type and celeborn.apache.org/disk-capacity are used by Helm to render celeborn.worker.storage.dirs in the ConfigMap.

When we install the Helm chart, the SSD cloud disks will be dynamically provisioned by the cloud provider.
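
The annotation-driven rendering described above can be sketched in Python. This only illustrates the logic and is not the chart's actual Helm template: the function name `render_storage_dirs`, the choice to skip mounts that have no matching claim template, and the comma separator between directories are assumptions.

```python
DISK_TYPE = "celeborn.apache.org/disk-type"
DISK_CAPACITY = "celeborn.apache.org/disk-capacity"


def render_storage_dirs(worker_values):
    """Build a celeborn.worker.storage.dirs value from worker values.

    Hypothetical sketch: each volumeMount is matched to a volumeClaimTemplate
    by name, and the template's annotations supply disk type and capacity.
    """
    claims = {
        claim["metadata"]["name"]: claim
        for claim in worker_values.get("volumeClaimTemplates", [])
    }
    dirs = []
    for mount in worker_values.get("volumeMounts", []):
        claim = claims.get(mount["name"])
        if claim is None:
            # Assumption: mounts without a claim template (e.g. hostPath)
            # are left for manual configuration.
            continue
        annotations = claim["metadata"].get("annotations", {})
        parts = [mount["mountPath"]]
        if DISK_TYPE in annotations:
            parts.append(f"disktype={annotations[DISK_TYPE]}")
        if DISK_CAPACITY in annotations:
            parts.append(f"capacity={annotations[DISK_CAPACITY]}")
        dirs.append(":".join(parts))
    # Assumption: multiple directories are separated by commas.
    return ",".join(dirs)
```

For the worker values shown above, this sketch produces /mnt/disk1:disktype=SSD:capacity=100Gi, the same string that has to be written by hand in the hostPath case.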

github-actions[bot] commented 2 weeks ago

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.