Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

Deploying on k8s not easy enough #15491

Open ssz1997 opened 2 years ago

ssz1997 commented 2 years ago

I feel that deploying Alluxio on k8s with the Helm chart is currently not easy enough for first-time users, in the following respects:

  1. The journal uses a PV/PVC by default because we want it to persist. However, we don't provide a default PV template in the codebase.
  2. Short-circuit also uses a PV/PVC by default, and we don't provide a PV template for it either.

I think these two are the major barriers for first-time users. They have to set up the PVs or adjust the configs, and in doing so they may have to learn what the journal and short-circuit are. They may still need to learn these concepts eventually, but the learning curve would be smoother if that were not required up front.

We have pretty detailed documentation on these two configs, but that means there is more material for them to read and understand, which "explains well to experienced users" and "raises the barrier for first-time users" at the same time.

So I'm proposing something to make the Helm chart easier to deploy for first-time users. Ideally they only need to set the UFS in the config file, and then a single helm install command should be able to start an Alluxio cluster. Both changes apply to the default settings:

  1. For the journal, create the PV together with the cluster, using the default size (1Gi). We already create the PVC with a default size of 1Gi, so I don't see a reason not to also create a PV of the same size.
  2. Change the default short-circuit volume type to hostPath, so there is no need to configure a PV/PVC. I don't think we have to use persistent storage for short-circuit, because if the worker pod is down, there are no read/write operations anyway.
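
To make item 1 concrete, a default journal PV might look roughly like the sketch below. It is only an illustration: the PV name, the hostPath directory, and the storage class are assumptions and would have to match whatever the chart's journal PVC actually requests.

```yaml
# Sketch of a default journal PersistentVolume (not taken from the codebase).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alluxio-journal-0          # hypothetical name
  labels:
    type: local
spec:
  storageClassName: standard       # assumption: must equal the PVC's storage class
  capacity:
    storage: 1Gi                   # same as the default PVC size mentioned above
  accessModes:
    - ReadWriteOnce                # assumption: must equal the PVC's access mode
  hostPath:
    path: /tmp/alluxio-journal-0   # hypothetical directory on the master's node
```

A hostPath-backed PV like this only persists data on the node where it was written, so it is suited to trying things out rather than to production.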

Any discussion is welcome.

ssz1997 commented 2 years ago

@LuQQiu @jiacheliu3 @ZhuTopher Let me know what you all think, and please correct me if I overlook something.

ZhuTopher commented 2 years ago

Thanks, Shawn, for opening this GitHub issue! I agree pretty strongly with what you brought up here, as I went through the same struggles onboarding myself with this topic 😅

  1. I definitely think the default journal PV (hostPath volume on the Master node(s)) can be included as part of the Helm chart itself. However, I think we should make it an "opt-in" field in values.yaml rather than a default. The default would still be providing nothing but the PVC and requiring end-users to match their PV definitions with the PVC in values.yaml.

    • The logic behind this is that users who just want to try it can use the recommended values where we will include this opt-in field
    • Users who actually have configured their own PVs will experience no change
  2. I'm actually not very clear on the different intended use-cases for short-circuit in k8s. For example,

    shortCircuit:
      enabled: true
      policy: local

    I don't know in which scenarios this policy ("hostname introspection") is viable.

As for the uuid-based policy, directly using hostPath would work, but so would including the local PV definition in the Helm chart, which I think I might prefer.
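
For reference, the hostPath variant of the uuid-based policy could be expressed in values.yaml along these lines. This is a sketch only: the `volumeType` and `hostPath` keys and the directory are assumptions here and should be checked against the chart's own values.yaml.

```yaml
# Sketch: short-circuit backed by a node-local directory instead of a PV/PVC.
shortCircuit:
  enabled: true
  policy: uuid
  volumeType: hostPath            # assumed key, replacing a PVC-backed volume
  hostPath: /tmp/alluxio-domain   # hypothetical directory on each worker node
```

With this variant no PV or PVC has to exist before helm install, whereas the local PV alternative keeps the PVC but ships a matching PV definition in the chart.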

LuQQiu commented 2 years ago

Thanks @ssz1997. I didn't expect that I couldn't follow the Kubernetes doc blindly. I expect

helm repo add alluxio-charts https://alluxio-charts.storage.googleapis.com/openSource/2.8.0
helm install alluxio -f config.yaml alluxio-charts/alluxio

to work directly and create the simplest Alluxio cluster for me without any tuning (and of course, the tons of tuning options are just too much information).

I would like the simplest possible quick start for the Kubernetes Helm chart, followed by an advanced section on how to tune and change it based on needs.
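
As an illustration of what such a quick start could ask of the user, a config.yaml that sets only the root UFS might look like the sketch below. The `properties` key and the placeholder address are assumptions for illustration; the journal and short-circuit settings are deliberately left to the chart defaults.

```yaml
# Sketch of a minimal config.yaml: only the under file system (UFS) is set.
properties:
  # Placeholder address; replace with a real under storage location.
  alluxio.master.mount.table.root.ufs: "<under_storage_address>"
```

Today this file alone is not enough, because the default journal and short-circuit settings still expect PVs to exist, which is the gap this issue describes.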

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.