deathcoder opened this issue 6 years ago
@deathcoder, welcome to the project. And thanks for looking into this particular issue.
I looked at your change. The simple namenode part looks reasonable.
However, the datanode part is more complicated in scope. Currently, the datanode chart uses a K8s DaemonSet to launch datanode pods, one pod per cluster node. A DaemonSet creates those pods with randomized names, like hdfs-datanode-u8jka, where the suffix changes every time an old pod crashes and a replacement pod launches. Since PVCs would be associated with the old pod names, the new pods would fail to find the existing PVCs. This means datanode pods would lose their data whenever they restart. To solve this, the datanode chart would have to switch to a K8s StatefulSet. But then we would lose the one-datanode-per-cluster-node behavior that comes automatically with a DaemonSet. So this is a bit more complicated.
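To make the tradeoff concrete, here is a minimal sketch of what a StatefulSet-based datanode could look like. This is not the chart's actual template; the names, image, and sizes are placeholders. The stable pod names (hdfs-datanode-0, hdfs-datanode-1, ...) plus volumeClaimTemplates are what let a restarted pod reattach to its old PVC, but `replicas` has to be set explicitly instead of getting one pod per node for free:

```yaml
# Hypothetical sketch only -- not the chart's actual template.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hdfs-datanode
spec:
  serviceName: hdfs-datanode
  replicas: 3                 # must be chosen explicitly; no one-pod-per-node guarantee
  selector:
    matchLabels:
      app: hdfs-datanode
  template:
    metadata:
      labels:
        app: hdfs-datanode
    spec:
      containers:
        - name: datanode
          image: uhopper/hadoop-datanode:2.7.2   # placeholder image
          volumeMounts:
            - name: data
              mountPath: /hadoop/dfs/data
  # Each pod gets its own PVC (data-hdfs-datanode-0, data-hdfs-datanode-1, ...)
  # which is reattached when the pod is recreated.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
```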
Another challenge with the datanodes is that we probably want to use the upcoming local persistent volumes (see link here), so that we can still support data locality.
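If we go that route, a local PV would look roughly like the sketch below. This assumes the local volume feature is available in the cluster, and the node name, path, and storage class are placeholders. The nodeAffinity is what pins the volume to a specific node and thereby preserves HDFS data locality for the datanode pod that claims it:

```yaml
# Hypothetical sketch of a local PersistentVolume; values are illustrative.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hdfs-datanode-pv-node1
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/hdfs-data        # local disk on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node1               # the volume is usable only on this node
```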
Hope this makes sense. I wonder if you can think more about these issues and play further with your prototype to incorporate StatefulSet and local PVs for datanodes.
Hi @kimoonkim, could you tell me the reason we're using a DaemonSet for the datanodes? Moving to a StatefulSet sounds reasonable.
Hi, I created a repo where I replaced hostPaths with persistentVolumeClaims. I have only tested this new configuration with minikube, but I think it should work on a real cluster too.
You can find the changes here: https://github.com/deathcoder/kubernetes-HDFS/commit/163ae1c6ab4a684f133850d1855ec816a911b5ce
I read here https://github.com/apache-spark-on-k8s/kubernetes-HDFS/blob/master/charts/hdfs-simple-namenode-k8s/README.md that you plan to switch to persistentVolumes in the future, so please let me know if you think this is enough for a PR, or if I need to make more changes before it can be merged.
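For reference, the general shape of the change is roughly the following pod-spec fragment. This is a hedged sketch with placeholder volume and claim names, not the linked commit itself:

```yaml
# Before: data lives on the node's filesystem via hostPath, tied to that host.
#   volumes:
#     - name: hdfs-data
#       hostPath:
#         path: /hdfs-data
#
# After: data lives on a claimed volume backed by the cluster's storage provisioner.
volumes:
  - name: hdfs-data
    persistentVolumeClaim:
      claimName: hdfs-datanode-data   # placeholder claim name
```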