[EXPERIMENTAL] This repo includes deployment instructions for running HDFS/Spark inside Docker containers. It also includes spark-notebook and HDFS FileBrowser.
What do you think about adding GlusterFS as an alternative file system in the Swarm stack?
Dealing with HDFS is a lot of pain, and the namenode is a single point of failure (it also keeps its data in a local volume, so it always has to be placed on the same node).
In our lab, we are now considering Alluxio + GlusterFS, or any other alternative to HDFS.