Closed NecromuncherDev closed 6 years ago
@ThatBeardedDude, did you take a look at our S3 Source Example and Ceph Source Example? Let us know if that's what you are talking about
I took a look at both, and both failed at some point. The Ceph Nano example (the more promising of the two, for me) failed as early as setting up the notebook pod. The pod failed to spin up due to an error related to being unable to get replication controller...
Nevertheless, I do believe that including the jars needed to connect to an object storage would help.
i think this is a good suggestion @ThatBeardedDude, and we should consider adding the jars. depending on how you are using these images, spark provides a very convenient method for injecting jar files into your cluster.
for example, if you have a driver application that will speak to the spark cluster produced by these images, you could pass the --packages <some package> option to spark-submit, and that would instruct spark to fetch those packages and put them on the driver and executor (worker) classpaths. something like this might work in the interim, depending on your use case.
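a rough sketch of what that could look like, in case it helps. it only builds the spark-submit command line, it does not run anything. the master URL, the application file name, the endpoint, and the `build_spark_submit_args` helper are all made up for illustration. the `org.apache.hadoop:hadoop-aws` coordinate is the usual S3A connector, but the version shown is an assumption and would need to match the Hadoop version bundled in the image.

```python
import shlex


def build_spark_submit_args(master_url, app_path, packages, hadoop_conf=None):
    """Return the argv list for a spark-submit call that pulls in extra packages.

    Hypothetical helper for illustration; it is not part of Spark or these images.
    """
    args = ["spark-submit", "--master", master_url]
    if packages:
        # --packages takes a comma-separated list of Maven coordinates.
        args += ["--packages", ",".join(packages)]
    for key, value in sorted((hadoop_conf or {}).items()):
        # The "spark.hadoop." prefix passes the setting on to the Hadoop configuration.
        args += ["--conf", f"spark.hadoop.{key}={value}"]
    args.append(app_path)
    return args


if __name__ == "__main__":
    argv = build_spark_submit_args(
        master_url="spark://my-spark-cluster:7077",  # assumed cluster address
        app_path="my_app.py",  # assumed driver application
        packages=["org.apache.hadoop:hadoop-aws:3.3.4"],  # version is an assumption
        hadoop_conf={
            "fs.s3a.endpoint": "http://my-object-store:8000",  # assumed endpoint
            "fs.s3a.path.style.access": "true",
        },
    )
    # Print a shell-quoted command that could be pasted into a terminal.
    print(shlex.join(argv))
```

credentials (`fs.s3a.access.key` and `fs.s3a.secret.key`) are left out on purpose; how to supply them depends on your deployment.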
With growing interest in using object storage (Ceph, AWS, MinIO, etc.) via the s3a/s3n APIs, the question arises of building that support into this image.
Would including the appropriate jars in this image be desirable?