I read a few articles, which confirmed my first impression of the state of the open-source market: put simply, there are only two serious options today for deploying Spark to Kubernetes, either Spark's official Kubernetes integration or Google's Kubeflow Spark operator. It seems I will have to experiment with both.
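To make the first option more concrete, here is a minimal sketch of what a submission through Spark's official Kubernetes integration could look like. It only assembles the `spark-submit` command line; the helper name `build_spark_submit_args`, the API server URL, and the image name are my own placeholders, not values from any of the articles. The flags themselves (`--master k8s://...`, `--deploy-mode cluster`, `spark.executor.instances`, `spark.kubernetes.container.image`) are the ones described in the Spark documentation for running on Kubernetes.

```python
import shlex
from typing import Dict, List, Optional


def build_spark_submit_args(
    api_server: str,
    image: str,
    app_resource: str,
    app_name: str = "spark-pi",
    main_class: Optional[str] = None,
    executor_instances: int = 2,
    extra_conf: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Assemble the argv for a cluster-mode spark-submit against Kubernetes.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    args = [
        "spark-submit",
        # The Kubernetes master URL is the API server URL prefixed with k8s://
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "cluster",
        "--name", app_name,
    ]
    if main_class:
        args += ["--class", main_class]

    conf = {
        "spark.executor.instances": str(executor_instances),
        # Container image used for the driver and executor pods
        "spark.kubernetes.container.image": image,
    }
    conf.update(extra_conf or {})
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]

    # The application jar or script always comes last
    args.append(app_resource)
    return args


if __name__ == "__main__":
    cmd = build_spark_submit_args(
        api_server="https://kubernetes.example.com:6443",  # placeholder URL
        image="example.registry/spark:latest",             # placeholder image
        app_resource="local:///path/to/examples.jar",      # path inside the image
        main_class="org.apache.spark.examples.SparkPi",
    )
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell that has a Spark distribution on its `PATH`. With the operator, as far as I understand, the same information is instead declared in a `SparkApplication` custom resource that the operator turns into a submission.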
Articles and resources I found on deploying with either of those two options:
Here are some references on deploying a Spark stack to Kubernetes:
The first question that comes to mind after this quick investigation: Spark works on top of a filesystem, but I don't see anywhere how that filesystem is provisioned.
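It seems that HDFS is related to S3, in that Spark can read and write S3 through Hadoop's S3A filesystem connector instead of an HDFS cluster: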