I am reading the Getting Started for Kubernetes documentation through the lens of a Kubernetes user attempting to install the Spark-Rapids plugin. Here are my comments as I go through it:
[ ] The Kubernetes documentation doesn't read like a normal Kubernetes document. There is no helm chart and no deployment yaml here. It took me a while to realize that Spark itself accesses the cluster and creates the resources when the user runs it (see the sketch after this item). A typical Kubernetes deployment fills out some yaml/config, installs it into the cluster, and then clients talk to the running application. This isn't a slight on the documentation, but an indication that the perspective is unusual for a Kubernetes developer; a "how does Spark work with Kubernetes" overview would be nice, even if it just points to the Spark documentation.
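A minimal sketch of what I eventually understood, included only to illustrate the perspective difference; the API server address, namespace, service account, image name, and jar path are placeholders I made up, not values from the docs:

```bash
# There is no `helm install` or `kubectl apply` here: spark-submit itself is
# the deployment tool, and Spark creates the driver and executor pods on the
# cluster. All bracketed values are placeholders.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.namespace=default \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=<registry>/spark-rapids:<tag> \
  --conf spark.executor.instances=1 \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.executor.resource.gpu.vendor=nvidia.com \
  local:///opt/spark/examples/jars/spark-examples.jar
```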
[ ] The startup could be so much more condensed and approachable if the user wasn't building their own docker image (a sketch of the step in question follows this item). I found a Databricks image being published, but couldn't find any docker image advertised to work with Kubernetes. I found #4158 referencing this, but no concrete plans. It would be so much easier for people to kick the tires if they didn't have to build this image.
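For reference, this is roughly the build-and-push friction I mean, which a published image would remove entirely; the Dockerfile name, registry, and tag are placeholders, not the project's actual artifacts:

```bash
# Hedged sketch of the step a public image would eliminate: building and
# pushing a custom Spark + RAPIDS image before anything can be submitted.
docker build -t <registry>/spark-rapids:<tag> -f Dockerfile .
docker push <registry>/spark-rapids:<tag>
```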
[ ] The layout seems odd. There is a "Running Spark Applications in the Kubernetes Cluster" header that covers a simple test job submission, running an interactive shell, etc. Then the next major section, "Running Spark Applications using Spark Operator", introduces a new way to talk to the cluster after the user has already invested time in the original way (see the contrasting sketch after this item). It would be useful to introduce the ways to run Spark on Kubernetes before these sections, and then give instructions for the simple job, interactive shell, etc. for both. The user can then pick the method that works best for them and work through a simple example using that method.
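For contrast with the spark-submit sketch above, this is roughly what the Spark Operator route looks like, which is why an up-front comparison of the two methods would help. The image, namespace, jar path, version, and resource values are placeholders I invented and should be checked against the operator's CRD documentation:

```bash
# With the Spark Operator, the user is back in familiar Kubernetes territory:
# apply a SparkApplication custom resource and the operator runs spark-submit
# on their behalf. All bracketed values are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi-gpu
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: <registry>/spark-rapids:<tag>
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples.jar
  sparkVersion: "3.3.0"
  sparkConf:
    spark.plugins: com.nvidia.spark.SQLPlugin
  driver:
    cores: 1
    memory: 1g
    serviceAccount: spark
  executor:
    instances: 1
    cores: 1
    memory: 2g
    gpu:
      name: nvidia.com/gpu
      quantity: 1
EOF
```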
Ultimately, I'd like to see a TL;DR at the top with a very simple "get Spark running on Kubernetes" walkthrough, followed by the added detail on the two deployment methods and how they work. I think a public docker image would make this very brief.
I went through the Kubernetes documentation as well, and here are some additional thoughts:
[ ] As a new Kubernetes user, I wish there were clearer references to instructions for setting up a Kubernetes cluster. In the Prerequisites section, there is a link to "how to install a Kubernetes cluster with NVIDIA GPU support", but the link takes me to an "NVIDIA Cloud Native Technologies" overview page, where I could hardly find any resources for installing Kubernetes clusters (something like the sketch after this item is what I was hoping to find).
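As an example of the kind of concrete pointer that would have helped here: the NVIDIA GPU Operator helm chart is the documented way to add GPU support to an existing cluster. This sketch assumes a cluster already exists and helm is installed; the release and namespace names are my own choice.

```bash
# Install the NVIDIA GPU Operator into an existing cluster so that pods can
# request nvidia.com/gpu resources. Release and namespace names are arbitrary.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install --wait gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace
```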