Open hiboyang opened 2 years ago
Can you develop the use case? For me, applications should only be deployed with manifests. I don't see a good practice in using a REST API to deploy an application, but I may be wrong.
The K8S API server itself is a web server, and manifests can be POSTed to it. Any operator, for that matter, is indirectly a path handler.
The use case is providing Spark as a service, something like GCP Dataproc (https://cloud.google.com/dataproc). People could deploy such a service inside their own Kubernetes environment, and Spark users could then use curl (or some other client tool) to submit Spark applications to it. Posting a manifest to the K8S API server is an option; the downside is that it is too complicated for Spark users, who should not need to learn all those K8S details.
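For context, "posting a manifest to the K8S API server" amounts to a single authenticated HTTP call against the operator's custom-resource endpoint. A minimal sketch of that, in Python (the CRD group/version `sparkoperator.k8s.io/v1beta2` is the operator's; the function names, image, and file paths here are illustrative):

```python
# Sketch of what raw submission looks like -- these are exactly the K8S
# details a REST front end (or client wrapper) would hide from Spark users.

def sparkapplication_url(api_server: str, namespace: str) -> str:
    """API endpoint for SparkApplication custom resources in a namespace."""
    return (f"{api_server}/apis/sparkoperator.k8s.io/v1beta2"
            f"/namespaces/{namespace}/sparkapplications")

def minimal_manifest(name: str, image: str, main_file: str) -> dict:
    """The smallest SparkApplication manifest a user would otherwise hand-write."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": image,
            "mainApplicationFile": main_file,
        },
    }

# Submission is then one authenticated POST, e.g. with the `requests` library:
#   requests.post(sparkapplication_url(server, "default"),
#                 json=minimal_manifest("pi", "spark:3.5", "local:///app/pi.py"),
#                 headers={"Authorization": f"Bearer {token}"})
```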
The K8S details can be wrapped in a small Go, Python, or shell program. A web server may initially look simpler, but once you add security, high availability, and recovery from web-server or operator failure, it will be quite complex. Managing RBAC for submit/delete and translating it into K8S RBAC will not be easy either.
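A client-side wrapper along those lines can be quite small. Here is a hedged Python sketch (function names and defaults are hypothetical): it translates a few user-facing options into a SparkApplication manifest and pipes it to `kubectl apply`, so authentication, kubeconfig handling, and server-side validation stay with kubectl and the cluster's own RBAC:

```python
import json
import subprocess

def build_manifest(name: str, image: str, main_file: str) -> dict:
    """Translate a few user-facing options into a SparkApplication manifest."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": image,
            "mainApplicationFile": main_file,
        },
    }

def submit(name: str, image: str, main_file: str,
           namespace: str = "default") -> None:
    """Pipe the generated manifest to kubectl, which handles auth and RBAC."""
    manifest = json.dumps(build_manifest(name, image, main_file))
    subprocess.run(["kubectl", "apply", "-n", namespace, "-f", "-"],
                   input=manifest.encode(), check=True)
```

A user then calls something like `submit("pi", "spark:3.5", "local:///app/pi.py")` instead of writing YAML; delete and status commands wrap `kubectl delete` / `kubectl get` the same way.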
My personal opinion is that using a client-side wrapper is far easier.
FYI - Apache Livy, which had partly similar goals in the earlier big-data world, did not gain much traction and remains in incubation.
This could also be a different project: if you want to provide Spark as a service, you can write a service that creates and manages manifests in your K8S infrastructure, if you choose K8S as the infrastructure for your SaaS. It is probably easier to do it that way; I don't think the operator and this API have much to share.
Yes, another option is to create the REST service as a separate project. Let's see how it goes. Thanks, folks, for the feedback!
We use Airflow, which has an API. It has a Spark-on-K8S Airflow operator that works with this project. You define DAGs with a task for submission and a task for monitoring, and the DAGs let you parameterize the Spark app YAML. We also started using Kubeflow, which has Kubeflow Pipelines and the Spark-on-K8S operator; Kubeflow Pipelines are very similar to Airflow.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We want to have a REST server to submit Spark applications. It would not be much work to add a REST server inside the Spark operator to accept requests from people submitting Spark applications. How do people like the idea?