apache-spark-on-k8s / spark

Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end is now at https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0
612 stars, 118 forks

Still maintained? #625

Open elgalu opened 6 years ago

elgalu commented 6 years ago

Is this still going to be maintained?

I see duplicated effort with Spark 2.3 and also with https://github.com/GoogleCloudPlatform/spark-on-k8s-operator

liyinan926 commented 6 years ago

We stopped development in this fork and have been focusing on upstreaming since late last year. We generally encourage people to use the upstream 2.3 release instead of this fork. https://github.com/GoogleCloudPlatform/spark-on-k8s-operator is a separate project for running Spark on Kubernetes idiomatically.

witten commented 6 years ago

According to these docs, upstream Spark on Kubernetes does not include support for things like PySpark, and those docs also indicate that new Spark on Kubernetes features will be "incubated" in apache-spark-on-k8s. So that leaves this project as the only option for use cases not yet supported by upstream. Do you recommend that anyone still using your fork switch to consuming unreleased changesets in branch-2.2-kubernetes? Or do you really plan to halt all development?

In any case, I'd recommend updating the README to reflect the current state of the project, so it's clear that new or existing users should steer clear and use upstream instead if that's your intent.

harbesc commented 6 years ago

Any update? We are looking forward to using secure HDFS with Kerberos and other improvements that are not in the latest release!

ifilonenko commented 6 years ago

I am working on secure HDFS support and SparkR. I just put up a PR for PySpark :) (PR 21092).