apache-spark-on-k8s / spark

Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end now happens at https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

Support running integration tests in CircleCI / TravisCI #487

Open ash211 opened 7 years ago

ash211 commented 7 years ago

I was chatting with @mccheah and @jlz27 about what it would take for us to run integration tests in CircleCI, and Jason has a setup that works, minus one problem.

Roughly, the setup is to use minikube in `--vm-driver=none` mode and modify the Docker bridge settings so that any Docker containers running in the Circle container can communicate with the launched Kubernetes pods. (The Spark repo itself doesn't use this, but we have many internal repos that use docker-compose for integration testing.)

The catch, though, is that kubelet attempts to modify the kernel settings of the host it runs on to fit its preferences, and that doesn't work in CircleCI. See https://github.com/kubernetes/kubernetes/issues/50110 for details.
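For reference, a minimal sketch of that kind of CI step. The commands below are assumptions based on minikube's documented none-driver usage, not Jason's actual Circle config; the pod CIDR and the sysctl key are illustrative:

```shell
#!/usr/bin/env bash
# Hypothetical CircleCI step: run minikube directly on the host (no VM).
# The none driver needs root, since kubelet runs on the host itself.
sudo minikube start --vm-driver=none

# kubelet wants to adjust host kernel settings; on a normal host you
# could pre-set them, but CircleCI containers reject sysctl writes
# (see kubernetes/kubernetes#50110). An example of such a setting:
#   sudo sysctl -w vm.overcommit_memory=1

# Let containers on the default docker bridge reach pod IPs by routing
# the pod CIDR over the bridge (CIDR value is illustrative):
sudo ip route add 10.244.0.0/16 dev docker0 || true

# Sanity check that the single-node cluster came up.
kubectl get nodes
```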

The benefits of getting this working would be:

mccheah commented 7 years ago

I think the biggest benefit is actually just removing the need to install a virtual machine on the upstream Spark Jenkins nodes. Running Minikube on bare metal is much lighter-weight than running it inside a virtualization layer.

dimberman commented 6 years ago

+1 this would also be valuable for the airflow-kubernetes integration tests.

shaneknapp commented 6 years ago

i have a worker set up w/minikube and kvm that seems to "work as intended". i'd really like to start testing changes like this against that worker as soon as we can.

shaneknapp commented 6 years ago

btw, the --vm-driver=none option is a show-stopper for the amplab/riselab/open source spark build system. it requires that the UID have root privs to many parts of the worker's system and is a potentially large security hole.
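To illustrate the trade-off shane describes, a hedged sketch of the two driver modes (flags are minikube's documented options; which files end up root-owned is an assumption based on the none driver's behavior):

```shell
# none driver: kubelet and the container runtime run directly on the
# host, so minikube must be started as root -- certs, kubeconfig, and
# iptables changes are all made with root privileges on the worker.
sudo minikube start --vm-driver=none

# kvm driver: the cluster lives inside a VM, so the build user needs
# no root on the worker itself (the setup shaneknapp describes above).
minikube start --vm-driver=kvm2
```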