apache-spark-on-k8s / spark

Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0
612 stars 118 forks source link

Unit Tests for KubernetesClusterSchedulerBackend #459

Closed mccheah closed 7 years ago

mccheah commented 7 years ago

Requires #454, which in turn requires #452.

ifilonenko commented 7 years ago

Thank you for this 👍

mccheah commented 7 years ago

@varunkatta I changed some things in the scheduler backend related to #244. Some of them are style things and variable renames, but there was a case I identified where we didn't call removeExecutor when I think we should. The unit tests reflect what I thought should be the expected behavior. It would be much appreciated if you could take a look at the changes and the tests to verify whether or not we're doing the right thing here.

mccheah commented 7 years ago

Integration test failure is legit from #454.

mccheah commented 7 years ago

This is ready for review, but is not strictly complete. I'm certain that some corner cases are missing in the scheduler backend tests. As well, we don't yet unit test the components that were factored out in #454 and #452. I'll probably revisit these but I would like to keep these PRs small so that we can incrementally follow the cases that we've been covering in these tests.

mccheah commented 7 years ago

Assigning @varunkatta and @foxish to verify correctness of the nuanced but important logic changes done here.

varunkatta commented 7 years ago

Also, I think this should probably go in before #392 . I can merge changes from this PR into #392 once this PR is accepted.

mccheah commented 7 years ago

Rebased

mccheah commented 7 years ago

rerun unit tests please

ash211 commented 7 years ago

K reverted it -- will merge when builds are green!

ash211 commented 7 years ago

It's merged -- was there more you thought should be done here?

duyanghao commented 7 years ago

@ash211 Sorry,i have just seen it,will it be closed soon? i just think There may be some unknown problems in #244 as it still lacks large-scale use. Anyway,i will test this function in practical use and report problems.

ash211 commented 7 years ago

I've seen some problems myself with executor recovery, but haven't dug into why yet. Please do open new issues with any observations you see of bad behavior!