Closed mccheah closed 7 years ago
Thank you for this 👍
@varunkatta I changed some things in the scheduler backend related to #244. Some of them are style things and variable renames, but there was a case I identified where we didn't call removeExecutor
when I think we should. The unit tests reflect what I thought should be the expected behavior. It would be much appreciated if you could take a look at the changes and the tests to verify whether or not we're doing the right thing here.
Integration test failure is legit from #454.
This is ready for review, but is not strictly complete. I'm certain that some corner cases are missing in the scheduler backend tests. As well, we don't yet unit test the components that were factored out in #454 and #452. I'll probably revisit these but I would like to keep these PRs small so that we can incrementally follow the cases that we've been covering in these tests.
Assigning @varunkatta and @foxish to verify correctness of the nuanced but important logic changes done here.
Also, I think this should probably go in before #392 . I can merge changes from this PR into #392 once this PR is accepted.
Rebased
rerun unit tests please
K reverted it -- will merge when builds are green!
It's merged -- was there more you thought should be done here?
@ash211 Sorry,i have just seen it,will it be closed soon? i just think There may be some unknown problems in #244 as it still lacks large-scale use. Anyway,i will test this function in practical use and report problems.
I've seen some problems myself with executor recovery, but haven't dug into why yet. Please do open new issues with any observations you see of bad behavior!
Requires #454, which in turn requires #452.