Closed snoyberg closed 5 years ago
Interesting. I'll try to reproduce and trace the failures. There could be a timing issue here in the tests or it could be a bug, I'll spend some time going over the code.
It's worth remembering the motive at the top through - the branch tests rely on message ordering that can't be guaranteed. I'd forgotten about that bit. I should probably try to find a way to rewrite the tests, perhaps using the management API which should see the process deaths in the correct order iirc. I'll look into that.
Right, I see the problem and it's with the tests. The test code for most of the branch restarts uses a pre-configured supervisor which is started thus: sup <- Supervisor.start rs ParallelShutdown cs
. As you might imagine, the ParallelShutdown
option means we don't wait for the children to die in any particular order in the supervisor implementation. There are also tests that appear to expect a certain ordering which isn't guaranteed. I'm not going to have time to rewrite all of them today or tomorrow, but I am going to attempt to get it done this week.
Hmn, I'm now convinced that the tests are not written properly. The restartAllWithLeftToRightRestarts
is a prime example. It makes timing assumptions that just don't hold given the semantics. I think all the test cases in the suite that cover branch restarts need to be rewritten, which I'm making a start on now...
Sorry about the delay @snoyberg, it's taking me a while to find time to rewrite the tests as I'm dealing with a number of other PRs and tickets. I have not forgotten about this though!
@snoyberg - I've significantly refactored the tests, so as to remove the potential for races, using the distributed-process tracing layer to capture the order of events in disparate green threads instead of relying on the timing of monitor signals arriving (which isn't guaranteed by the runtime with respect to ordering). Hopefully supervisor will no longer crash, so I'm going to relax the upper bounds on dependencies a bit, and submit a PR to get it integrated back into stackage.