Closed lap1817 closed 6 years ago
@juchem , I added the logic to bubble up the connection errors when they happen too many times. PTAL
@igor47 , PTAL. The RSpec specs are updated to address your previous comments on using context
@igor47 , addressed the comments. please take a look
@igor47 , PTAL
the change and accompanying specs look much better, thanks
This PR follows the same idea as the PR on Synapse (https://github.com/airbnb/synapse/pull/250).
The current logic in Nerve is: if the reporter (e.g. the zookeeper reporter) raises an error during ping?, report_up, or report_down, the watcher thread stops the reporter (so it tries to remove the znode). The Nerve main thread then reaps the watcher and relaunches it.
This causes unnecessary Nerve status flipping during short network disconnections. Of the reporter methods, ping? is the one the watcher thread calls most frequently. E.g.,
Flipping Nerve status triggers more Synapse reads, which makes network saturation more likely.
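A minimal sketch of the idea mentioned in the comments above (bubble up connection errors only after they have happened too many times), to illustrate the behavior rather than reproduce the code in this PR. The names `FlakyReporter`, `TolerantPinger`, and `max_failures` are hypothetical and not taken from Nerve; the threshold of 3 is an arbitrary assumption.

```ruby
# Hypothetical stand-in for a reporter whose ping? sometimes raises.
# Not Nerve's actual reporter class.
class FlakyReporter
  def initialize(outcomes)
    @outcomes = outcomes.dup
  end

  # Raises IOError when the next scripted outcome is :error, else returns true.
  def ping?
    outcome = @outcomes.shift
    raise IOError, 'connection lost' if outcome == :error
    true
  end
end

# Hypothetical wrapper: tolerates transient ping? errors and only re-raises
# after max_failures consecutive errors (an assumed, arbitrary threshold).
class TolerantPinger
  attr_reader :consecutive_failures

  def initialize(reporter, max_failures: 3)
    @reporter = reporter
    @max_failures = max_failures
    @consecutive_failures = 0
  end

  # Returns true on a successful ping and false on a tolerated failure.
  # Re-raises the original error once the failure threshold is reached.
  def ping
    @reporter.ping?
    @consecutive_failures = 0 # any success resets the counter
    true
  rescue StandardError
    @consecutive_failures += 1
    raise if @consecutive_failures >= @max_failures
    false
  end
end
```

With something like this in the watcher loop, a short disconnection would only produce tolerated failures, and the existing stop-and-relaunch path would be taken only when the error is finally re-raised.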
Changes
Test
Rspec added/updated
Tested on mango-test with simulated packet loss
Reviewers: @igor47 @jolynch @juchem @ken-experiment