Closed pferrot closed 8 years ago
the zk.get would fail as well if the parent does not exist, plus it will leave the watch on.. how do you reproduce this bug ?
From the behavior, it does not seem that the watch is on when zk.get makes a NoNodeError (actually it would be weird if it were on since an exception is thrown, would it not?).
I can reproduce by adding pods that should trigger a notification to the watcher pod after the watcher pod is deployed. But it is not constantly reproducible since it is a race condition issue. I can pretty often reproduce though.
ok - give me a scenario to test against in the CLI. I'll look at it today whenever i can.
Well, the scenario is:
Again: since this is a race condition issue, the test may succeed from time to time.
I fixed the issue in my fork and it seems to work as expected every time now, see https://github.com/pferrot/ochopod/blob/715d271d2a549e033a7d1f32c7e06a1eb890c0c4/ochopod/watchers/remote.py#L107-L110
ok - fix mirroring you branch coming within the next 6 hours .. i'm totally swamped today :(
aaah no i get it - this is because we reset the flip-flop .. the normal case does not require that extra call to exists() because if it fails (snapshot not there yet) the self.query flip-flop will be left to true..
man i told you i'm getting too old to code..
pod 1.0.7 updated
I am afraid that there is a bug with wildcard dependencies.
In my pull request, there was special handling in case the parent node exists but the 'snapshot' node does not exist yet, see https://github.com/autodesk-cloud/ochopod/pull/32/files and those few lines in particular:
It seems that this logic has been dropped when the pull request has been integrated. As a result, it can happen that a pod fails to be notified when a "wildcard dependency" pod joins.