docker / libkv

Distributed key/value store abstraction library
Apache License 2.0
853 stars 205 forks source link

Zookeeper missed watched events #157

Open convergentio opened 7 years ago

convergentio commented 7 years ago

There were a few issues that I spotted with respect to ZK watch handling in the libkv library.

1) When setting a watch the library calls GetW() on the zookeeper client, but it didn't handle the race condition that Amir Malekpour fixed previously by sync'ing and retrying.

2) In the implementation of Store.Watch(), which watches for changes in a specific key, clients could miss changes in the value of interest due to the following pattern in the implementation:

  1. get value of key
  2. send value of key to client on channel
  3. get value of key and set watch, but ignore value of key
  4. when watch fires, get value of key and send value of key to client on channel

The above has been reduced to:

  1. get value of key and set watch, send value of key to client on channel

3) In the implementation of Store.WatchTree(), which watches for changes in the children of specific key, clients could miss events due to the following pattern in the implementation:

  1. get children of the key
  2. send child list to the client over the channel
  3. get children and set watch, but ignore the set of children
  4. when watch fires, get children of key and send child list to the client over the channel

Step 4 was problematic because any failure to get the children after the event was fired would result in clients missing the change to the set of children until the watch fired again due to a subsequent change.

The implementation has been reduced to:

  1. get children and set watch, send children to client on channel and retry immediately if values of child nodes could not be read
convergentio commented 7 years ago

The CI failure is fixed by #154