docker / libkv

Distributed key/value store abstraction library
Apache License 2.0
853 stars 205 forks source link

Zookeeper: Make Create+Put on a new znode atomic #148

Open amirma opened 7 years ago

amirma commented 7 years ago

Writes to a new Zookeepr znode should take advantage of Zookeeper's atomic create + write primitive. If not, it is possible that a read that was triggered by a watch will return an empty string. The previous workaround for this does not always work (e.g., when an empty string is returned due to a race) and can potentially cause call-stack overflow. This change-set fixes this race and removes the workaround. It also adds a call to Zookeepeer's Sync() on a Get operation, only when an empty string (or SOH) is returned to guard against an older version of libkv doing create+write in a non atomic fashion.

This change-set addresses github.com/docker/swarm/issues/1915

Signed-off-by: Amir Malekpour amir.malekpour@cohodata.com

dongluochen commented 7 years ago

I'm not familiar with this repo. CI is failing for last few PRs.

0.08s$ script/travis_consul.sh 0.6.3
--2017-02-24 01:32:49--  https://releases.hashicorp.com/consul/0.6.3/consul_0.6.3_linux_amd64.zip
Resolving releases.hashicorp.com (releases.hashicorp.com)... 151.101.33.183
Connecting to releases.hashicorp.com (releases.hashicorp.com)|151.101.33.183|:443... connected.
Unable to establish SSL connection.
dongluochen commented 7 years ago

Ping @docker/core-libkv-maintainers.

CI is also failing.

mcapuccini commented 7 years ago

@dongluochen my PR fixes it https://github.com/docker/libkv/pull/154

amirma commented 7 years ago

@nishanttotla We've merged this in our production repository and I can confirm it has fixed the issue. I think the patch is ready to merge as is.

nishanttotla commented 7 years ago

@amirma thanks for the prompt reply! Seems like your build is failing. Could you take a look at that? It would be nice to merge this.

amirma commented 7 years ago

@nishanttotla I just pushed another build and it failed due to a new (I believe infrastructure-related) reason. Any insights?

nishanttotla commented 7 years ago

It seems like some package paths/locations may have changed:

27.62s$ go get -t -v ./...
github.com/stretchr/testify (download)
github.com/boltdb/bolt (download)
github.com/hashicorp/consul (download)
github.com/coreos/etcd (download)
github.com/coreos/go-semver (download)
github.com/ugorji/go (download)
Fetching https://golang.org/x/net/context?go-get=1
Parsing meta tags from https://golang.org/x/net/context?go-get=1 (status code 200)
get "golang.org/x/net/context": found meta tag main.metaImport{Prefix:"golang.org/x/net", VCS:"git", RepoRoot:"https://go.googlesource.com/net"} at https://golang.org/x/net/context?go-get=1
get "golang.org/x/net/context": verifying non-authoritative meta tag
Fetching https://golang.org/x/net?go-get=1
Parsing meta tags from https://golang.org/x/net?go-get=1 (status code 200)
golang.org/x/net (download)
github.com/samuel/go-zookeeper (download)
github.com/docker/libkv/store
github.com/boltdb/bolt
github.com/hashicorp/consul/vendor/github.com/hashicorp/go-cleanhttp
github.com/hashicorp/consul/vendor/github.com/hashicorp/serf/coordinate
github.com/coreos/etcd/pkg/pathutil
github.com/coreos/etcd/pkg/types
github.com/coreos/go-semver/semver
github.com/ugorji/go/codec
golang.org/x/net/context
github.com/docker/libkv
github.com/stretchr/testify/vendor/github.com/davecgh/go-spew/spew
github.com/stretchr/testify/vendor/github.com/pmezard/go-difflib/difflib
github.com/stretchr/testify/vendor/github.com/stretchr/objx
github.com/samuel/go-zookeeper/zk
github.com/coreos/etcd/version
github.com/hashicorp/consul/api
github.com/stretchr/testify/assert
github.com/docker/libkv/store/boltdb
github.com/docker/libkv/store/zookeeper
github.com/stretchr/testify/mock
github.com/docker/libkv/testutils
github.com/docker/libkv/store/mock
github.com/docker/libkv/store/consul
github.com/coreos/etcd/client
github.com/docker/libkv/store/etcd
0.08s$ script/travis_consul.sh 0.6.3
--2017-02-24 01:32:49--  https://releases.hashicorp.com/consul/0.6.3/consul_0.6.3_linux_amd64.zip
Resolving releases.hashicorp.com (releases.hashicorp.com)... 151.101.33.183
Connecting to releases.hashicorp.com (releases.hashicorp.com)|151.101.33.183|:443... connected.
Unable to establish SSL connection.
unzip:  cannot find or open consul_0.6.3_linux_amd64.zip, consul_0.6.3_linux_amd64.zip.zip or consul_0.6.3_linux_amd64.zip.ZIP.
script/travis_consul.sh: line 18: ./consul: No such file or directory

Ping @docker/core-libkv-maintainers, the CI might need fixing.

abronan commented 7 years ago

The fix is included to my fork: abronan/libkv along with other patches, thanks @amirma.

nishanttotla commented 7 years ago

@abronan should we merge this PR, or might it need more changes? I think this is useful and fixes a long standing issue.

abronan commented 7 years ago

@nishanttotla I think it is safe to merge it yes, just merge the other PRs that are fixing the build first (the commit history on my fork could be of help).