dims / etcd3-gateway

This repository is now read-only. Please see https://opendev.org/openstack/etcd3gw for the new location for this code.
Apache License 2.0

Fetching lease fails with TTL #1

Open hemna opened 7 years ago

hemna commented 7 years ago

I'm using etcd3gw in OpenStack Cinder and I'm running into an issue where the lease refresh fails because of a missing field.

stack@walt-stack-1 /opt/stack/logs/screen $ pip freeze | grep etcd
etcd3==0.6.2
etcd3gw==0.1.0

Excerpt from the cinder log:

2017-08-23 12:25:16.771 24793 ERROR root [-] Unexpected exception occurred 60 time(s)... retrying.: KeyError: 'TTL'
2017-08-23 12:25:16.771 24793 ERROR root Traceback (most recent call last):
2017-08-23 12:25:16.771 24793 ERROR root   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 250, in wrapper
2017-08-23 12:25:16.771 24793 ERROR root     return infunc(*args, **kwargs)
2017-08-23 12:25:16.771 24793 ERROR root   File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 187, in _beat_forever_until_stopped
2017-08-23 12:25:16.771 24793 ERROR root     wait_until_next_beat = self._driver.heartbeat()
2017-08-23 12:25:16.771 24793 ERROR root   File "/usr/lib/python2.7/site-packages/tooz/drivers/etcd3gw.py", line 191, in heartbeat
2017-08-23 12:25:16.771 24793 ERROR root     lock.heartbeat()
2017-08-23 12:25:16.771 24793 ERROR root   File "/usr/lib/python2.7/site-packages/tooz/drivers/etcd3gw.py", line 38, in wrapper
2017-08-23 12:25:16.771 24793 ERROR root     return func(*args, **kwargs)
2017-08-23 12:25:16.771 24793 ERROR root   File "/usr/lib/python2.7/site-packages/tooz/drivers/etcd3gw.py", line 154, in heartbeat
2017-08-23 12:25:16.771 24793 ERROR root     self._lease.refresh()
2017-08-23 12:25:16.771 24793 ERROR root   File "/usr/lib/python2.7/site-packages/etcd3gw/lease.py", line 64, in refresh
2017-08-23 12:25:16.771 24793 ERROR root     return int(result['result']['TTL'])
2017-08-23 12:25:16.771 24793 ERROR root KeyError: 'TTL'
2017-08-23 12:25:16.771 24793 ERROR root
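
For illustration: the last frame of the traceback evaluates int(result['result']['TTL']), so the KeyError means the response contained a 'result' entry that had no 'TTL' key. Below is a minimal sketch of a more tolerant lookup, assuming the response is a plain dict shaped like the one in the traceback. The helper name ttl_from_response and the -1 fallback are hypothetical and are not part of etcd3gw.

```python
def ttl_from_response(response, default=-1):
    """Return the TTL from a keepalive-style response, or `default` if absent.

    Hypothetical helper: assumes `response` is a dict such as
    {'result': {'TTL': '10'}}, as suggested by the traceback.
    """
    # Tolerate a missing or empty 'result' entry as well as a missing 'TTL'.
    result = response.get('result') or {}
    ttl = result.get('TTL')
    if ttl is None:
        return default
    return int(ttl)
```

A caller could then treat the fallback value as "lease state unknown" instead of crashing the heartbeat loop; whether that is the right behaviour depends on why the field is missing.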

dims commented 7 years ago

@hemna in the

hemna commented 7 years ago

I'm running a single devstack deployment on 2 separate nodes to test Cinder's active/active HA capability with the 3PAR cinder driver. I ran into this while running some rally scenarios.

stack@walt-stack-2 /opt/stack/bin $ ./etcd --version
etcd Version: 3.1.7
Git SHA: 43b7507
Go Version: go1.7.5
Go OS/Arch: linux/amd64

dims commented 7 years ago

@hemna As it's going to be hard to recreate, could you please log the response we get from self.client.post? https://github.com/dims/etcd3-gateway/blob/master/etcd3gw/lease.py#L62
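
One way to capture that response is sketched below, assuming the post callable accepts a URL and a json= keyword argument. The wrapper name post_and_log and the log message are hypothetical, not existing etcd3gw code.

```python
import logging

LOG = logging.getLogger(__name__)


def post_and_log(post, url, payload):
    """Call `post(url, json=payload)` and log the raw result before returning it.

    Hypothetical wrapper: `post` stands in for a callable such as
    self.client.post; its exact signature is an assumption.
    """
    result = post(url, json=payload)
    # Log the whole response so a missing 'TTL' field is visible in the log.
    LOG.warning("lease keepalive response for %r: %r", payload, result)
    return result
```

Logging at warning level keeps the message visible without enabling debug output for the whole service.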

jharbott commented 7 years ago

This seems to happen sporadically in the gate, see http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%20%5C%22KeyError%3A%20'TTL'%5C%22

From the timestamps it might be related to etcd getting slow:

http://logs.openstack.org/93/506093/4/check/gate-tempest-dsvm-cells-ubuntu-xenial/407192b/logs/screen-c-vol.txt.gz?#_Sep_22_00_24_50_163847

http://logs.openstack.org/93/506093/4/check/gate-tempest-dsvm-cells-ubuntu-xenial/407192b/logs/screen-etcd.txt.gz#_Sep_22_00_24_50_031662

dims commented 7 years ago

Looks like they have a FAQ just for this scenario: https://coreos.com/etcd/docs/latest/faq.html#performance