juju / theblues

Python library for the juju charmstore (v4)
GNU Lesser General Public License v3.0

theblues.errors.ServerError: Request timed out: https://api.jujucharms.com/v4...timeout: 3.05 #49

Open ryan-beisner opened 7 years ago

ryan-beisner commented 7 years ago

We have been seeing sustained and noticeable charm store API timeouts in OpenStack Charm CI. These cause false test failures in our charm dev/test gate.

Can this be made more resilient and/or configurable, for example with a retry threshold and timeout backoff?

DEBUG:runner:Traceback (most recent call last):
DEBUG:runner:  File "/tmp/bundletester-MyqTGN/ceph-osd/tests/gate-basic-xenial-mitaka", line 22, in <module>
DEBUG:runner:    deployment = CephOsdBasicDeployment(series='xenial')
DEBUG:runner:  File "/tmp/bundletester-MyqTGN/ceph-osd/tests/basic_deployment.py", line 40, in __init__
DEBUG:runner:    self._add_services()
DEBUG:runner:  File "/tmp/bundletester-MyqTGN/ceph-osd/tests/basic_deployment.py", line 72, in _add_services
DEBUG:runner:    other_services)
DEBUG:runner:  File "/tmp/bundletester-MyqTGN/ceph-osd/tests/charmhelpers/contrib/openstack/amulet/deployment.py", line 147, in _add_services
DEBUG:runner:    other_services)
DEBUG:runner:  File "/tmp/bundletester-MyqTGN/ceph-osd/tests/charmhelpers/contrib/amulet/deployment.py", line 67, in _add_services
DEBUG:runner:    constraints=svc.get('constraints'))
DEBUG:runner:  File "/var/lib/jenkins/checkout/0/ceph-osd/.tox/func27-smoke/local/lib/python2.7/site-packages/amulet/deployer.py", line 208, in add
DEBUG:runner:    service_name, charm, branch=branch, series=service['series'])
DEBUG:runner:  File "/var/lib/jenkins/checkout/0/ceph-osd/.tox/func27-smoke/local/lib/python2.7/site-packages/amulet/charm.py", line 57, in fetch
DEBUG:runner:    series=series)
DEBUG:runner:  File "/var/lib/jenkins/checkout/0/ceph-osd/.tox/func27-smoke/local/lib/python2.7/site-packages/amulet/charm.py", line 42, in get_charm
DEBUG:runner:    return Charm(with_series(charm_path, series))
DEBUG:runner:  File "/var/lib/jenkins/checkout/0/ceph-osd/.tox/func27-smoke/local/lib/python2.7/site-packages/charmstore/lib.py", line 155, in __init__
DEBUG:runner:    super(Charm, self).__init__(id, api)
DEBUG:runner:  File "/var/lib/jenkins/checkout/0/ceph-osd/.tox/func27-smoke/local/lib/python2.7/site-packages/charmstore/lib.py", line 105, in __init__
DEBUG:runner:    AVAILABLE_INCLUDES).get('Meta')
DEBUG:runner:  File "/var/lib/jenkins/checkout/0/ceph-osd/.tox/func27-smoke/local/lib/python2.7/site-packages/theblues/charmstore.py", line 107, in _meta
DEBUG:runner:    data = self._get(url)
DEBUG:runner:  File "/var/lib/jenkins/checkout/0/ceph-osd/.tox/func27-smoke/local/lib/python2.7/site-packages/theblues/charmstore.py", line 78, in _get
DEBUG:runner:    raise ServerError(message)

DEBUG:runner:theblues.errors.ServerError: Request timed out: https://api.jujucharms.com/v4/~openstack-charmers-next/xenial/ceph-mon/meta/any?include=bundle-machine-count&include=bundle-metadata&include=bundle-unit-count&include=bundles-containing&include=charm-actions&include=charm-config&include=charm-metadata&include=common-info&include=extra-info&include=revision-info&include=stats&include=supported-series&include=manifest&include=tags&include=promulgated&include=perm&include=id timeout: 3.05

jcsackett commented 7 years ago

Hey @ryan-beisner--

It looks like you're just using the default timeout for a charmstore object--you can pass in a timeout value when you create one--see https://github.com/juju/theblues/blob/develop/theblues/charmstore.py#L24.

Would that fix things for you?

ryan-beisner commented 7 years ago

Unfortunately, I don't have control over that lever, as theblues is instantiated by Amulet which is wrapped by Bundletester. Ref: traceback.

marcoceppi commented 7 years ago

What's a reasonable timeout if not 3 seconds?



ryan-beisner commented 7 years ago

@marcoceppi That depends on the Interwebs weather on any particular day, which makes this tricky.

Typically, to handle this sort of thing, you'd want a low initial timeout (3s is probably OK), wrapped in a retry loop with a backoff (perhaps 3s, 6s, 12s, 24s) and a max_wait (1m). If the request doesn't come through in 1m, it's probably never going to, and it should bail.
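
A rough sketch of that retry loop, to make the idea concrete. This is not code from theblues or Amulet: `RequestTimeout`, `timeout_schedule`, and `fetch_with_retry` are hypothetical names, and `fetch` stands in for whatever call actually hits the charm store and accepts a timeout in seconds. The sketch reads max_wait as a budget on the sum of the per-attempt timeouts, which is one possible interpretation.

```python
class RequestTimeout(Exception):
    """Hypothetical stand-in for a timeout error such as a ServerError."""


def timeout_schedule(initial=3.0, factor=2.0, max_wait=60.0):
    """Yield growing per-attempt timeouts whose sum stays within max_wait."""
    timeout, spent = initial, 0.0
    while spent + timeout <= max_wait:
        yield timeout
        spent += timeout
        timeout *= factor


def fetch_with_retry(fetch, initial=3.0, factor=2.0, max_wait=60.0):
    """Call fetch(timeout) until it succeeds or the schedule runs out."""
    last_error = None
    for timeout in timeout_schedule(initial, factor, max_wait):
        try:
            return fetch(timeout)
        except RequestTimeout as err:
            # Remember the failure and retry with a longer timeout.
            last_error = err
    if last_error is None:
        raise ValueError("max_wait is smaller than the initial timeout")
    # The budget is spent, so bail with the most recent timeout error.
    raise last_error
```

With the defaults, the schedule is 3s, 6s, 12s, 24s (45s in total); a fifth attempt at 48s would exceed the 1m budget, so the loop gives up and re-raises the last timeout error.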