Open ChrisMacNaughton opened 6 years ago
There are some speed improvements that need to be done to the charmstore but in the meantime you can start by not requesting so much information in that request.
To deploy a charm you shouldn't need a number of those pieces of data. Especially things like stats
and bundles-containing
both of which are slow requests and not required to deploy. You can also try increasing the timeout from 5s to 10s.
I still get "0 items" pretty frequently at https://jujucharms.com/u/openstack-charmers-next - could this be related?
Adjusting to 30 seconds for this query also is timing out:
theblues.errors.ServerError: Request timed out: https://api.jujucharms.com/charmstore/v5/~openstack-charmers-next/bionic/neutron-gateway/meta/any?include=bundle-machine-count&include=bundle-metadata&include=bundle-unit-count&include=bundles-containing&include=charm-actions&include=charm-config&include=charm-metadata&include=common-info&include=extra-info&include=revision-info&include=stats&include=supported-series&include=manifest&include=tags&include=promulgated&include=perm&include=id timeout: 30.0
It's breaking tooling all over. What is the expected time that this query should take, please?
More timeouts
Traceback (most recent call last):
File "/snap/conjure-up/1021/lib/python3.6/site-packages/theblues/charmstore.py", line 57, in _get
timeout=self.timeout)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/requests/api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/raven/breadcrumbs.py", line 326, in send
resp = real_send(self, request, *args, **kwargs)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/requests/adapters.py", line 521, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='api.jujucharms.com', port=443): Read timed out. (read timeout=5)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/snap/conjure-up/1021/lib/python3.6/site-packages/conjureup/controllers/juju/deploy/common.py", line 30, in do_deploy
await app.juju.client.deploy(fn)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/juju/model.py", line 1281, in deploy
await handler.execute_plan()
File "/snap/conjure-up/1021/lib/python3.6/site-packages/juju/model.py", line 1925, in execute_plan
result = await method(*step.args)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/juju/model.py", line 2067, in deploy
charm)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/juju/model.py", line 1338, in _add_store_resources
entity = await self.charmstore.entity(entity_url)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/juju/model.py", line 2161, in coro
return await self.loop.run_in_executor(None, method)
File "/snap/conjure-up/1021/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/theblues/charmstore.py", line 136, in entity
return self._meta(entity_id, includes, channel=channel)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/theblues/charmstore.py", line 107, in _meta
data = self._get(url)
File "/snap/conjure-up/1021/lib/python3.6/site-packages/theblues/charmstore.py", line 78, in _get
raise ServerError(message)
theblues.errors.ServerError: Request timed out: https://api.jujucharms.com/charmstore/v5/glance-268/meta/any?include=bundle-machine-count&include=bundle-metadata&include=bundle-unit-count&include=bundles-containing&include=charm-config&include=charm-metadata&include=common-info&include=extra-info&include=owner&include=revision-info&include=published&include=stats&include=resources&include=supported-series&include=terms timeout: 5
The CS: timeout issues are prevalent in our CI as well. So to summarize: end users browsing jujucharms.com have a frequent error condition (0 items shown), conjure-up has a high failure rate for end users, my developers are often blocked merges which cannot complete a CI test. Please investigate the back-end of CS.
At present I am greeted with this error on charm search queries like keystone
and ceph
:
I'm also greeted with this error:
Usually, refreshing solves the problem.
The culprit largely seems to be the stats component of queries on the charmstore. If possible omitting stats from the include in query params can speed things up considerably.
@fnordahl @ajkavanagh -- we are working on a deploy of jujucharms.com that should fix searching through the site, using the stats omission workaround.
@ajkavanagh @ryan-beisner @ChrisMacNaughton is the stats data necessary for your current uses? If not I would recommend removing it.
Charmstore folks are working on the issue as well, but in the meantime this approach could mitigate the problems.
@jaycee That library has been in place as-is, for better or for worse, for quite some time. What has changed that suddenly causes this to be an issue?
We can, should and will adjust the theblues library. However, it's a messy sprawl of impact. In order to actually profit, conjure-up and other projects will need to be rebuilt and redistributed with the modified theblues project. Also, other consumers of the theblues library will need to be modified to turn off stats, then and those modified projects will need to be redistributed/released.
FTR, bundletester, amulet also affected:
(func27-noop) ubuntu@beisner-bastion:~/git/lxd$ tests/gate-basic-bionic-queens
2018-09-18 10:15:12,876 __init__ INFO: OpenStackAmuletDeployment: init
2018-09-18 10:15:12,876 _add_services INFO: OpenStackAmuletDeployment: adding services
2018-09-18 10:15:12,876 _determine_branch_locations INFO: OpenStackAmuletDeployment: determine branch locations
ERROR:root:Request timed out: https://api.jujucharms.com/charmstore/v5/~openstack-charmers-next/bionic/nova-cloud-controller/meta/any?include=bundle-machine-count&include=bundle-metadata&include=bundle-unit-count&include=bundles-containing&include=charm-actions&include=charm-config&include=charm-metadata&include=common-info&include=extra-info&include=revision-info&include=stats&include=supported-series&include=manifest&include=tags&include=promulgated&include=perm&include=id timeout: 10.0
Traceback (most recent call last):
File "tests/gate-basic-bionic-queens", line 24, in <module>
stable=False)
File "/home/ubuntu/git/lxd/tests/basic_deployment.py", line 57, in __init__
self._add_services()
File "/home/ubuntu/git/lxd/tests/basic_deployment.py", line 95, in _add_services
other_services)
File "/home/ubuntu/git/lxd/tests/charmhelpers/contrib/openstack/amulet/deployment.py", line 151, in _add_services
other_services)
File "/home/ubuntu/git/lxd/tests/charmhelpers/contrib/amulet/deployment.py", line 67, in _add_services
constraints=svc.get('constraints'))
File "/home/ubuntu/git/lxd/.tox/func27-noop/local/lib/python2.7/site-packages/amulet/deployer.py", line 218, in add
service_name, charm, branch=branch, series=service['series'])
File "/home/ubuntu/git/lxd/.tox/func27-noop/local/lib/python2.7/site-packages/amulet/charm.py", line 61, in fetch
series=series)
File "/home/ubuntu/git/lxd/.tox/func27-noop/local/lib/python2.7/site-packages/amulet/charm.py", line 46, in get_charm
return Charm(with_series(charm_path, series))
File "/home/ubuntu/git/lxd/.tox/func27-noop/local/lib/python2.7/site-packages/charmstore/lib.py", line 168, in __init__
super(Charm, self).__init__(id, api=api, timeout=timeout)
File "/home/ubuntu/git/lxd/.tox/func27-noop/local/lib/python2.7/site-packages/charmstore/lib.py", line 115, in __init__
AVAILABLE_INCLUDES).get('Meta')
File "/home/ubuntu/git/lxd/.tox/func27-noop/local/lib/python2.7/site-packages/theblues/charmstore.py", line 107, in _meta
data = self._get(url)
File "/home/ubuntu/git/lxd/.tox/func27-noop/local/lib/python2.7/site-packages/theblues/charmstore.py", line 78, in _get
raise ServerError(message)
theblues.errors.ServerError: Request timed out: https://api.jujucharms.com/charmstore/v5/~openstack-charmers-next/bionic/nova-cloud-controller/meta/any?include=bundle-machine-count&include=bundle-metadata&include=bundle-unit-count&include=bundles-containing&include=charm-actions&include=charm-config&include=charm-metadata&include=common-info&include=extra-info&include=revision-info&include=stats&include=supported-series&include=manifest&include=tags&include=promulgated&include=perm&include=id timeout: 10.0
@ryan-beisner The stats component of the charmstore has always been slow; however, there's presently some issue going on with the underlying tools charmstore relies on causing a general slow down, which exacerbates the stats to the point that we're seeing these slowdowns.
theblues has been updated w/ @ChrisMacNaughton's fix: https://pypi.org/project/theblues/
Additionally, the fix is now live on jujucharms.com
@jaycee Even though theblues is released with the update, the default behavior of it was not changed. That means that we will have to go find all of the calls to theblues in intermediate modules such as amulet, bundletester, and other tools, modify them to tell theblues not to ask for stats, then release those individual intermediate tools.
FTR, we are still seeing prevalent CS timeouts overnight 2018 Sept 18 to 19.
I'm aware that the change is going to need to be made--as I understand it fixes are already underway for python-jujulib and amulet.
The underlying issues on CS are to my knowledge still being worked on--it requires a sizeable amount of work. I would encourage you to make the updates to use theblues b/c the CS fixes may take some time to rollout.
Indeed, we have all of that in flight. Just commenting here so passers-by understand that the theblues changes alone are not going to unblock devs and ci.
Ah, ok, I'm with you now. Thanks!
I'm still hitting >10s response times. You can see it in a openstack review https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_amulet_full/openstack/charm-aodh/603318/7/2056/test_charm_amulet_full_4171/amulet-full.txt . Also a curl command takes >10s:
$ time curl -H 'Cache-Control: no-cache' 'https://api.jujucharms.com/charmstore/v5/~openstack-charmers-next/bionic/ceph-mon/meta/any?include=bundle-machine-count&include=bundle-metadata&include=bundle-unit-count&include=bundles-containing&include=charm-actions&include=charm-config&include=charm-metadata&include=common-info&include=extra-info&include=revision-info&include=supported-series&include=manifest&include=tags&include=promulgated&include=perm' -D- | grep -v ceph
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:10 --:--:-- 0HTTP/1.1 200 OK
Date: Thu, 20 Sep 2018 05:56:20 GMT
Server: Apache/2.4.7 (Ubuntu)
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: Bakery-Protocol-Version, Macaroons, X-Requested-With
0Access-Control-Allow-Methods: DELETE,GET,HEAD,PUT,POST,OPTIONS
Access-Control-Allow-Origin: *
Access-Control-Cache-Max-Age: 600
Access-Control-Expose-Headers: WWW-Authenticate
Content-Type: application/json
X-Cache: MISS from juju-ps45-cdo-jujucharms-machine-27
0X-Cache-Lookup: MISS from juju-ps45-cdo-jujucharms-machine-27:3128
Via: 1.1 juju-ps45-cdo-jujucharms-machine-27 (squid/3.3.8)
Vary: Accept-Encoding
Transfer-Encoding: chunked
100 97763 0 97763 0 0 8828 0 --:--:-- 0:00:11 --:--:-- 26865
real 0m11.086s
user 0m0.021s
sys 0m0.012s```
fwiw subsequent hit a cache and are <2s.
@jaycee ^ that is with stats disabled. We are still hitting prevalent CS timeouts. In an openstack deploy, with ~40 charms, all it takes is 1 charm query to fail to spoil an expensive and lengthy CI run. The result is that our devs cannot get code through our gate to land code currently.
Hi All, the charm store is still timing out on the web UI, reporting 0 items fairly often. What is the status of the solution for this?
Conjure up regularly fails with a timeout fetching charm metadata during a deploy. This makes it impossible to use conjure-up to deploy any solution as the charms are unavailable.