gnocchixyz / gnocchi

Timeseries database
Apache License 2.0

Gnocchi metrics aggregation missing granularity exception #148

Closed mariusmucenicu closed 7 years ago

mariusmucenicu commented 7 years ago

Hello,

We use OpenStack and the Gnocchi service for our project, and we ran into unusual behavior while trying to aggregate metrics for "instance_network_interface".

I've traced the exception to its root: if the requested granularity does not exist when the aggregation runs, an exception is raised, but it doesn't fail gracefully. Instead, the request fails with an HTML 500 response (I'm guessing this is the standard behavior when pecan can't handle an exception).

I'll get right to it: I'm looping through some network metrics as follows (this is just a sample of the code):

metrics = (('network.incoming.bytes.rate', 'bytes_in'),
           ('network.incoming.packets.rate', 'packets_in'),
           ('network.outgoing.bytes.rate', 'bytes_out'),
           ('network.outgoing.packets.rate', 'packets_out'))

for metric in metrics:
    response[metric[1]] = self.gnocchi_admin.metric.aggregation(metrics=metric[0],
                                                                granularity=granularity,
                                                                start=period_start,
                                                                stop=period_end,
                                                                reaggregation='mean',
                                                                resource_type=resource_type,
                                                                query={'=': {'instance_id': resource_id}})

gnocchi_admin is the Python Gnocchi client. It uses gnocchiclient.client.metric.aggregation, which in turn sends a POST request to v1/aggregation/resource/instance_network_interface/metric/ with some query params (granularity, reaggregation, start and end dates).
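To make the shape of that request more concrete, here is a rough sketch of how such a URL could be assembled. The helper `build_aggregation_url`, the host and port, the placement of the metric name at the end of the path, and the exact query parameter names (`start`, `stop`) are assumptions for illustration; the real client may build the request differently.

```python
from urllib.parse import urlencode


def build_aggregation_url(base_url, resource_type, metric_name, **params):
    """Hypothetical helper (not part of gnocchiclient) that assembles the URL."""
    # Path layout assumed from the endpoint mentioned above.
    path = "v1/aggregation/resource/{}/metric/{}".format(resource_type, metric_name)
    # Drop unset parameters and sort the rest so the result is deterministic.
    query = urlencode(sorted((k, v) for k, v in params.items() if v is not None))
    return "{}/{}?{}".format(base_url.rstrip("/"), path, query)


url = build_aggregation_url(
    "http://gnocchi.example.com:8041",  # placeholder endpoint
    "instance_network_interface",
    "network.incoming.bytes.rate",
    granularity=1800,
    reaggregation="mean",
    start="2017-06-27T00:00:00",
    stop="2017-06-27T16:00:00",
)
print(url)
```

The granularity travels as a plain query parameter, so nothing on the client side checks it before the server tries to look it up.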

However, this fails with an HTML 500 response, as follows:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator at 
 [no address given] to inform them of the time this error occurred,
 and the actions you performed just before this error.</p>
<p>More information about this error may be available
in the server error log.</p>
</body></html>
 (HTTP 500)

I connected to the Gnocchi server and traced the error to the following traceback:

2017-06-27 16:04:15.381016 mod_wsgi (pid=11079): Exception occurred processing WSGI script '/var/www/cgi-bin/gnocchi/gnocchi-api'.
2017-06-27 16:04:15.381105 Traceback (most recent call last):
2017-06-27 16:04:15.381133   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/dec.py", line 130, in __call__
2017-06-27 16:04:15.381693     resp = self.call_func(req, *args, **self.kwargs)
2017-06-27 16:04:15.381722   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/dec.py", line 195, in call_func
2017-06-27 16:04:15.381945     return self.func(req, *args, **kwargs)
2017-06-27 16:04:15.381977   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/oslo_middleware/base.py", line 126, in __call__
2017-06-27 16:04:15.382352     response = req.get_response(self.application)
2017-06-27 16:04:15.382390   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/request.py", line 1299, in send
2017-06-27 16:04:15.383169     application, catch_exc_info=False)
2017-06-27 16:04:15.383197   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/request.py", line 1263, in call_application
2017-06-27 16:04:15.383230     app_iter = application(self.environ, start_response)
2017-06-27 16:04:15.383513   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/paste/urlmap.py", line 216, in __call__
2017-06-27 16:04:15.383972     return app(environ, start_response)
2017-06-27 16:04:15.383999   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/dec.py", line 130, in __call__
2017-06-27 16:04:15.384248     resp = self.call_func(req, *args, **self.kwargs)
2017-06-27 16:04:15.384273   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/dec.py", line 195, in call_func
2017-06-27 16:04:15.384632     return self.func(req, *args, **kwargs)
2017-06-27 16:04:15.384656   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/oslo_middleware/base.py", line 126, in __call__
2017-06-27 16:04:15.384878     response = req.get_response(self.application)
2017-06-27 16:04:15.384903   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/request.py", line 1299, in send
2017-06-27 16:04:15.385068     application, catch_exc_info=False)
2017-06-27 16:04:15.385089   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/request.py", line 1263, in call_application
2017-06-27 16:04:15.385282     app_iter = application(self.environ, start_response)
2017-06-27 16:04:15.385306   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/dec.py", line 130, in __call__
2017-06-27 16:04:15.385324     resp = self.call_func(req, *args, **self.kwargs)
2017-06-27 16:04:15.385332   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/dec.py", line 195, in call_func
2017-06-27 16:04:15.385345     return self.func(req, *args, **kwargs)
2017-06-27 16:04:15.385354   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/keystonemiddleware/auth_token/__init__.py", line 335, in __call__
2017-06-27 16:04:15.388294     response = req.get_response(self._app)
2017-06-27 16:04:15.388486   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/request.py", line 1299, in send
2017-06-27 16:04:15.388690     application, catch_exc_info=False)
2017-06-27 16:04:15.388994   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/request.py", line 1263, in call_application
2017-06-27 16:04:15.389161     app_iter = application(self.environ, start_response)
2017-06-27 16:04:15.389316   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/webob/exc.py", line 1169, in __call__
2017-06-27 16:04:15.390364     return self.application(environ, start_response)
2017-06-27 16:04:15.390844   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/gnocchi/rest/app.py", line 68, in __call__
2017-06-27 16:04:15.391385     return self.app(environ, start_response)
2017-06-27 16:04:15.391548   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/pecan/middleware/recursive.py", line 56, in __call__
2017-06-27 16:04:15.391842     return self.application(environ, start_response)
2017-06-27 16:04:15.392120   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/pecan/core.py", line 840, in __call__
2017-06-27 16:04:15.392606     return super(Pecan, self).__call__(environ, start_response)
2017-06-27 16:04:15.392744   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/pecan/core.py", line 683, in __call__
2017-06-27 16:04:15.392924     self.invoke_controller(controller, args, kwargs, state)
2017-06-27 16:04:15.393217   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/pecan/core.py", line 574, in invoke_controller
2017-06-27 16:04:15.393364     result = controller(*args, **kwargs)
2017-06-27 16:04:15.393511   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/gnocchi/rest/__init__.py", line 1505, in post
2017-06-27 16:04:15.394345     granularity, needed_overlap, fill, refresh, resample)
2017-06-27 16:04:15.394507   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/gnocchi/rest/__init__.py", line 1614, in get_cross_metric_measures_from_objs
2017-06-27 16:04:15.394683     granularity, resample)
2017-06-27 16:04:15.394806   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/gnocchi/storage/_carbonara.py", line 156, in get_measures
2017-06-27 16:04:15.397060     from_timestamp, to_timestamp)
2017-06-27 16:04:15.397473   File "/openstack/venvs/gnocchi-15.1.4/lib/python2.7/site-packages/gnocchi/storage/_carbonara.py", line 187, in _get_measures_timeserie
2017-06-27 16:04:15.397668     raise storage.GranularityDoesNotExist(metric, granularity)
2017-06-27 16:04:15.397829 GranularityDoesNotExist: Granularity '1800.0' for metric 615e740b-d897-4f69-9d60-c5b48c1b6293 does not exist

So the next step was to follow the traceback, and I found the point where it fails: https://github.com/gnocchixyz/gnocchi/blob/master/gnocchi/rest/__init__.py#L1653

That is where it failed in my case, but it could fail anywhere in the try/except block if the granularity is not found.

So I added an

except GranularityDoesNotExist as e:
    abort(404, e)

at the bottom, and everything worked fine (it failed gracefully), which in turn let me catch such exceptions explicitly.
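The idea behind that change can be sketched outside of Gnocchi and pecan. In the minimal example below, the exception class is a stand-in for the one in the traceback, and `get_measures` and `handle_request` are hypothetical names, not the actual Gnocchi code. It only illustrates mapping an expected storage error to a 404 instead of letting it escape as a 500.

```python
class GranularityDoesNotExist(Exception):
    """Stand-in for the storage exception seen in the traceback."""

    def __init__(self, metric, granularity):
        self.metric = metric
        self.granularity = granularity
        super().__init__(
            "Granularity '%s' for metric %s does not exist" % (granularity, metric))


def get_measures(metric, granularity, available):
    # Hypothetical storage lookup: fails when the granularity is not configured.
    if granularity not in available:
        raise GranularityDoesNotExist(metric, granularity)
    return [("2017-06-27T16:00:00", granularity, 42.0)]


def handle_request(metric, granularity, available):
    """Return (status, body), turning the expected failure into a 404."""
    try:
        return 200, get_measures(metric, granularity, available)
    except GranularityDoesNotExist as e:
        # Without this clause the exception would propagate up to the WSGI
        # server and surface as a generic 500 page.
        return 404, str(e)


status, body = handle_request(
    "615e740b-d897-4f69-9d60-c5b48c1b6293", 1800.0, {60.0, 300.0})
print(status, body)
```

With a 404 and a readable message, a client can tell a missing granularity apart from a real server fault.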

Note that we know what caused the problem in the first place: we didn't have that granularity defined in the archive policy, with an archive policy rule pointing to our metrics. We could have avoided the exception by configuring that, but there are a lot of configurations out there, and I'm guessing this is an issue.
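For anyone who wants to guard against this on the client side in the meantime, a pre-check along these lines might help. This is only a sketch: the dictionary shape (a `definition` list whose entries carry a `granularity` in seconds) is an assumption about how an archive policy could be represented, and `has_granularity` is a hypothetical helper.

```python
def has_granularity(archive_policy, granularity):
    """Return True if the policy defines the requested granularity (in seconds)."""
    return any(
        float(definition["granularity"]) == float(granularity)
        for definition in archive_policy.get("definition", [])
    )


# Example policy with 1-minute and 5-minute granularities only (made-up values).
policy = {
    "name": "example-policy",
    "definition": [
        {"granularity": 60, "points": 1440},
        {"granularity": 300, "points": 2016},
    ],
}

print(has_granularity(policy, 300))   # defined in the policy
print(has_granularity(policy, 1800))  # not defined, the case that triggered the 500
```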

Thanks in advance

jd commented 7 years ago

Can you specify which version of Gnocchi you're running?

mariusmucenicu commented 7 years ago

Sure, gnocchi==3.1.4 gnocchiclient==3.0.0

jd commented 7 years ago

Thanks @mariusmucenicu. I wonder if it's not a duplicate of https://github.com/gnocchixyz/gnocchi/issues/69, could you upgrade to 3.1.6 to check that this is not already fixed?

Edit: never mind, I don't think it is. I misread the bug :)

jd commented 7 years ago

This has been fixed and backported!