gnocchixyz / gnocchi

Timeseries database
Apache License 2.0

Cannot process new measures if using Ceph cache. #1051

Closed (AKS74n closed this issue 4 years ago)

AKS74n commented 4 years ago

Before reporting an issue on Gnocchi, please be sure to provide all necessary information.

Which version of Gnocchi are you using

4.3.2

How to reproduce your problem

Re-attach the cache tier to the gnocchi pool.

What is the result that you get

Errors in gnocchi-metricd.log, and no measure data in any metric.

What is the result that you expected

No errors.

Hi, we're using Gnocchi 4.3.2 with Ceph Luminous (12.2.2); the whole platform is OpenStack Rocky.

We found that gnocchi-metricd keeps raising the error below:

2019-11-19 05:34:33,852 [66] ERROR gnocchi.chef: Error processing new measures
Traceback (most recent call last):
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/chef.py", line 165, in process_new_measures_for_sack
    for metric in metrics
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 714, in add_measures_to_metrics
    before_truncate_callback=_map_compute_splits_operations,
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/carbonara.py", line 364, in set_values
    return_value = before_truncate_callback(self)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 702, in _map_compute_splits_operations
    new_first_block_timestamp)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 522, in _compute_split_operations
    {metric: aggregations_needing_list_of_keys})[metric]
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 268, in _list_split_keys
    for metric in metrics))
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/utils.py", line 316, in parallel_map
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/concurrent/futures/_base.py", line 641, in result_iterator
    yield fs.pop().result()
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/concurrent/futures/_base.py", line 462, in result
    return self.__get_result()
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/concurrent/futures/thread.py", line 63, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/utils.py", line 316, in <lambda>
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/storage/ceph.py", line 168, in _list_split_keys_unbatched
    raise storage.MetricDoesNotExist(metric)
MetricDoesNotExist: Metric 45efe421-103c-4a8c-a144-f314d2c0771b does not exist

We also found that the gnocchi-cache pool had only 14 bytes in use, and the main gnocchi pool held no data at all. No metric is working: the log shows the same error for every metric.

After some investigation, we found that the Ceph cache tier is the cause. Everything works fine with local storage, and also on another platform whose Ceph does not use a cache tier. We removed the cache tier from the gnocchi pool and ran gnocchi-upgrade again, and everything went back to normal.
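For anyone who wants to check how widespread the error is in their own gnocchi-metricd.log, here is a minimal Python sketch that counts MetricDoesNotExist errors per metric ID. It only assumes the final traceback line has the format shown above; the function name count_missing_metrics and the sample lines are hypothetical and not part of Gnocchi.

```python
import re
from collections import Counter

# Matches the last line of the traceback, e.g.
# "MetricDoesNotExist: Metric 45efe421-... does not exist".
# Assumption: metric IDs are 36-character UUID strings, as in the log above.
_MISSING_METRIC = re.compile(
    r"MetricDoesNotExist: Metric ([0-9a-fA-F-]{36}) does not exist"
)


def count_missing_metrics(log_lines):
    """Return a Counter mapping metric ID -> number of MetricDoesNotExist errors."""
    counts = Counter()
    for line in log_lines:
        match = _MISSING_METRIC.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative input only; in practice, pass an open log file instead of a list.
sample_lines = [
    "2019-11-19 05:34:33,852 [66] ERROR gnocchi.chef: Error processing new measures",
    "MetricDoesNotExist: Metric 45efe421-103c-4a8c-a144-f314d2c0771b does not exist",
]
for metric_id, hits in count_missing_metrics(sample_lines).most_common():
    print(metric_id, hits)
```

If nearly every metric ID shows up, as it did for us, the problem is likely at the storage level rather than with individual metrics.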

Does Gnocchi support Ceph cache tiering? If so, how should I set up the cache tier for the gnocchi pool?

Many thanks, Eddie.

jd commented 4 years ago

I've no clue what this is and I never heard of it. I'm pretty sure nobody ever tested it so I'm not surprised it does not work. :)

AKS74n commented 4 years ago

> I've no clue what this is and I never heard of it. I'm pretty sure nobody ever tested it so I'm not surprised it does not work. :)

Thanks for your reply! So it's better to avoid using a Ceph cache tier on the gnocchi pool until the root cause is found.

I'll leave this here as a record for anyone who hits the same issue when trying to use a cache tier on the gnocchi pool in Ceph.