gnocchixyz / gnocchi

Timeseries database
Apache License 2.0

Gnocchi database is not deleting old revisions #1182

Open tomm144 opened 2 years ago

tomm144 commented 2 years ago

Before reporting an issue on Gnocchi, please be sure to provide all necessary information.

Which version of Gnocchi are you using

Gnocchi 4.3.3

How do I correctly clean up old revisions that were not deleted? I found revisions that belong to deleted resources. As a consequence, the MySQL table resource_history is growing constantly. Thanks.

tobias-urdin commented 2 years ago

I don't think there is any such logic yet. It could probably be implemented in the chef part [1].

[1] https://github.com/gnocchixyz/gnocchi/blob/master/gnocchi/chef.py

tobias-urdin commented 2 years ago

I tested this in a Gnocchi 4.3.6 setup and could not reproduce it. I deleted the resource using the Gnocchi API (gnocchi resource delete <id>) and the three revisions in the resource_history table were correctly cascade-deleted.

Can you reproduce the issue or did I misinterpret something?

tomm144 commented 2 years ago

Hi @tobias-urdin, you understood my problem. I use gnocchi resource delete "ended_at<'-60 days'" to delete ended resources, but the database (resource_history table) still contains data that is one year old. Am I doing something wrong?

tobias-urdin commented 2 years ago

Can you please post the output of the command with --debug? Make sure to censor any sensitive data. I would like to see what the request looks like so I can trace this further.

tomm144 commented 2 years ago

@tobias-urdin here is output with debug. This command returns empty body, assuming there aren't resources older than 50 days.

-ntb:~$ gnocchi --debug resource search "ended_at<'-50 days'"
REQ: curl -g -i -X GET https://cloud_url:5000/v3 -H "Accept: application/json" -H "User-Agent: gnocchi keystoneauth1/4.0.0 python-requests/2.22.0 CPython/3.8.10"
Starting new HTTPS connection (1): cloud_url:5000
https://cloud_url:5000 "GET /v3 HTTP/1.1" 200 256
RESP: [200] Connection: close Content-Length: 256 Content-Security-Policy: default-src 'self' https: wss:; Content-Type: application/json Date: Wed, 14 Sep 2022 09:35:02 GMT Server: nginx/1.14.0 (Ubuntu) Vary: X-Auth-Token X-Content-Type-Options: nosniff X-Frame-Options: DENY X-XSS-Protection: 1; mode=block x-openstack-request-id: req-b3555fa4-a4ee-4261-afd5-8dd8f3430771
RESP BODY: {"version": {"status": "stable", "updated": "2019-01-22T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.12", "links": [{"href": "https://cloud_url/v3/", "rel": "self"}]}}
GET call to https://cloud_url:5000/v3 used request id req-b3555fa4-a4ee-4261-afd5-8dd8f3430771
Making authentication request to https://cloud_url:5000/v3/auth/tokens
Resetting dropped connection: cloud_url
https://cloud_url:5000 "POST /v3/auth/tokens HTTP/1.1" 201 9487
REQ: curl -g -i -X POST https://cloud_url:8041/v1/search/resource/generic?filter=ended_at%3C%27-50%20days%27 -H "Accept: application/json, */*" -H "Content-Type: application/json" -H "User-Agent: gnocchi keystoneauth1/4.0.0 python-requests/2.22.0 CPython/3.8.10" -H "X-Auth-Token: "
Starting new HTTPS connection (1): cloud_url:8041
https://cloud_url:8041 "POST /v1/search/resource/generic?filter=ended_at%3C%27-50%20days%27 HTTP/1.1" 200 2
RESP: [200] Connection: close Content-Length: 2 Content-Type: application/json Date: Wed, 14 Sep 2022 09:35:03 GMT Server: nginx/1.14.0 (Ubuntu)
RESP BODY: []

tobias-urdin commented 2 years ago

So deleting a resource should mark its metrics as status=delete, and the gnocchi-metricd daemon should purge those metrics after a while (depending on your load). Resource history entries are tied to the resource_id field on metrics in the database.

tomm144 commented 2 years ago

Hi @tobias-urdin, the status oscillates between 0 and 1K:

| storage/number of metric having measures to process | 269 |
| storage/total number of measures to process | 324 |

I'm not familiar with MySQL binary data; I hope I looked for the data the right way (the CLI was run with the --binary-as-hex parameter).

table resource_history

creator: 62108b4d78be4025ae91ccad3c6d3706:a13caacbe5904dd8a2f730bae1950451
started_at: 2022-07-25 23:01:08.917612
revision_start: 2022-07-25 23:14:18.193541
ended_at: 2022-07-25 23:14:17.496766
user_id: 39c446ae6c9782bf05d17f0018e172e2edbc5eb48f9f23cfe6fdf4f5c199e3a6
project_id: 9b3ef74e29bc4ec3a6b5452b504b802d
original_resource_id: d14fbca6-e5a2-4906-b4d9-aa1c9a84673d
revision: 24922826
id: 0xD14FBCA6E5A24906B4D9AA1C9A84673D
revision_end: 2022-07-25 23:14:18.322208
type: volume

table metric

MariaDB [gnocchi]> select HEX(id) from metric where id='0xD14FBCA6E5A24906B4D9AA1C9A84673D';
Empty set (0.000 sec)

tobias-urdin commented 2 years ago

Try with select * from metric where hex(resource_id) = '0xD14FBCA6E5A24906B4D9AA1C9A84673D'; or skip the hex() call if you use --binary-as-hex. You should get the metric (if any) associated with that resource history entry.
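A side note on the comparison value: in MySQL and MariaDB, HEX() returns hex digits without a 0x prefix, and a quoted '0x...' value is a plain string rather than a hexadecimal literal, so it may be worth double-checking an empty result with one of the forms below. This is a minimal sketch that assumes resource_id is stored as BINARY(16), which the --binary-as-hex output in this thread suggests but does not prove. The helper names are hypothetical and only build the query text; they do not talk to the database.

```python
import uuid


def binary_uuid_literal(resource_id: str) -> str:
    """Return an unquoted MySQL/MariaDB hex literal (0x...) for a UUID string."""
    return "0x" + uuid.UUID(resource_id).hex.upper()


def hex_string(resource_id: str) -> str:
    """Return the value HEX(column) would produce for this UUID (no 0x prefix)."""
    return uuid.UUID(resource_id).hex.upper()


# original_resource_id taken from the resource_history row posted in this thread
rid = "d14fbca6-e5a2-4906-b4d9-aa1c9a84673d"

# Form 1: compare the binary column against an unquoted hex literal
print(f"SELECT * FROM metric WHERE resource_id = {binary_uuid_literal(rid)};")

# Form 2: compare HEX(column) against a quoted string without the 0x prefix
print(f"SELECT * FROM metric WHERE HEX(resource_id) = '{hex_string(rid)}';")
```

Either printed statement can be pasted into the mysql client. The --binary-as-hex option only changes how binary values are displayed, so it should not affect which form is needed.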

tomm144 commented 2 years ago

@tobias-urdin it seems there is no metric.

/home/ubuntu# mysql --binary-as-hex
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 46
Server version: 10.3.36-MariaDB-1:10.3.36+maria~ubu2004-log mariadb.org binary distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [gnocchi]> select * from metric where resource_id = '0xD14FBCA6E5A24906B4D9AA1C9A84673D';
Empty set (0.031 sec)

tobias-urdin commented 2 years ago

So yeah, as I thought in https://github.com/gnocchixyz/gnocchi/issues/1182#issuecomment-1040834877, this is currently a flaw in Gnocchi.

We need to introduce either 1) logic that purges orphaned resource history entries after a delete, or 2) a change to delete_resource() and delete_resources(), which are called in the REST API controller here [1] and here [2], so that they remove the resource history, set the metric status to delete, and then delete the resource from the indexer.

I don't have time to look into this right now, but perhaps I can in the near future.

[1] https://github.com/gnocchixyz/gnocchi/blob/master/gnocchi/rest/api.py#L1111
[2] https://github.com/gnocchixyz/gnocchi/blob/master/gnocchi/rest/api.py#L1236
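Option 1 (purging orphaned resource history entries) could look roughly like the sketch below. It is not Gnocchi code: it runs against an in-memory SQLite database with a deliberately simplified, assumed schema (a resource table and a resource_history table sharing an id column), and purge_orphaned_history is a hypothetical helper name. The real indexer schema has more columns and constraints, so treat this only as an illustration of the idea.

```python
import sqlite3


def purge_orphaned_history(conn: sqlite3.Connection) -> int:
    """Delete history rows whose resource no longer exists; return rows removed."""
    cur = conn.execute(
        "DELETE FROM resource_history "
        "WHERE NOT EXISTS ("
        "  SELECT 1 FROM resource WHERE resource.id = resource_history.id"
        ")"
    )
    conn.commit()
    return cur.rowcount


# Simplified, assumed schema for illustration only
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE resource (id TEXT PRIMARY KEY);
    CREATE TABLE resource_history (
        revision INTEGER PRIMARY KEY,
        id TEXT NOT NULL
    );
    INSERT INTO resource (id) VALUES ('live');
    INSERT INTO resource_history (revision, id)
        VALUES (1, 'live'), (2, 'gone'), (3, 'gone');
    """
)

removed = purge_orphaned_history(conn)
remaining = conn.execute(
    "SELECT revision, id FROM resource_history ORDER BY revision"
).fetchall()
print(removed, remaining)
```

On a production database, it would be safer to run the equivalent SELECT first and take a backup before deleting anything.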

stefanlupsa commented 1 year ago

I can confirm this is also happening in 4.4.1 on a Xena OpenStack deployment, with the resource_history and rt[...]_history (instance resource type for nova) tables growing significantly. Currently they add up to 13GB, whereas in 4.3.4 on a deployment with Train this behavior is not present.

resource_history holds mostly duplicated data with only revision and revision timestamps changing across entries.

tobias-urdin commented 1 year ago

This bug is about resource history not being properly deleted in the indexer when a resource is deleted.

I've seen the behaviour you are describing when a resource has metadata/attributes that change a lot when polled, which adds a resource history entry on each iteration. In our case that was caused by a bug/issue that kept changing the display_name field.

Can you correlate any specific differences between resource histories on a resource?
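One rough way to look for such differences is to compare consecutive revisions of a single resource and count which attributes change. The sketch below assumes the resource_history rows have already been exported as Python dictionaries; changed_fields is a hypothetical helper and the sample rows are made up for illustration. Only the revision, revision_start, revision_end and display_name field names come from this thread.

```python
from collections import Counter

# Fields that differ on every revision by design, so they are not interesting
IGNORED = {"revision", "revision_start", "revision_end"}


def changed_fields(revisions):
    """Count how often each attribute differs between consecutive revisions."""
    ordered = sorted(revisions, key=lambda row: row["revision"])
    counts = Counter()
    for prev, curr in zip(ordered, ordered[1:]):
        for key in (prev.keys() | curr.keys()) - IGNORED:
            if prev.get(key) != curr.get(key):
                counts[key] += 1
    return counts


# Made-up sample rows for one resource
history = [
    {"revision": 1, "revision_end": "t1", "display_name": "vm-a", "type": "instance"},
    {"revision": 2, "revision_end": "t2", "display_name": "vm-a ", "type": "instance"},
    {"revision": 3, "revision_end": "t3", "display_name": "vm-a", "type": "instance"},
]

counts = changed_fields(history)
print(counts.most_common())
```

An attribute that changes on almost every revision, like display_name in this sample, is a likely candidate for the flapping behaviour described above.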

From our internal commit msg:

Nova changed the XML format as was fixed in Ceilometer [1]
but we never fixed this for our custom metrics so we had
a flapping behavior on the display_name metadata which caused
resource history to grow on each iteration causing the DB to
be bigger and bigger.

[1] https://review.opendev.org/c/openstack/ceilometer/+/827967

chvalean commented 1 year ago

Adding some more feedback as we have also observed this. It might not be tied to a gnocchi version necessarily, as we have this problem in gnocchi 4.4.1 but not (or at least not to the same extent) in v4.3.4. The initial report mentions having this with gnocchi 4.3.3.