aminvakil opened this issue 2 years ago
cc @BYK @chadwhitacre @hubertdeng123
There are a few interesting and important things:

1. I would not recommend the VACUUM FULL public.nodestore_node; part in any kind of production setup, as it will lock the whole table for an indefinite time. It also requires additional space on disk, so it has a high potential to put the system in an inoperable and hard-to-recover state.
2. The sentry cleanup command that we trigger via the cron does run nodestore.cleanup(): https://github.com/getsentry/sentry/blob/1d24b160a74e10759d20fa323f5251bffee3f7b5/src/sentry/runner/commands/cleanup.py#L260-L268 which is defined at https://github.com/getsentry/sentry/blob/1d24b160a74e10759d20fa323f5251bffee3f7b5/src/sentry/nodestore/django/backend.py#L54-L62

So technically, this should work if you set SENTRY_RETENTION_DAYS to 90 or less, which is the default. The only missing thing I see here is an explicit VACUUM call, but autovacuum should handle that already: https://www.postgresql.org/docs/12/routine-vacuuming.html#AUTOVACUUM
Now we need to figure out which part of the puzzle is broken:

1. Does the cron run the nodestore.cleanup() call successfully?
2. Does that call issue the DELETE commands?
3. Does Postgres run a VACUUM command after a cleanup operation?

I have nothing to add to what BYK said. This should work out of the box assuming you run sentry cleanup.
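One quick way to check (1) and (2), assuming direct psql access to the Sentry database, is to look at the oldest row still in the table; if the cleanup DELETEs are running, nothing should be much older than the configured retention (90 days by default):

-- If cleanup is deleting as expected, the oldest remaining row should be
-- no older than the retention window (90 days is the Sentry default).
-- Note: this may scan the whole table if there is no index on "timestamp".
SELECT min("timestamp") AS oldest_row,
       NOW() - min("timestamp") AS oldest_row_age
FROM public.nodestore_node;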
I lost all event details.
Did you lose all event details because you ran the command without substituting a real value for SENTRY_RETENTION_DAYS, is that what you mean? Does Postgres silently accept INTERVAL 'SENTRY_RETENTION_DAYS' as equivalent to zero, something like that?
This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!

"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀
I lost all event details.

Did you lose all event details because you ran the command without substituting a real value for SENTRY_RETENTION_DAYS, is that what you mean? Does Postgres silently accept INTERVAL 'SENTRY_RETENTION_DAYS' as equivalent to zero, something like that?
Sorry for the late response. No, I ran exactly this command: DELETE FROM public.nodestore_node WHERE "timestamp" < NOW() - INTERVAL '1 day';
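For comparison, with a real retention value substituted (90 days being the Sentry default for SENTRY_RETENTION_DAYS), the statement would presumably look more like this:

-- Same cleanup DELETE, but keeping the default 90-day retention window
-- instead of removing everything older than one day.
DELETE FROM public.nodestore_node
WHERE "timestamp" < NOW() - INTERVAL '90 days';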
This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!

"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀
We hit this problem as well. nodestore_node was 42GB and the table had never been vacuumed.

We ran the sentry-cleanup service, which did its thing but did not reduce disk space. We ran vacuum analyze, which did not reduce disk space. We ran vacuum full nodestore_node, and that did reduce disk space to 3GB. Postgres also then reported that vacuum had been executed one time:
postgres=# select relname, vacuum_count, n_tup_del, n_live_tup, n_dead_tup
from pg_stat_sys_tables
where relname='pg_toast_20247';
relname | vacuum_count | n_tup_del | n_live_tup | n_dead_tup
----------------+--------------+-----------+------------+------------
pg_toast_20247 | 1 | 62889277 | 1840323 | 0
(1 row)
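As a side note, the number in a TOAST table's name is the OID of its parent table, so the relation above can be mapped back to the table that owns it (presumably nodestore_node here) with a catalog query along these lines:

-- Which table does pg_toast_20247 belong to?
SELECT relname
FROM pg_class
WHERE reltoastrelid = 'pg_toast.pg_toast_20247'::regclass;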
So it seems that (3) is the broken step: the autovacuum daemon inside the postgres container is not actually doing its job. However, this seems to indicate that it is in fact enabled:
postgres=# SHOW autovacuum;
autovacuum
------------
on
(1 row)
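Whether autovacuum has ever actually processed this particular table can also be checked from the statistics views; something like the following, run against the Sentry database, should show the last (auto)vacuum times and counters for nodestore_node:

-- Per-table vacuum activity; last_autovacuum stays NULL if autovacuum
-- has never run on the table.
SELECT relname, last_vacuum, last_autovacuum,
       vacuum_count, autovacuum_count,
       n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'nodestore_node';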
Someone hinted on a forum that if Postgres is under load it may not get a chance to autovacuum. I don't think our installation is especially high load, but if it is the case that Postgres never runs autovacuum under nominal load then maybe sentry self-hosted does need a way to explicitly trigger vacuum on some kind of schedule.
While it is true that vacuum full locks the table and uses additional disk space, and temporarily doubling our 42GB table during the vacuum might have caused some issues, it only took a couple of minutes to complete and it significantly reduced disk usage.

If vacuum full were run regularly, the table size should also hover at a much lower level (e.g. 3GB instead of 42GB) and should never get large enough for the temporary disk usage during the vacuum to cause significant issues.
We would be happy to accept a periodic 2 minute complete sentry outage to avoid unbounded disk usage, but it seems that sentry did not suffer a complete outage during the vacuum anyway. We're also willing to accept a partial outage or some amount of events that are missing or missing data.
We could also schedule vacuum to occur during a specified maintenance window to further reduce the impact to production systems.
I also tried the pg_repack option mentioned in the docs, but the current command did not run at all (it failed to install pg_repack inside the container), and an older version of the command that I found in a GitHub issue, which matched our version of postgres, also failed to install inside the container.
So I think a setting to schedule a vacuum full nodestore_node in postgres is the best way forward. It could be disabled by default, but it should be easy to enable via a setting that allows a specific schedule to be set.
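For reference, the statement such a scheduled setting would boil down to is just the following. This is only a sketch of the idea, not an existing self-hosted option, and it takes an exclusive lock on the table for the duration, so it belongs inside a maintenance window:

-- Rewrites the whole table and returns the freed space to the operating
-- system; nodestore_node is locked exclusively while this runs.
VACUUM (FULL, VERBOSE, ANALYZE) public.nodestore_node;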
Reading more about vacuum and vacuum full: the latter rewrites the whole table and returns disk space to the operating system, while the former frees up deleted rows for re-use but does not return disk space to the operating system. So if the sentry cleanup could run vacuum immediately after deleting rows from nodestore_node, and the cleanup frequency were high enough (or configurable) for the volume of new writes vs. deletions of old data, then there should be no more unbounded growth, without requiring a full table lock or any downtime. This should be easier to implement and more reliable than trying to configure the autovacuum daemon to do it for us?
https://www.postgresql.org/docs/12/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY
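Concretely, the non-locking step that a cleanup run could fire right after its DELETEs would be something like the plain variant below; it only marks the dead rows as reusable, so the file does not shrink, but it should stop growing:

-- Plain VACUUM: no exclusive lock, safe while Sentry keeps serving traffic.
-- Dead tuples left behind by the cleanup DELETEs become reusable space.
VACUUM (VERBOSE, ANALYZE) public.nodestore_node;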
Thanks for the deep dive, @mrmachine.
In our environment we have not yet upgraded to 23.4.0, as disk space is almost 80% full and we cannot upgrade our postgres. Therefore we have not yet tested whether upgrading to 14.5 has any effect on this matter.
We also encountered this issue, but it was more severe for us. Our nodestore_node table expanded to 3.6TB, so we had to focus on and resolve this problem. We tried to pull a new instance from backup data to verify how long a VACUUM FULL nodestore_node would take and found that it would take at least 8 hours. This means that if we perform this operation in production, Sentry would be down for 8 hours, which is unacceptable.
However, we understand that nodestore_node is really just abstract KV (key-value) storage, using id as the key to store and query event JSONs, each approximately 6~70 KB (and already compressed). So I naturally thought of using an object storage product like S3, which solves the nodestore_node storage problem very well: use the id as the file path, with each event JSON stored as a file.

Now I have managed to use S3 to store the messages in nodestore_node and have submitted a PR: https://github.com/getsentry/sentry/pull/76250. I hope others facing similar issues can provide some suggestions.
(Screenshots attached: PostgreSQL table size; OSS object files.)
Problem Statement
Over time the nodestore_node table gets bigger and bigger and currently there is no procedure to clean it up.

A forum comment explaining what nodestore_node is, by @untitaker: https://forum.sentry.io/t/postgres-nodestore-node-table-124gb/12753/3
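For anyone checking whether they are affected, the total on-disk footprint of the table (including TOAST data and indexes) can be read with a query like this:

-- Total on-disk size of nodestore_node, including TOAST and indexes.
SELECT pg_size_pretty(pg_total_relation_size('public.nodestore_node'));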
Solution Brainstorm
There was an idea suggested on the forum which worked for me, but I lost all event details. Something like this would work:

DELETE FROM public.nodestore_node WHERE "timestamp" < NOW() - INTERVAL '1 day';

Change 1 day according to your needs.

Maybe put this in a cron container which gets run every night. We should think about its performance issues though; this took a long time to execute on our instance, maybe because it wasn't run before, but I'm not sure.