Closed Ivansete-status closed 2 months ago
If the cleaning (vacuuming?) requires space, it should also be taken into account.
What do you mean sorry?
Some useful queries for this task:
-- sqlite
SELECT page_count * page_size as size FROM pragma_page_count(), pragma_page_size();
This will return the total DB size, including free pages, so it already accounts for space reclaimable by VACUUM.
-- postgresql
SELECT pg_size_pretty( pg_database_size('dbname') );
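As a quick illustration (a sketch, not nwaku code), the SQLite query above can be exercised from any client, for example with Python's built-in sqlite3 module:

```python
import sqlite3

# Open (or create) a database; ":memory:" is enough for a quick check.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, payload BLOB)")
conn.execute("INSERT INTO messages (payload) VALUES (zeroblob(4096))")
conn.commit()

# Same query as above: total size = page_count * page_size.
# Free pages are included, so VACUUM-able space is counted too.
(size,) = conn.execute(
    "SELECT page_count * page_size FROM pragma_page_count(), pragma_page_size()"
).fetchone()
print(size)  # total database size in bytes
```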
The algorithm that deletes and moves messages around requires memory and disk space. If the node is already at max disk capacity, could it fail?
Ah yes! You are absolutely right! The user might need to leave some space. It would be interesting to add this recommendation to the wakunode2 description parameter.
Added the changes, with the test case, and tested on the machine several times. Kindly go through it!
Weekly update:
Added the new retention policy based on DB size.
Users can provide the size such as --store-message-retention-policy=size:25GB
--store-message-retention-policy=size:25gb
--store-message-retention-policy=size:250MB
--store-message-retention-policy=size:250mb
Test case also added.
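For illustration, a minimal case-insensitive parser for such values could look like this (a sketch only: `parse_size` is a hypothetical helper, not nwaku's actual parser, and it assumes decimal multipliers, i.e. 1 GB = 10**9 bytes):

```python
def parse_size(value: str) -> int:
    """Parse strings like '25GB', '25gb', '250MB', '250mb' into bytes.

    Hypothetical sketch; the decimal multipliers used here may differ
    from nwaku's actual convention.
    """
    units = {"gb": 10**9, "mb": 10**6, "kb": 10**3}
    v = value.strip().lower()
    for suffix, mult in units.items():
        if v.endswith(suffix):
            return int(float(v[: -len(suffix)]) * mult)
    raise ValueError(f"unrecognized size: {value!r}")

print(parse_size("25GB"))   # 25000000000
print(parse_size("250mb"))  # 250000000
```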
Outdated messages/rows are deleted to stay within the size limit, with a 20% size reduction upon overflow.
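The deletion step described above can be sketched as follows (illustrative only: SQLite, a hypothetical `messages(id, timestamp)` schema, and a hypothetical `apply_size_retention` helper; nwaku's real implementation differs in detail):

```python
import sqlite3

def apply_size_retention(conn, max_bytes, reduction=0.20):
    """Hypothetical sketch of the policy above: when the database exceeds
    max_bytes, delete the oldest ~20% of rows so the policy doesn't fire
    again immediately. Returns the number of rows deleted."""
    (size,) = conn.execute(
        "SELECT page_count * page_size FROM pragma_page_count(), pragma_page_size()"
    ).fetchone()
    if size <= max_bytes:
        return 0
    (total,) = conn.execute("SELECT count(*) FROM messages").fetchone()
    to_delete = int(total * reduction)
    conn.execute(
        "DELETE FROM messages WHERE id IN "
        "(SELECT id FROM messages ORDER BY timestamp ASC LIMIT ?)",
        (to_delete,),
    )
    conn.commit()
    return to_delete
```

Note that after the DELETE the freed pages stay in the file until they are vacuumed, which is the crux of the Postgres discussion later in this thread.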
Weekly update:
In review: the database bug to delete limited messages/rows
Upcoming/working: updated retention policy + test + missing test on timestamp-based retention policy
Undergoing: MUID concept on message level
Closing this issue, seems feature complete. Also made changes related to Infra-level testing.
Hey @ABresting! Although the implementation so far is awesome, I think we cannot close this issue yet because we need to double-check that it works correctly on Postgres.
Can be completely de-prioritized per nwaku PM meeting 2023-11-07.
Does not seem to work for PostgreSQL:
nwaku_1 | DBG 2023-11-28 14:11:03.624+00:00 executing message retention policy topics="waku node" tid=1 file=waku_node.nim:753
nwaku_1 | ERR 2023-11-28 14:11:03.662+00:00 exception in getInt, parseInt tid=1 file=postgres_driver.nim:424 error="invalid integer: "
export EXTRA_ARGS="--store-message-retention-policy=size:30GB"
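The empty-string error above typically means the size query returned no usable value before being handed to `parseInt`. A defensive pattern, sketched here in Python rather than the actual Nim driver code, is to validate first; note also that `pg_size_pretty` returns a human-readable string such as "29 GB", so a numeric code path should query `pg_database_size` directly:

```python
def parse_db_size(raw) -> int:
    """Hypothetical guard: convert a raw query result to an integer byte
    count, rejecting the empty/NULL result that would otherwise surface
    as 'invalid integer: ' at parse time."""
    if raw is None or str(raw).strip() == "":
        raise ValueError("size query returned no value; is the database reachable?")
    return int(str(raw).strip())

print(parse_db_size("30000000000"))  # 30000000000
```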
Justification of the de-prioritization?
The current implementation is correct from the nwaku point of view, but we need an additional action to effectively reduce the size occupied by the database.
The database size is not properly updated when deleting rows, due to how Postgres works, i.e. the rows are only marked as "deletable". Therefore, we have these options:

1. VACUUM. This doesn't give space back to the OS. Instead, the freed space is reserved to be used again by Postgres.
2. VACUUM FULL. This frees the space stored by the database and gives it back to the OS, but has a very big drawback: it blocks the whole database for writing and reading.
3. pg_repack. This has to be run manually and the Postgres database needs to have this extension installed.

@fryorcraken @ABresting @apentori @jakubgs @yakimant - what is the best approach from your point of view to maintain the Postgres database size under a certain level?
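The "deleted rows don't shrink the file" behaviour is easy to reproduce in SQLite, which serves as a rough analogue here (with the caveat that SQLite's VACUUM does return space to the OS, whereas Postgres's plain VACUUM keeps it reserved for reuse):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, payload BLOB)")
conn.executemany(
    "INSERT INTO messages (payload) VALUES (zeroblob(?))",
    [(4096,)] * 500,
)
conn.commit()
before = os.path.getsize(path)

conn.execute("DELETE FROM messages")
conn.commit()
after_delete = os.path.getsize(path)  # pages go to the freelist; file size unchanged

conn.execute("VACUUM")
after_vacuum = os.path.getsize(path)  # only now does the file actually shrink

print(before, after_delete, after_vacuum)
```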
When using (1) I guess the trick is to ensure it is performed before the database reaches maximum size, right? So we may need to operate with some buffer where we delete and VACUUM before the max size is reached. E.g. delete 20% of messages and run VACUUM when the database reaches 95% of max size.
The 20% needs to be selected so that the VACUUM works efficiently and also so that we don't end up VACUUMing every 10 s.
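The buffer arithmetic above can be made concrete with a tiny sketch (hypothetical helper, using the 95% trigger and 20% deletion figures from the comment):

```python
def retention_thresholds(max_bytes, trigger=0.95, delete_frac=0.20):
    """Hypothetical helper: given the configured maximum size, return the
    size at which to act (95% of the limit by default) and the fraction
    of messages to delete (20%) before running VACUUM."""
    return round(max_bytes * trigger), delete_frac

# e.g. for --store-message-retention-policy=size:30GB
trigger_size, frac = retention_thresholds(30 * 10**9)
print(trigger_size)  # 28500000000
```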
I believe that initially it was designed to work like this: when the retention limit is reached, remove 20% of the rows.
As you suggest, we can now assume that if the user has given a size limit X, we can calculate the new retention limit from it (95% of the input) just to be on the safe side. BUT the issue before was with SQLite; maybe with this PR it has been resolved.
Hey there!
imo, approach 1 - VACUUM - would be the best option, but we need to adapt the code because right now it would apply the database reduction each time the retention policy is executed. Let me elaborate with an example:
a. the database has 100 rows and reaches the size limit, so the retention policy kicks in
b. 20% of the rows are deleted, but the size on disk after 1.b is not smaller than in 1.a (the VACUUM operation doesn't return the space to the OS)
c. the database has 80 rows ( cc @fryorcraken @ABresting )
@Ivansete-status, I need to read more to understand PG vacuuming. But from your description, I like approach 1.
BTW, did you have a look at autovacuum
?
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
Thanks for the comments and suggestions! At first I was kind of reluctant to have an automatic process, but it is something worth considering too. Apparently it doesn't cause too many read/write blocks: https://stackoverflow.com/questions/73655953/is-postgres-autovacuum-process-blocking-reading-and-writing-the-table#:~:text=It%20will%20take%20light%2Dweight,block%20ordinary%20reads%20and%20writes.
I don't really see the point of an automatic VACUUM FULL if that space is going to get filled again shortly.
We don't care if filesystem space is used, we care that we don't hit the filesystem size limit and crash and burn.
Best article on SQLite vacuuming I've seen: https://blogs.gnome.org/jnelson/2015/01/06/sqlite-vacuum-and-auto_vacuum/ Posting for history.
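On the SQLite side, the auto_vacuum behaviour discussed in that article can be exercised directly (a sketch; note that auto_vacuum must be set before the first table is created, otherwise a manual VACUUM is required for it to take effect):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "autovac.db")
conn = sqlite3.connect(path)
# Must be set on the fresh, empty database, before any table exists.
conn.execute("PRAGMA auto_vacuum = FULL")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, payload BLOB)")
conn.executemany(
    "INSERT INTO messages (payload) VALUES (zeroblob(?))",
    [(4096,)] * 200,
)
conn.commit()
before = os.path.getsize(path)

conn.execute("DELETE FROM messages")
conn.commit()
after = os.path.getsize(path)  # with auto_vacuum=FULL the file shrinks on commit

print(before, after)
```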
Note we are rescoping this under https://github.com/waku-org/pm/issues/119 to match new process.
I'm closing this issue as per the work made in https://github.com/waku-org/nwaku/pull/2506
Background
We came across an issue in the HK-prod node where the SQLite database reached 39GB. We need a mechanism to apply a retention policy so that the database doesn't grow beyond a certain number of GB or MB.
Acceptance criteria
The Waku node with Store mounted should be able to limit the maximum space occupied by the underlying database. This applies to both SQLite and Postgres.
The new retention policy should support the following formats: