Open MayeulC opened 1 year ago
This suggests that level 3 is probably good enough
According to my personal benchmarks, it is. Higher levels result in a large speed loss compared to a minimal compression gain, and lower levels yield a minimal speed gain for a noticeable compression loss. Even on a Raspberry Pi, not a powerful computer.
and compressing barely adds any time to the operation, at least on my (powerful, relatively slow disk for the backup partition) machine.
This will, most of the time, speed up the backup, as you will write massively less data with a small CPU overhead.
Is compression compatible with borg incremental backups? What do you think @zamentur?
It's probably compatible (partially de-duplicable) if compressing with `--rsyncable`, as offered by zstd and gzip: https://github.com/facebook/zstd/issues/1155
I'm not sure about borg's deduplication algorithm, but I imagine it works similarly to rsync, chunking files based on a rolling hash. This should work with `--rsyncable`.
Edit: yes, it should work. borg uses a rolling hash (buzhash/cyclic polynomial), so it creates content-based chunks. If large portions of the input data do not change, `zstd --rsyncable` will output the same compressed data, which can then be de-duplicated.
Edit 2: example use-case with borg and `gzip --rsyncable`: https://gitlab.archlinux.org/archlinux/infrastructure/-/merge_requests/429
Is compression compatible with borg incremental backups?
What do you mean by "compatible"? The file will be backed up, either compressed or not. But it will probably be recognized as a different file, hence not de-duplicated. The backup works, but you lose storage.
:up: :slightly_smiling_face:
But if it will probably be recognized as a different file, hence not de-duplicated
Borg works per chunk (a part of a file), not per file. Creating an archive with `zstd --rsyncable` should produce as many identical chunks as possible, so it will get deduplicated (and can be rsynced faster).
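A minimal sketch of the whole idea (hypothetical names; assumes a borg repository is already initialized):

```sh
# Compress the dump with rsyncable framing: unchanged stretches of SQL yield
# byte-identical compressed blocks from one run to the next.
pg_dump matrix_synapse | zstd -3 --rsyncable > dump.sql.zst

# borg's content-defined chunking then deduplicates those stable blocks
# against previous archives in the repository.
borg create /path/to/repo::synapse-{now} dump.sql.zst
```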
With the new synapse upgrade, I hit this again (38GB). Since the upgrade failed, I am now 3 hours in, and counting.
Very nice suggestion (db backup compression). We should have that for each app, I guess.
So I added a ticket for that in the YunoHost issue tracker, because it's a general use case for all backups, and it could be very useful.
On my side, I found another way to deal with it: reduce the size of the db ^^ https://levans.fr/shrink-synapse-database.html
Well, IMHO, all the backups (the parts that could be significantly compressed) should be compressed by default, or at least we should have that possibility… well, we discussed that on the forum.
But yeah, lacking that feature, compressing the db export just makes a lot of sense, as it's often very easy to compress and the storage gain can be very significant. Integrating this into the core would be a very nice improvement (with the matching restore functionality for compressed backups too 😇).
Exactly. My database is 200 GB, so I can never even update Synapse from the YNH panel, since it would take an hour or so. I do it via SSH with the option to skip the backups, simply because the database is enormous. And when Borg backs up, the dump takes 200 GB of space on my server before uploading to Borg, which is again massive. Synapse is the only YNH app we have that cannot properly be managed via the YNH panel.
Out of curiosity, how did it grow to such an enormous size? Are files and media included in the DB?
Out of curiosity, how did it grow to such an enormous size? Are files and media included in the DB?
I store the media in a separate folder that's just 14 GB in size. We have around 500 users. Maybe that's why!? Is there a way to see what takes up so much space?
EDIT: I see this https://matrix-org.github.io/synapse/latest/usage/administration/database_maintenance_tools.html and this https://levans.fr/shrink-synapse-database.html - I wonder how to get the access token with the YNH install. Seems like a bit of work, but I will try it. If anyone knows a simpler solution, like a script that does these steps automatically, please let me know.
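For what it's worth, an access token can be obtained by logging in with an admin account through the standard client API (a sketch; the server URL, user name and password are placeholders, and the account must already be a server admin for the admin endpoints to accept the token):

```sh
# Log in with username/password; the JSON response contains "access_token".
curl -s -X POST "https://matrix.example.org/_matrix/client/r0/login" \
  -d '{"type": "m.login.password", "user": "admin", "password": "secret"}'
```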
Uhh
Table Name | Total Size | Index Size | Table Size |
---|---|---|---|
state_groups_state | 262 GB | 96 GB | 166 GB |
event_json | 10 GB | 727 MB | 9794 MB |
events | 5928 MB | 4052 MB | 1876 MB |
event_edges | 3376 MB | 2104 MB | 1272 MB |
device_lists_changes_in_room | 2803 MB | 1607 MB | 1196 MB |
event_auth | 2773 MB | 1029 MB | 1744 MB |
received_transactions | 2636 MB | 1391 MB | 1245 MB |
event_search | 2113 MB | 877 MB | 1236 MB |
room_memberships | 1613 MB | 934 MB | 679 MB |
event_to_state_groups | 1230 MB | 722 MB | 508 MB |
current_state_delta_stream | 892 MB | 116 MB | 776 MB |
event_auth_chain_links | 850 MB | 346 MB | 504 MB |
receipts_linearized | 846 MB | 785 MB | 61 MB |
state_events | 805 MB | 294 MB | 511 MB |
event_auth_chains | 755 MB | 447 MB | 308 MB |
device_inbox | 606 MB | 118 MB | 488 MB |
event_relations | 554 MB | 340 MB | 214 MB |
device_lists_stream | 487 MB | 415 MB | 72 MB |
current_state_events | 471 MB | 334 MB | 137 MB |
presence_stream | 441 MB | 418 MB | 23 MB |
cache_invalidation_stream_by_instance | 411 MB | 366 MB | 45 MB |
stream_ordering_to_exterm | 384 MB | 312 MB | 72 MB |
device_lists_remote_cache | 279 MB | 32 MB | 247 MB |
state_groups | 264 MB | 110 MB | 155 MB |
e2e_cross_signing_keys | 201 MB | 23 MB | 179 MB |
users_in_public_rooms | 185 MB | 127 MB | 58 MB |
user_directory_search | 134 MB | 76 MB | 58 MB |
state_group_edges | 127 MB | 68 MB | 59 MB |
receipts_graph | 110 MB | 55 MB | 55 MB |
user_directory | 103 MB | 61 MB | 41 MB |
event_push_actions | 39 MB | 35 MB | 3744 kB |
device_lists_remote_pending | 37 MB | 21 MB | 16 MB |
device_lists_outbound_pokes | 33 MB | 20 MB | 12 MB |
redactions | 29 MB | 17 MB | 12 MB |
e2e_room_keys | 29 MB | 4616 kB | 24 MB |
remote_media_cache_thumbnails | 24 MB | 11 MB | 12 MB |
users_who_share_private_rooms | 22 MB | 17 MB | 4888 kB |
server_keys_json | 18 MB | 2424 kB | 16 MB |
e2e_one_time_keys_json | 14 MB | 3864 kB | 10 MB |
destination_rooms | 10088 kB | 7416 kB | 2672 kB |
event_backward_extremities | 9168 kB | 6464 kB | 2704 kB |
device_lists_remote_extremeties | 8872 kB | 3496 kB | 5376 kB |
remote_media_cache | 6832 kB | 2104 kB | 4728 kB |
server_signature_keys | 5392 kB | 1648 kB | 3744 kB |
event_failed_pull_attempts | 5336 kB | 1952 kB | 3384 kB |
event_forward_extremities | 5080 kB | 4880 kB | 200 kB |
user_daily_visits | 4784 kB | 1720 kB | 3064 kB |
local_media_repository | 4168 kB | 2904 kB | 1264 kB |
federation_inbound_events_staging | 4048 kB | 3768 kB | 280 kB |
local_media_repository_thumbnails | 3648 kB | 2520 kB | 1128 kB |
ex_outlier_stream | 3504 kB | 768 kB | 2736 kB |
user_ips | 3056 kB | 2448 kB | 608 kB |
device_lists_outbound_last_success | 2848 kB | 1160 kB | 1688 kB |
threads | 1776 kB | 1008 kB | 768 kB |
destinations | 1736 kB | 568 kB | 1168 kB |
room_account_data | 992 kB | 736 kB | 256 kB |
device_federation_inbox | 992 kB | 392 kB | 600 kB |
user_directory_stale_remote_users | 960 kB | 648 kB | 312 kB |
event_txn_id | 896 kB | 792 kB | 104 kB |
account_data | 680 kB | 352 kB | 328 kB |
local_current_membership | 640 kB | 320 kB | 320 kB |
devices | 616 kB | 168 kB | 448 kB |
e2e_cross_signing_signatures | 472 kB | 136 kB | 336 kB |
stream_positions | 472 kB | 40 kB | 432 kB |
access_tokens | 424 kB | 224 kB | 200 kB |
event_txn_id_device_id | 416 kB | 312 kB | 104 kB |
event_auth_chain_to_calculate | 416 kB | 288 kB | 128 kB |
device_lists_remote_resync | 408 kB | 208 kB | 200 kB |
event_push_summary | 376 kB | 152 kB | 224 kB |
e2e_device_keys_json | 376 kB | 56 kB | 320 kB |
push_rules_stream | 304 kB | 104 kB | 200 kB |
event_push_actions_staging | 256 kB | 232 kB | 24 kB |
e2e_fallback_keys_json | 256 kB | 64 kB | 192 kB |
profiles | 248 kB | 120 kB | 128 kB |
room_stats_state | 248 kB | 64 kB | 184 kB |
rooms | 232 kB | 120 kB | 112 kB |
room_stats_current | 224 kB | 64 kB | 160 kB |
local_media_repository_url_cache | 224 kB | 152 kB | 72 kB |
e2e_room_keys_versions | 216 kB | 16 kB | 200 kB |
users | 216 kB | 88 kB | 128 kB |
user_filters | 200 kB | 88 kB | 112 kB |
worker_read_write_locks | 192 kB | 136 kB | 56 kB |
worker_read_write_locks_mode | 184 kB | 128 kB | 56 kB |
pushers | 184 kB | 64 kB | 120 kB |
partial_state_rooms_servers | 184 kB | 136 kB | 48 kB |
room_depth | 176 kB | 64 kB | 112 kB |
open_id_tokens | 176 kB | 88 kB | 88 kB |
device_federation_outbox | 176 kB | 32 kB | 144 kB |
user_stats_current | 168 kB | 56 kB | 112 kB |
rejections | 168 kB | 56 kB | 112 kB |
push_rules_enable | 152 kB | 80 kB | 72 kB |
worker_locks | 152 kB | 104 kB | 48 kB |
push_rules | 152 kB | 72 kB | 80 kB |
ui_auth_sessions_ips | 128 kB | 80 kB | 48 kB |
ui_auth_sessions | 120 kB | 56 kB | 64 kB |
user_threepids | 112 kB | 48 kB | 64 kB |
ui_auth_sessions_credentials | 104 kB | 56 kB | 48 kB |
I think we need this tool to compress the Synapse tables: https://github.com/matrix-org/rust-synapse-compress-state - even the devs recognized this is an issue and made this tool. Any way to have it packaged for YNH?
Ok after days of optimization I managed this:
Relation | Total Size |
---|---|
public.state_groups_state | 76 GB |
public.event_json | 10 GB |
public.events | 4327 MB |
public.event_edges | 2910 MB |
public.device_lists_changes_in_room | 2482 MB |
public.event_auth | 2064 MB |
public.event_search | 1778 MB |
public.room_memberships | 1171 MB |
public.event_to_state_groups | 971 MB |
public.current_state_delta_stream | 822 MB |
public.event_auth_chain_links | 761 MB |
public.state_events | 730 MB |
public.received_transactions | 674 MB |
public.event_auth_chains | 632 MB |
public.event_relations | 424 MB |
public.current_state_events | 297 MB |
public.device_lists_remote_cache | 220 MB |
public.device_inbox | 204 MB |
public.e2e_cross_signing_keys | 196 MB |
public.state_groups | 189 MB |
Still a lot! But well, 3 times smaller. What did I do?
First, compress using https://github.com/matrix-org/rust-synapse-compress-state#building . You need to build that package and run it like this:
```sh
nohup synapse_auto_compressor -p postgresql://matrix_synapse:db-password-here@localhost/matrix_synapse -c 500 -n 10000 >/var/log/matrix-opt.log 2>&1 &
```
(`nohup … &` keeps it running in the background.)
Then I had to REINDEX:
PGPASSWORD="db-password-here" nohup psql -U matrix_synapse -d matrix_synapse -c "REINDEX (VERBOSE) DATABASE matrix_synapse;" >/var/log/matrix-reindex.log 2>&1 &
And then to VACUUM:
PGPASSWORD="db-password-here" nohup psql -U matrix_synapse -d matrix_synapse -c "VACUUM FULL VERBOSE ANALYZE;" >/var/log/matrix-vacuum.log 2>&1 &
The compression took around 3 days of nonstop work and put a huge toll on the CPU: a 10-core CPU at 100%. The REINDEX and VACUUM steps take around 3-4 hours each, and they need a lot of free disk space: the database is duplicated on the server before the reindex and vacuum, then the copy is deleted.
I do not know if I can do more to reduce the size. But even at this size, it is not easy to manage via the YNH panel... too big to back up, too big to restore.
Wow, thanks a lot for documenting this!
Did you stop the synapse service meanwhile?
You have to stop synapse while you do the REINDEX and VACUUM FULL, but not during the compression. VACUUM FULL is just there to give the disk space back to the server. Basically, the compression can cut your database from, say, 200 GB to 100 GB, but you won't see that on disk, because PostgreSQL keeps the "empty" space reserved for the database and adds new data into it. In other words, your database won't grow anymore, but the disk will still show 200 GB used by the synapse database...
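To see what the database actually occupies on disk, something like this works (a sketch; the database name matches the one used in the commands above):

```sh
# Ask PostgreSQL how much disk space the database currently takes up,
# including the reserved "empty" space mentioned above.
sudo -u postgres psql -c "SELECT pg_size_pretty(pg_database_size('matrix_synapse'));"
```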
It's clear for me, thanks :)
Yeah, I used a similar tool a while ago, I also used to have a DB of more than 100G, with myself as the only user 😇
I am uncertain whether compression would help a lot with the backup/restore speed, it may be 30% faster but would still likely take hours. However, it would help quite a bit with disk space.
psql -U matrix_synapse
You can also do `sudo -u postgres psql` to avoid having to deal with authentication.
It's also possible to drop old room history from the database, for rooms not originating on one's server. The assumption is that synapse will fetch it again from distant servers if needed. But we stray from the topic :)
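Synapse has an admin API for exactly that (a sketch; the server URL, room ID and token are placeholders, and the token must belong to a server admin):

```sh
# Purge room history older than the given timestamp (in milliseconds).
# delete_local_events=false keeps events sent by local users.
curl -X POST 'https://matrix.example.org/_synapse/admin/v1/purge_history/!roomid:example.org' \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -d '{"delete_local_events": false, "purge_up_to_ts": 1650000000000}'
```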
it may be 30% faster but would still likely take hours. However, it would help quite a bit with disk space.
30% of a couple of hours is already a big deal!
Rather than compressing the database dump, why not compress the database itself?
https://wiki.chatons.org/doku.php/services/messagerie_instantanee/matrix#nettoyage_d_un_serveur_synapse https://jacksonchen666.com/posts/2022-12-03/14-33-00/
The other advantage would be that the synapse_ynh package could kind of "guarantee" that the message and media retention policy is actually applied, by actually cleaning, purging and freeing space in the DB.
Wouldn't that hurt performance?
I think we need this tool to compress the Synapse tables: https://github.com/matrix-org/rust-synapse-compress-state - even the devs recognized this is an issue and made this tool. Any way to have it packaged for YNH?
Maybe the easiest way would be to just put the executable in the installation path at install time, and upgrade this tool at the same time as synapse when a new version is released?
All people having issues with a huge database should check this: https://github.com/YunoHost-Apps/synapse_ynh/issues/478
Describe the bug
My synapse database is big. Database dumps can take more than 100 GB. My latest one (single-user server) is at 32.8 GB. Writing and reading that file to disk takes a long time, not to mention the wasted disk space.
Backups used to be compressed, but back then they were first tar-ed, then compressed. Both stages took a while.
Suggested solution
Pipe the postgresql dump to a (fast, multithreaded) compressor:
https://github.com/YunoHost-Apps/synapse_ynh/blob/c80e8fc715d5eb783ddf95752e91eeb9cb0baa45/scripts/backup#L79
Change this to
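Something like this (a sketch; the exact helper invocation and variable names in scripts/backup are assumptions):

```sh
# Pipe the dump straight into zstd instead of writing the raw SQL to disk.
ynh_psql_dump_db --database="$synapse_db_name" | zstd -3 --rsyncable > "${YNH_CWD}/dump.sql.zst"
```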
(Ideally this would combine nice and ionice, like that:)
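(Same assumptions as above; `nice`/`ionice` keep the compressor at the lowest CPU and I/O priority:)

```sh
# Run the compressor at minimal CPU/IO priority so the backup stays unobtrusive.
ynh_psql_dump_db --database="$synapse_db_name" | nice -n 19 ionice -c 3 zstd -3 --rsyncable > "${YNH_CWD}/dump.sql.zst"
```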
On the restore side:
https://github.com/YunoHost-Apps/synapse_ynh/blob/c80e8fc715d5eb783ddf95752e91eeb9cb0baa45/scripts/restore#L121
Hmm, not straightforward here. Either make a fifo with mkfifo and pass it as the path, or change the helper / introduce a helper without the redirection: https://github.com/YunoHost/yunohost/blob/4b9e26b974b0cc8f7aa44fd773537508316b8ba6/helpers/postgresql#L78-L79
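The fifo variant could look like this (a sketch; the helper name and file paths are assumptions):

```sh
# Decompress through a named pipe so the existing helper still reads what
# looks like a plain SQL file path.
mkfifo ./dump.sql
zstd -dc ./dump.sql.zst > ./dump.sql &
ynh_psql_execute_file_as_root --file=./dump.sql --database="$db_name"
rm ./dump.sql
```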
Expected gains
zstd level 3 gets an old dump from 17 GB down to 3.8 GB. Level 7 only gets it down to 3.5 GB. Level 1 (minimum, fastest) reaches 4.2 GB.
Both archive creation and dump time should be faster, as less data needs to be written to disk, especially on hard disks.
Sample runs
These runs were collected with some I/O in the background (synapse restoration in progress).
This suggests that level 3 is probably good enough, and compressing barely adds any time to the operation, at least on my (powerful, relatively slow disk for the backup partition) machine.