YunoHost-Apps / synapse_ynh

Matrix server (synapse) package for YunoHost
https://matrix.org/
GNU General Public License v3.0

Please compress the database dump #354

Open MayeulC opened 1 year ago

MayeulC commented 1 year ago

Describe the bug

My synapse database is big. Database dumps can take more than 100 GB; my latest one (single-user server) is at 32.8 GB. Writing that file to disk and reading it back takes a long time, not to mention the wasted disk space.

Backups used to be compressed, but back then they were first tar-ed, then compressed. Both stages took a while.

Suggested solution

Pipe the postgresql dump to a (fast, multithread) compressor:

https://github.com/YunoHost-Apps/synapse_ynh/blob/c80e8fc715d5eb783ddf95752e91eeb9cb0baa45/scripts/backup#L79

Change this to

ynh_psql_dump_db --database="$synapse_db_name" | zstd -3 -T0 > "${YNH_CWD}/dump.sql"

(Ideally this would combine nice and ionice, like so:)

ynh_psql_dump_db --database="$synapse_db_name" | ionice -n 6 nice zstd -3 -T0 -o "${YNH_CWD}/dump.sql"

On the restore side:

https://github.com/YunoHost-Apps/synapse_ynh/blob/c80e8fc715d5eb783ddf95752e91eeb9cb0baa45/scripts/restore#L121

Hmm, not straightforward here. Either make a fifo with mkfifo and pass this as the path, or change the helper/introduce a helper without the redirection there: https://github.com/YunoHost/yunohost/blob/4b9e26b974b0cc8f7aa44fd773537508316b8ba6/helpers/postgresql#L78-L79
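
For illustration, a rough sketch of the fifo option (hypothetical and untested; it assumes the restore script feeds the dump through a helper such as ynh_psql_execute_file_as_root, and that the compressed dump keeps the dump.sql name used above):

# Hypothetical sketch: decompress through a fifo so the existing helper keeps reading a "file"
fifo="${YNH_CWD}/dump.fifo"
mkfifo "$fifo"
zstd -d -c "${YNH_CWD}/dump.sql" > "$fifo" &   # decompress into the fifo in the background
ynh_psql_execute_file_as_root --file="$fifo" --database="$synapse_db_name"
wait                                           # make sure zstd has finished
rm "$fifo"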

Expected gains

zstd level 3 gets an old dump from 17GB to 3.8GB. Level 7 only gets this down to 3.5GB. Level 1 (minimum, fastest) reaches 4.2 GB.

Both archive creation and dump time should be faster as less data needs to be written to disk, especially for hard disks.

Sample runs

These runs were collected with some I/O in the background (synapse restoration in progress).

# time cat tmp/apps/synapse/backup/dump.sql |zstd -T0 -1 > test1.zst
cat tmp/apps/synapse/backup/dump.sql  0,20s user 20,29s system 14% cpu 2:24,87 total
zstd -T0 -1 > test1.zst  110,79s user 10,31s system 83% cpu 2:24,89 total
# time cat tmp/apps/synapse/backup/dump.sql |zstd -T0 -3 > test3.zst
cat tmp/apps/synapse/backup/dump.sql  0,25s user 15,99s system 10% cpu 2:32,51 total
zstd -T0 -3 > test3.zst  120,20s user 7,80s system 83% cpu 2:32,58 total
# time cat tmp/apps/synapse/backup/dump.sql |zstd -T0 -7 > test7.zst
cat tmp/apps/synapse/backup/dump.sql  0,17s user 16,89s system 8% cpu 3:16,85 total
zstd -T0 -7 > test7.zst  630,73s user 7,80s system 324% cpu 3:17,04 total
# time cat tmp/apps/synapse/backup/dump.sql > test0.sql
cat tmp/apps/synapse/backup/dump.sql > test0.sql  0,17s user 21,84s system 10% cpu 3:33,71 total

This suggests that level 3 is probably good enough, and that compressing barely adds any time to the operation, at least on my machine (powerful CPU, but a relatively slow disk on the backup partition).

lapineige commented 1 year ago

This suggests that level 3 is probably good enough

According to my personal benchmarks, it is. Higher levels cost a lot of speed for a minimal compression gain, and lower levels bring only minimal speed gains for noticeably worse compression. That holds even on a Raspberry Pi, which is not a powerful computer.

and compressing barely adds any time to the operation, at least on my (powerful, relatively slow disk for the backup partition) machine.

Most of the time this will actually speed up the backup, as you write massively less data for a small CPU overhead.

Gredin67 commented 1 year ago

Is compression compatible with borg incremental backups? What do you think @zamentur ?

MayeulC commented 1 year ago

It's probably compatible (partially de-duplicable) if compressing with --rsyncable, as offered by zstd and gzip: https://github.com/facebook/zstd/issues/1155

I'm not sure about borg's deduplication algorithm, but I imagine it works similarly to rsync, chunking files based on a rolling hash. This should work with --rsyncable.

Edit: yes, it should work. borg uses a rolling hash (buzhash/cyclic polynomial), so it creates content-based chunks. If large portions of input data do not change, zstd --rsyncable will output the same compressed data, which can then be de-duplicated.

Edit 2: example use-case with borg and gzip --rsyncable: https://gitlab.archlinux.org/archlinux/infrastructure/-/merge_requests/429
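
For reference, plugging that into the pipeline suggested above would only take one extra flag; same sketch and same hypothetical file name as before:

ynh_psql_dump_db --database="$synapse_db_name" | zstd -3 -T0 --rsyncable > "${YNH_CWD}/dump.sql"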

lapineige commented 1 year ago

Is compression compatible with borg incremental backups?

What do you mean by "compatible" ? The file will be backed up, either compressed or not. But it will probably be recognized as a different file, hence not de-duplicated. The backup works, but you lose storage.

lapineige commented 1 year ago

:up: :slightly_smiling_face:

MayeulC commented 1 year ago

But if it will probably be recognized as a different file, hence not de-duplicated

Borg works per chunk (part of a file), not per file. Creating an archive with zstd --rsyncable should produce as many identical chunks as possible, so it will get deduplicated (and can be rsynced faster).

With the new synapse upgrade, I hit this again (38GB). Since the upgrade failed, I am now 3 hours in, and counting.

zamentur commented 1 year ago

Very nice suggestion (db backup compression). We should have that for every app, I guess.

zamentur commented 1 year ago

So I added a ticket for that in the YunoHost issue tracker, because it's a general use case for all backups and it could be very useful.

On my side, I found another way to deal with it: reduce the size of the db ^^ https://levans.fr/shrink-synapse-database.html

lapineige commented 1 year ago

Well imho, all the backups (parts that could be significantly compressed) should be compressed by default, or at least we should have that possibility… well, we discussed that on the forum.

But yeah, lacking that feature, compressing the db export just makes a lot of sense, as it's often very easy to compress and the storage gain can be very significant. Integrating this into the core would be a very nice improvement (restoring the compressed backup functionality too 😇).

tio-trom commented 1 year ago

Exactly. My database is 200 GB, so I can never even update Synapse from the YNH panel since it would take an hour or so. I do it via SSH with the option to skip the backups, simply because the database is enormous. And when Borg backs up, it takes 200 GB of space on my server before uploading to Borg, which is again massive. Synapse is the only YNH app we have that cannot be dealt with properly via the YNH panel.

lapineige commented 1 year ago

Out of curiosity, how did it grow to such an enormous value ? Are files and media included in the DB ?

tio-trom commented 1 year ago

Out of curiosity, how did it grow to such an enormous value ? Are files and media included in the DB ?

I store the media in a separate folder that's just 14GB in size. We have around 500 users. Maybe that's why!? Is there a way to see what takes up so much?

EDIT: I see this https://matrix-org.github.io/synapse/latest/usage/administration/database_maintenance_tools.html and this https://levans.fr/shrink-synapse-database.html - I wonder how to get the access token with the YNH install. Seems like a bit of work but I will try it. If anyone knows a simpler solution, like a script that does these steps automatically, please let me know.

Uhh

Table Name Total Index Table
state_groups_state 262 GB 96 GB 166 GB
event_json 10 GB 727 MB 9794 MB
events 5928 MB 4052 MB 1876 MB
event_edges 3376 MB 2104 MB 1272 MB
device_lists_changes_in_room 2803 MB 1607 MB 1196 MB
event_auth 2773 MB 1029 MB 1744 MB
received_transactions 2636 MB 1391 MB 1245 MB
event_search 2113 MB 877 MB 1236 MB
room_memberships 1613 MB 934 MB 679 MB
event_to_state_groups 1230 MB 722 MB 508 MB
current_state_delta_stream 892 MB 116 MB 776 MB
event_auth_chain_links 850 MB 346 MB 504 MB
receipts_linearized 846 MB 785 MB 61 MB
state_events 805 MB 294 MB 511 MB
event_auth_chains 755 MB 447 MB 308 MB
device_inbox 606 MB 118 MB 488 MB
event_relations 554 MB 340 MB 214 MB
device_lists_stream 487 MB 415 MB 72 MB
current_state_events 471 MB 334 MB 137 MB
presence_stream 441 MB 418 MB 23 MB
cache_invalidation_stream_by_instance 411 MB 366 MB 45 MB
stream_ordering_to_exterm 384 MB 312 MB 72 MB
device_lists_remote_cache 279 MB 32 MB 247 MB
state_groups 264 MB 110 MB 155 MB
e2e_cross_signing_keys 201 MB 23 MB 179 MB
users_in_public_rooms 185 MB 127 MB 58 MB
user_directory_search 134 MB 76 MB 58 MB
state_group_edges 127 MB 68 MB 59 MB
receipts_graph 110 MB 55 MB 55 MB
user_directory 103 MB 61 MB 41 MB
event_push_actions 39 MB 35 MB 3744 kB
device_lists_remote_pending 37 MB 21 MB 16 MB
device_lists_outbound_pokes 33 MB 20 MB 12 MB
redactions 29 MB 17 MB 12 MB
e2e_room_keys 29 MB 4616 kB 24 MB
remote_media_cache_thumbnails 24 MB 11 MB 12 MB
users_who_share_private_rooms 22 MB 17 MB 4888 kB
server_keys_json 18 MB 2424 kB 16 MB
e2e_one_time_keys_json 14 MB 3864 kB 10 MB
destination_rooms 10088 kB 7416 kB 2672 kB
event_backward_extremities 9168 kB 6464 kB 2704 kB
device_lists_remote_extremeties 8872 kB 3496 kB 5376 kB
remote_media_cache 6832 kB 2104 kB 4728 kB
server_signature_keys 5392 kB 1648 kB 3744 kB
event_failed_pull_attempts 5336 kB 1952 kB 3384 kB
event_forward_extremities 5080 kB 4880 kB 200 kB
user_daily_visits 4784 kB 1720 kB 3064 kB
local_media_repository 4168 kB 2904 kB 1264 kB
federation_inbound_events_staging 4048 kB 3768 kB 280 kB
local_media_repository_thumbnails 3648 kB 2520 kB 1128 kB
ex_outlier_stream 3504 kB 768 kB 2736 kB
user_ips 3056 kB 2448 kB 608 kB
device_lists_outbound_last_success 2848 kB 1160 kB 1688 kB
threads 1776 kB 1008 kB 768 kB
destinations 1736 kB 568 kB 1168 kB
room_account_data 992 kB 736 kB 256 kB
device_federation_inbox 992 kB 392 kB 600 kB
user_directory_stale_remote_users 960 kB 648 kB 312 kB
event_txn_id 896 kB 792 kB 104 kB
account_data 680 kB 352 kB 328 kB
local_current_membership 640 kB 320 kB 320 kB
devices 616 kB 168 kB 448 kB
e2e_cross_signing_signatures 472 kB 136 kB 336 kB
stream_positions 472 kB 40 kB 432 kB
access_tokens 424 kB 224 kB 200 kB
event_txn_id_device_id 416 kB 312 kB 104 kB
event_auth_chain_to_calculate 416 kB 288 kB 128 kB
device_lists_remote_resync 408 kB 208 kB 200 kB
event_push_summary 376 kB 152 kB 224 kB
e2e_device_keys_json 376 kB 56 kB 320 kB
push_rules_stream 304 kB 104 kB 200 kB
event_push_actions_staging 256 kB 232 kB 24 kB
e2e_fallback_keys_json 256 kB 64 kB 192 kB
profiles 248 kB 120 kB 128 kB
room_stats_state 248 kB 64 kB 184 kB
rooms 232 kB 120 kB 112 kB
room_stats_current 224 kB 64 kB 160 kB
local_media_repository_url_cache 224 kB 152 kB 72 kB
e2e_room_keys_versions 216 kB 16 kB 200 kB
users 216 kB 88 kB 128 kB
user_filters 200 kB 88 kB 112 kB
worker_read_write_locks 192 kB 136 kB 56 kB
worker_read_write_locks_mode 184 kB 128 kB 56 kB
pushers 184 kB 64 kB 120 kB
partial_state_rooms_servers 184 kB 136 kB 48 kB
room_depth 176 kB 64 kB 112 kB
open_id_tokens 176 kB 88 kB 88 kB
device_federation_outbox 176 kB 32 kB 144 kB
user_stats_current 168 kB 56 kB 112 kB
rejections 168 kB 56 kB 112 kB
push_rules_enable 152 kB 80 kB 72 kB
worker_locks 152 kB 104 kB 48 kB
push_rules 152 kB 72 kB 80 kB
ui_auth_sessions_ips 128 kB 80 kB 48 kB
ui_auth_sessions 120 kB 56 kB 64 kB
user_threepids 112 kB 48 kB 64 kB
ui_auth_sessions_credentials 104 kB 56 kB
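
(For reference, a per-table breakdown like the one above can be obtained with a size query along these lines; a sketch assuming the matrix_synapse database name used elsewhere in this thread:)

sudo -u postgres psql -d matrix_synapse -c '
  SELECT relname,
         pg_size_pretty(pg_total_relation_size(relid)) AS total,
         pg_size_pretty(pg_indexes_size(relid))        AS "index",
         pg_size_pretty(pg_table_size(relid))          AS "table"
  FROM pg_catalog.pg_statio_user_tables
  ORDER BY pg_total_relation_size(relid) DESC;'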
tio-trom commented 1 year ago

I think we need this tool to compress the Synapse table https://github.com/matrix-org/rust-synapse-compress-state - even the devs recognized this is an issue and made this tool. Any way to have it packaged for YNH?

tio-trom commented 1 year ago

Ok after days of optimization I managed this:

Relation Total Size
public.state_groups_state 76 GB
public.event_json 10 GB
public.events 4327 MB
public.event_edges 2910 MB
public.device_lists_changes_in_room 2482 MB
public.event_auth 2064 MB
public.event_search 1778 MB
public.room_memberships 1171 MB
public.event_to_state_groups 971 MB
public.current_state_delta_stream 822 MB
public.event_auth_chain_links 761 MB
public.state_events 730 MB
public.received_transactions 674 MB
public.event_auth_chains 632 MB
public.event_relations 424 MB
public.current_state_events 297 MB
public.device_lists_remote_cache 220 MB
public.device_inbox 204 MB
public.e2e_cross_signing_keys 196 MB
public.state_groups 189 MB

Still a lot! But well, about 3 times smaller. What did I do?

First, compress using https://github.com/matrix-org/rust-synapse-compress-state#building . You need to install that package and run it like this:

nohup synapse_auto_compressor -p postgresql://matrix_synapse:db-password-here@localhost/matrix_synapse -c 500 -n 10000 >/var/log/matrix-opt.log 2>&1 &

To run in the background.

Then I had to REINDEX:

PGPASSWORD="db-password-here" nohup psql -U matrix_synapse -d matrix_synapse -c "REINDEX (VERBOSE) DATABASE matrix_synapse;" >/var/log/matrix-reindex.log 2>&1 &

And then to VACUUM:

PGPASSWORD="db-password-here" nohup psql -U matrix_synapse -d matrix_synapse -c "VACUUM FULL VERBOSE ANALYZE;" >/var/log/matrix-vacuum.log 2>&1 &

The compression took around 3 days of nonstop work and put a huge toll on the CPU: a 10-core CPU at 100%. The other two steps take around 3-4 hours each. They also need a lot of free space on your disk: the database gets duplicated on the server while reindexing and vacuuming, and the copy is deleted afterwards.

I do not know if I can do more to reduce the size. But even at this size it is not easy to manage via the YNH panel... too big to back up, too big to restore.

lapineige commented 1 year ago

Wow, thanks a lot for documenting this !

Did you stop the synapse service meanwhile ?

tio-trom commented 1 year ago

You have to stop synapse while you do the REINDEX and VACUUM FULL, but not during the compression. VACUUM FULL is just there to give the disk space back to the server. Basically, the compression can cut your database from, say, 200 GB to 100 GB, but you won't see that on your disk, because PostgreSQL still shows 200 GB: it keeps the "empty" space reserved for the database and adds new data into that space. In other words, your database won't grow anymore, but on disk it still looks like 200 GB is taken by the synapse database until you run VACUUM FULL....
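
To illustrate: a minimal sketch (assuming the matrix_synapse database name used above) to check how much space PostgreSQL really occupies on disk, before and after the VACUUM FULL:

sudo -u postgres psql -c "SELECT pg_size_pretty(pg_database_size('matrix_synapse'));"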

lapineige commented 1 year ago

It's clear for me, thanks :)

MayeulC commented 1 year ago

Yeah, I used a similar tool a while ago, I also used to have a DB of more than 100G, with myself as the only user 😇

I am uncertain whether compression would help a lot with the backup/restore speed; it may be 30% faster but would still likely take hours. However, it would help quite a bit with disk space.

psql -U matrix_synapse

You can also do sudo -u postgres psql to avoid having to deal with authentication.

It's also possible to drop old room history from the database, for rooms not originating on one's server. The assumption is that synapse will fetch it again from distant servers if needed. But we stray from the topic :)
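
For the record, this goes through Synapse's purge history admin API; a rough sketch, with placeholder domain, token, room ID and timestamp (the room ID must be URL-encoded):

# Sketch: purge remote events older than the given timestamp (ms) from one room,
# keeping local events (delete_local_events: false)
curl -s -X POST \
    -H "Authorization: Bearer ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"delete_local_events": false, "purge_up_to_ts": 1600000000000}' \
    "https://matrix.example.org/_synapse/admin/v1/purge_history/%21someroomid%3Aexample.org"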

lapineige commented 1 year ago

it may be 30% faster but would still likely take hours. However, it would help quite a bit with disk space.

30% of a couple of hours is already a big deal !

Gredin67 commented 11 months ago

rather than compressing the database dump, why not compress the database itself ?

https://wiki.chatons.org/doku.php/services/messagerie_instantanee/matrix#nettoyage_d_un_serveur_synapse https://jacksonchen666.com/posts/2022-12-03/14-33-00/

The other advantage would be that the synapse_ynh package could kind of "guarantee" that the messages and media retention policy is actually applied, by cleaning, purging and freeing space in the DB.
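
(As a small illustration of the purging part: cached remote media can for instance be trimmed via the admin API; a sketch with placeholder domain, token and timestamp:)

# Sketch: drop cached remote media older than the given timestamp (ms)
curl -s -X POST \
    -H "Authorization: Bearer ACCESS_TOKEN" \
    "https://matrix.example.org/_synapse/admin/v1/purge_media_cache?before_ts=1600000000000"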

lapineige commented 11 months ago

Wouldn't that hurt performance ?

OniriCorpe commented 11 months ago

I think we need this tool to compress the Synapse table https://github.com/matrix-org/rust-synapse-compress-state - even the devs recognized this is an issue and made this tool. Any way to have it packaged for YNH?

maybe the easiest way would be to just ship the executable in the installation path at install time, and upgrade this tool at the same time as synapse when a new version is released?

Gredin67 commented 3 months ago

Everyone having issues with a huge database should check this: https://github.com/YunoHost-Apps/synapse_ynh/issues/478