Open Valodim opened 5 years ago
We've successfully run this on some huge rooms. Is it actually a problem in practice?
Phew. I have to admit, I don't really remember the specifics back from February :|
Assuming a homeserver that isn't massively overprovisioned, allocating a couple hundred megs or even several gigs of RAM to a single process for several hours is indeed not trivial. I remember having to run this on another machine for a couple of my largest rooms for that reason.
Ah yes, running it on a separate machine is probably a good idea. You can always tunnel the postgres connection over SSH.
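To illustrate the tunnel idea: something like the following on the compression machine forwards its local port 5432 to postgres on the homeserver, so the tool can run remotely while talking to the database as if it were local. Hostnames, users, and the database name are placeholders for your own setup.

```shell
# Forward local port 5432 to postgres on the homeserver (runs until interrupted).
# -N: no remote command, just the tunnel.
ssh -N -L 5432:localhost:5432 user@homeserver.example.com

# In another terminal, point the compressor at the tunneled port, e.g.:
#   postgresql://synapse_user:password@localhost:5432/synapse
```

If postgres on the server only listens on a unix socket, you may need to enable TCP on localhost first, or forward the socket instead.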
I'm trying to compress a room with 430M state records, and it runs out of memory on a machine with 32GB of RAM (both the server and the machine the tool runs on have 32GB), so I would argue this is still an issue.
The other side of this is that I don't really care how long it takes to run (well, almost) as long as it does.
I have room !GibBpYxFGNraRsZOyl:matrix.org (Techlore - Main) with 60+ million state groups (exact count is 64182319), and I'm afraid to run a full compression because it will eat all my RAM :(
I have a room with 2 billion (2,248,040,941) rows. My estimate is that compressing it would require around 200 GB of RAM.
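For what it's worth, that estimate is consistent with a back-of-envelope figure of roughly 100 bytes of in-memory overhead per state row. The per-row constant here is an assumption extrapolated from the numbers in this thread, not a measured value:

```shell
# Back-of-envelope RAM estimate: rows * assumed bytes-per-row, in GiB.
rows=2248040941          # row count from the comment above
bytes_per_row=100        # assumption, not measured
echo $(( rows * bytes_per_row / 1024 / 1024 / 1024 ))  # prints 209 (GiB)
```

So anyone wondering whether their room will fit can plug in their own row count before kicking off a multi-hour run.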
Right now, compressing a room requires holding its entire state in RAM, which can become arbitrarily large.
I don't know the internals of the compression algorithm, but given that state groups seem to be replaced one by one in an idempotent fashion, would it be possible to compress a room only partially? Perhaps only the first N percent of rows, repeated a few times? Or maybe multiple iterations of load, compress, free?
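If I remember right, the repo's newer `synapse_auto_compressor` binary works in roughly this chunked fashion already: it processes a fixed number of state groups per chunk and saves its progress between runs. A sketch of an invocation (flag names from memory; please check `--help` before trusting them, and the chunk/run counts are arbitrary examples):

```shell
# Compress in chunks rather than loading a whole room's state at once:
#   -p: postgres connection URL
#   -c: state groups per chunk (bounds peak memory)
#   -n: number of chunks to process in this run
synapse_auto_compressor \
  -p "postgresql://synapse_user:password@localhost/synapse" \
  -c 500 \
  -n 100
```

Running it repeatedly (e.g. from cron) would then approximate the "load, compress, free" loop suggested above.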
That would be tremendously helpful :+1: