matrix-org / rust-synapse-compress-state

A tool to compress some state in a Synapse instance's database
https://pypi.org/project/synapse-auto-compressor/
Apache License 2.0

Gracefully handle rooms where the compressor generates more rows than started with #7

Open justyns opened 5 years ago

justyns commented 5 years ago

I ran this against each room in my database to see what sort of changes it would want to make, and I'm wondering if it's normal for it to generate more rows than we started with?

For example:

Fetching state from DB for room '!redacted:matrix.org'...
Got initial state from database. Checking for any missing state groups...
Number of state groups: 108
Number of rows in current table: 187
Compressing state...
Number of rows after compression: 257 (137.43%)
Compression Statistics:
  Number of forced resets due to lacking prev: 2
  Number of compressed rows caused by the above: 23
  Number of state groups changed: 12
Writing changes...
Checking that state maps match...
New state map matches old one

Full list; a couple of other rooms also end up above 100%:

Number of rows after compression: 9 (100.00%)
Number of rows after compression: 8 (100.00%)
Number of rows after compression: 19 (54.29%)
Number of rows after compression: 26 (96.30%)
Number of rows after compression: 26 (86.67%)
Number of rows after compression: 27 (87.10%)
Number of rows after compression: 257 (137.43%)
Number of rows after compression: 7 (100.00%)
Number of rows after compression: 1292 (58.44%)
Number of rows after compression: 9 (100.00%)
Number of rows after compression: 37 (80.43%)
Number of rows after compression: 137 (125.69%)
Number of rows after compression: 851433 (20.22%)
Number of rows after compression: 27 (108.00%)
Number of rows after compression: 61136 (52.82%)
Number of rows after compression: 1708 (67.75%)
Number of rows after compression: 14 (100.00%)
Number of rows after compression: 6872 (26.59%)
Number of rows after compression: 76916 (99.81%)
Number of rows after compression: 18 (100.00%)
Number of rows after compression: 104922 (81.30%)
Number of rows after compression: 10786 (80.27%)

TeknikalDomain commented 3 years ago

Just to add, since I discovered this: running with the -m option when compression produces more rows than it started with, as above, causes a panic:

Fetching state from DB for room '!UBhRLVEYYvUWnsljRs:matrix.org'...
  [1s] 3951 rows retrieved
Got initial state from database. Checking for any missing state groups...
No missing state groups
Number of state groups: 610
Number of rows in current table: 3945
Compressing state...
[00:00:00] ████████████████████ 610/610 state groups
Number of rows after compression: 7598 (192.60%)
Compression Statistics:
  Number of forced resets due to lacking prev: 18
  Number of compressed rows caused by the above: 3701
  Number of state groups changed: 69
thread 'main' panicked at 'attempt to subtract with overflow', src/main.rs:232:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
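
For context, "attempt to subtract with overflow" is the message Rust emits when an integer subtraction overflows with overflow checks enabled, for instance when an unsigned value is subtracted from a smaller one. The thread does not show what src/main.rs:232 actually computes, so the following is only a sketch under the assumption that the tool subtracts the compressed row count from the original one. The names `rows_saved` and `describe` are hypothetical and not taken from the project.

```rust
/// Hypothetical helper: number of rows saved by compression, or `None`
/// when compression would produce more rows than the original table.
/// `checked_sub` returns `None` instead of panicking on underflow.
fn rows_saved(original_rows: usize, compressed_rows: usize) -> Option<usize> {
    original_rows.checked_sub(compressed_rows)
}

/// Hypothetical helper: a human-readable summary that never underflows.
fn describe(original_rows: usize, compressed_rows: usize) -> String {
    match rows_saved(original_rows, compressed_rows) {
        Some(saved) => format!("compression saves {} rows", saved),
        // In this branch compressed_rows > original_rows, so this
        // subtraction is safe.
        None => format!(
            "compression would add {} rows",
            compressed_rows - original_rows
        ),
    }
}
```

With the figures from the log above, `describe(3945, 7598)` reports that compression would add 3653 rows rather than panicking.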

erikjohnston commented 3 years ago

Yes, it is expected that the compressor may produce slightly more rows, especially if it's run on a room that has previously been compressed. We should probably handle that case better, e.g. by having the generated SQL be a no-op.
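
One way the no-op idea could look, as a rough sketch rather than the tool's actual behaviour: compare the row counts before generating any SQL and skip the room when nothing would be gained. `Outcome`, `decide` and `percent_of_original` are hypothetical names introduced for illustration only.

```rust
/// Hypothetical result of a compression run for one room.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Compression shrank the table; the changes are worth writing.
    Apply { rows_saved: usize },
    /// Compression did not reduce the row count; emit nothing.
    NoOp,
}

/// Decide whether the generated changes should be written at all.
fn decide(original_rows: usize, compressed_rows: usize) -> Outcome {
    if compressed_rows < original_rows {
        Outcome::Apply {
            rows_saved: original_rows - compressed_rows,
        }
    } else {
        Outcome::NoOp
    }
}

/// Compressed size as a percentage of the original, formatted like the
/// percentages in the logs in this thread.
fn percent_of_original(original_rows: usize, compressed_rows: usize) -> String {
    if original_rows == 0 {
        return "n/a".to_string();
    }
    format!(
        "{:.2}%",
        compressed_rows as f64 / original_rows as f64 * 100.0
    )
}
```

For the first room in this issue, `decide(187, 257)` yields `Outcome::NoOp` and `percent_of_original(187, 257)` yields `137.43%`, so a wrapper could log the ratio and move on without touching the database.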

@TeknikalDomain can you open a new issue for that please?

TeknikalDomain commented 3 years ago

@erikjohnston Done.