matrix-org / rust-synapse-compress-state

A tool to compress some state in a Synapse instance's database
https://pypi.org/project/synapse-auto-compressor/
Apache License 2.0
142 stars 32 forks

synapse_compress_state killed with exit code 137 #132

Open tateisu opened 7 months ago

tateisu commented 7 months ago

Actual behavior

synapse_compress_state killed with exit code 137.

Expected behavior

Not killed; the SQL file is written.

Console output

synapse_compress_state -p "dbname=XXX host=XXX password=XXX port=XXX user=XXX" -r '!OGEhHVWSdvArJzumhm:matrix.org' -m 1000 -o 'tmp27730.sql' -t
Fetching state from DB for room '!OGEhHVWSdvArJzumhm:matrix.org'...
  [3h] 1060286039 rows retrieved
Got initial state from database. Checking for any missing state groups...
Fetched state groups up to 3494995
Number of state groups: 67603
Number of rows in current table: 1060265013
Compressing state...
[03:02:12] ██████████████████░░ 61882/67603 state groups
Killed
exitCode 137 (this line is output by the wrapper script)

Version of synapse_compress_state

https://github.com/matrix-org/rust-synapse-compress-state/commit/bf92c82b7fbd2db39e5af2a4b0739b375358f2ff

commit bf92c82b7fbd2db39e5af2a4b0739b375358f2ff (HEAD -> main, origin/main, origin/HEAD)
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Nov 29 10:23:23 2023 +0000

Room info

!OGEhHVWSdvArJzumhm:matrix.org
Matrix HQ
#matrix:matrix.org
creator=@abuse:matrix.org
state_events=96322
members=2(local) / 45898(total)

Environment

Ubuntu 22.04.3 LTS 

# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         43 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
(snip...)

# free -h
               total        used        free      shared  buff/cache   available
Mem:            62Gi       2.1Gi        45Gi       7.8Gi        14Gi        52Gi
Swap:           71Gi       1.2Gi        70Gi

# df -TH
Filesystem     Type     Size  Used Avail Use% Mounted on
/dev/nvme0n1p2 ext4     4.1T  500G  3.4T  14% /
(snip...)

tateisu commented 7 months ago

Exit code 137 means the process was killed by SIGKILL (128 + 9).
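
For anyone checking the arithmetic, here is a minimal sketch of how a shell-style exit status above 128 maps back to a signal. The helper name `describe_exit_code` is hypothetical, and the sketch assumes the common shell convention of reporting 128 + signal number; a program can also return such a value itself, so treat the result as a hint rather than proof.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell exit status into a readable description.

    Assumes the usual shell convention: status = 128 + signal number
    when the process was terminated by a signal.
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"unknown signal {signum}"
        return f"killed by {name}"
    return f"exited normally with status {code}"


print(describe_exit_code(137))  # killed by SIGKILL
```

SIGKILL cannot be caught by the process, which matches the bare "Killed" line in the console output above.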

tateisu commented 7 months ago

[ +0.000002] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=nginx.service,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-24.scope,task=synapse_compres,pid=90350,uid=0
[ +0.000012] Out of memory: Killed process 90350 (synapse_compres) total-vm:157345312kB, anon-rss:54660464kB, file-rss:380kB, shmem-rss:0kB, UID:0 pgtables:279580kB oom_score_adj:0
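
To put those kernel figures next to the `free -h` output, here is a rough sketch that pulls the `kB` fields out of the "Out of memory" line and converts them to GiB. The function names are hypothetical, and it assumes the kernel's `kB` means 1024 bytes.

```python
import re

# The "Out of memory" line from the kernel log quoted in this issue.
OOM_LINE = (
    "Out of memory: Killed process 90350 (synapse_compres) "
    "total-vm:157345312kB, anon-rss:54660464kB, file-rss:380kB, "
    "shmem-rss:0kB, UID:0 pgtables:279580kB oom_score_adj:0"
)


def parse_kb_fields(line: str) -> dict[str, int]:
    """Return every 'name:<number>kB' field of the line as {name: kB}."""
    return {name: int(value) for name, value in re.findall(r"([\w-]+):(\d+)kB", line)}


def kb_to_gib(kb: int) -> float:
    """Convert kB (assumed to be 1024-byte units) to GiB."""
    return kb / (1024 * 1024)


fields = parse_kb_fields(OOM_LINE)
for name in ("total-vm", "anon-rss"):
    print(f"{name}: {kb_to_gib(fields[name]):.1f} GiB")
# total-vm: 150.1 GiB
# anon-rss: 52.1 GiB
```

About 52 GiB of anonymous resident memory on a machine with 62Gi of RAM is consistent with the global OOM kill reported in the log.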

erikjohnston commented 6 months ago

Looks like it ran out of memory and got killed. Currently the way to get around that is to use a combination of the -b and -s options to run against subsets of the state.
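
As an illustration of that suggestion, here is a minimal sketch that splits a range of state group IDs into chunks and prints one command line per chunk. It only uses flags that appear in this thread (-p, -r, -o, -t, -b, -s). The starting ID of 0, the chunk size of 1,000,000, the output file names, and the helper names are all assumptions for illustration; 3494995 is the "Fetched state groups up to" value from the console output. The sketch treats -b as the lower bound and -s as the upper bound of each run, so check the tool's help output for the exact inclusive or exclusive semantics before relying on it.

```python
import shlex


def chunk_bounds(first: int, last: int, step: int):
    """Yield (lower, upper) pairs that tile the interval from first to last."""
    lower = first
    while lower < last:
        upper = min(lower + step, last)
        yield lower, upper
        lower = upper


def build_command(db: str, room_id: str, lower: int, upper: int) -> str:
    """Build one shell-quoted command line for a single chunk."""
    args = [
        "synapse_compress_state",
        "-p", db,
        "-r", room_id,
        "-b", str(lower),   # assumed: lower bound of the state groups to process
        "-s", str(upper),   # assumed: upper bound of the state groups to process
        "-o", f"chunk_{lower}_{upper}.sql",  # hypothetical output file name
        "-t",
    ]
    return shlex.join(args)


DB = "dbname=XXX host=XXX password=XXX port=XXX user=XXX"
ROOM = "!OGEhHVWSdvArJzumhm:matrix.org"

for lower, upper in chunk_bounds(0, 3494995, 1_000_000):
    print(build_command(DB, ROOM, lower, upper))
```

A smaller chunk size should lower peak memory per run at the cost of more runs; which size fits in memory for this room would need to be found by experiment.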