yinqiwen / ardb

A Redis-protocol-compatible NoSQL store that supports multiple storage engines as backends, such as Google's LevelDB, Facebook's RocksDB, OpenLDAP's LMDB, PerconaFT, WiredTiger, and ForestDB.
BSD 3-Clause "New" or "Revised" License

RocksDB uses too much memory #490

Closed Gourds closed 3 years ago

Gourds commented 3 years ago

My host has 16 cores and 64 GB of RAM. DB size:

du -sh /opt/db/ardb/data/rocksdb/
16G     /opt/db/ardb/data/rocksdb/

Current memory usage:

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 19082 root      20   0  172g  43g 7052 S 105.4 69.5 189:26.57 ardb-server

Question: the ardb process is now often killed by the system's OOM killer. How can I reduce its memory usage?

My config:

home  /opt/db/ardb/
daemonize no
pidfile ${ARDB_HOME}/ardb.pid
thread-pool-size         12
server[0].listen              0.0.0.0:16379
qps-limit-per-host                  0
qps-limit-per-connection            0
rocksdb.compaction           OptimizeLevelStyleCompaction
rocksdb.disableWAL            false
rocksdb.options               write_buffer_size=512M;max_write_buffer_number=5;min_write_buffer_number_to_merge=2;compression=kSnappyCompression;\
                              bloom_locality=1;memtable_prefix_bloom_size_ratio=0.1;\
                              block_based_table_factory={block_cache=512M;filter_policy=bloomfilter:10:true};\
                              create_if_missing=true;max_open_files=60000;rate_limiter_bytes_per_sec=200M;\
                              use_direct_io_for_flush_and_compaction=true;use_adaptive_mutex=true;max_total_wal_size=20480M
leveldb.options               block_cache_size=512M,write_buffer_size=128M,max_open_files=5000,block_size=4k,block_restart_interval=16,\
                              bloom_bits=10,compression=snappy,logenable=yes
lmdb.options                  database_maxsize=10G,database_maxdbs=4096,readahead=no,batch_commit_watermark=1024
perconaft.options             cache_size=128M,compression=snappy
wiredtiger.options            cache_size=512M,session_max=8k,chunk_size=100M,block_size=4k,bloom_bits=10,\
                              mmap=false,compressor=snappy
forestdb.options              chunksize=8,blocksize=4K
timeout 0
tcp-keepalive 0
loglevel info
logfile ${ARDB_HOME}/log/ardb-server.log
data-dir ${ARDB_HOME}/data
slave-workers   2
max-slave-worker-queue  1024
repl-dir                          ${ARDB_HOME}/repl
slave-serve-stale-data yes
slave-priority 100
slave-read-only yes
backup-dir                        ${ARDB_HOME}/backup
backup-file-format                ardb
repl-disable-tcp-nodelay no
repl-backlog-size           2G
repl-backlog-cache-size     3000M
snapshot-max-lag-offset     3000M
maxsnapshots                10
slave-serve-stale-data yes
slave-cleardb-before-fullresync    yes
repl-backlog-sync-period         5
slave-ignore-expire   no
slave-ignore-del      no
requirepass 123
cluster-name   ardb-cluster
slave-client-output-buffer-limit 256mb
pubsub-client-output-buffer-limit 32mb
slowlog-log-slower-than 10000
slowlog-max-len 128
lua-time-limit 5000
hll-sparse-max-bytes 3000
scan-redis-compatible         yes
scan-cursor-expire-after      60
scan-total-order              no
redis-compatible-mode     yes
redis-compatible-version  3.2.0
statistics-log-period     600
compact-after-snapshot-load  false
range-delete-min-size  100
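
As a rough check on the config above, the memtable settings alone allow a fairly large ceiling. The sketch below multiplies `write_buffer_size` by `max_write_buffer_number` and by the number of column families. The count of four column families is an assumption (one per ardb database, as the four "Compaction Stats" sections in the `info` output suggest), and the function name `memtable_ceiling` is made up for illustration. This is an upper bound, not a measurement.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def memtable_ceiling(write_buffer_size, max_write_buffer_number, column_families):
    """Worst-case bytes held in memtables if every column family fills all its buffers."""
    return write_buffer_size * max_write_buffer_number * column_families


# Values copied from rocksdb.options in the config above.
write_buffer_size = 512 * MIB       # write_buffer_size=512M
max_write_buffer_number = 5         # max_write_buffer_number=5
column_families = 4                 # ASSUMPTION: one column family per ardb database

ceiling = memtable_ceiling(write_buffer_size, max_write_buffer_number, column_families)
print(f"memtable ceiling: {ceiling / GIB:.1f} GiB")
```

Lowering `write_buffer_size` or `max_write_buffer_number` would shrink this ceiling, at the cost of more frequent flushes.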

info output:

# Server
ardb_version:0.9.6
redis_version:3.2.0
engine:rocksdb
ardb_home:/opt/db/ardb
os:Linux 4.9.38-16.35.amzn1.x86_64 x86_64
gcc_version:4.8.5
process_id:19082
run_id:278a93e439cdd027bc8321d6eaaa8d4bd78854bf
tcp_port:16379
listen:0.0.0.0:16379
uptime_in_seconds:42904
uptime_in_days:0
executable:/opt/db/ardb/ardb-0.9.4/src/ardb-server
config_file:/etc/ardb.conf

# Databases
data_dir:/opt/db/ardb/data
used_disk_space:15540536383
rocksdb_version:5.8.8
rocksdb.block_table_usage:505663808
rocksdb.block_table_pinned_usage:1032512
rocksdb_memtable_total:1269115184
rocksdb_memtable_unflushed:1113086112
rocksdb_table_readers_total:286148539
rocksdb.estimate-table-readers-mem:0
rocksdb.cur-size-all-mem-tables:14680864

** Compaction Stats [1] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      1/0   424.53 MB   0.8      0.0     0.0      0.0      31.4     31.4       0.0   1.0      0.0     83.9       384        68    5.642       0      0
  L1      8/0   486.64 MB   1.0     47.8    31.0     16.7      33.3     16.5       0.0   1.1     94.9     66.1       515        37   13.924   5171K   518K
  L2     55/0   380.71 MB   0.9     19.9    16.5      3.3       3.4      0.0       0.0   0.2     66.4     11.3       307        47    6.526     11M   165K
  L3     37/0    2.28 GB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0         0    0.000       0      0
 Sum    101/0    3.54 GB   0.0     67.6    47.6     20.1      68.1     48.0       0.0   2.2     57.4     57.8      1206       152    7.932     17M   683K
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0         0    0.000       0      0
Uptime(secs): 42960.6 total, 495.2 interval
Flush(GB): cumulative 31.431, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 68.06 GB write, 1.62 MB/s write, 67.63 GB read, 1.61 MB/s read, 1205.6 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count

** File Read Latency Histogram By Level [1] **

** DB Stats **
Uptime(secs): 42960.6 total, 495.2 interval
Cumulative writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 GB, 0.00 MB/s
Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 GB, 0.00 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 MB, 0.00 MB/s
Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

** Compaction Stats [2] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      1/0   406.39 MB   0.8      0.0     0.0      0.0      29.7     29.7       0.0   1.0      0.0     94.2       323        68    4.749       0      0
  L1      8/0   484.59 MB   0.9     45.5    29.7     15.8      30.2     14.4       0.0   1.0     76.3     50.7       611        34   17.977   5959K   497K
  L2     18/0   331.92 MB   0.8     17.0    14.4      2.6       2.6      0.0       0.0   0.2     60.6      9.3       287        41    7.004     19M   102K
  L3     37/0    2.30 GB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0         0    0.000       0      0
 Sum     64/0    3.49 GB   0.0     62.5    44.1     18.4      62.5     44.2       0.0   2.1     52.4     52.4      1221       143    8.540     25M   599K
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0         0    0.000       0      0
Uptime(secs): 42960.6 total, 495.2 interval
Flush(GB): cumulative 29.702, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 62.55 GB write, 1.49 MB/s write, 62.51 GB read, 1.49 MB/s read, 1221.3 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count

** File Read Latency Histogram By Level [2] **

** DB Stats **
Uptime(secs): 42960.6 total, 495.2 interval
Cumulative writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 GB, 0.00 MB/s
Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 GB, 0.00 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 MB, 0.00 MB/s
Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

** Compaction Stats [3] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       3.7      3.7       0.0   1.0      0.0    170.7        22        40    0.560       0      0
  L1      7/0   449.09 MB   0.9     13.2     3.8      9.4       9.8      0.3       0.0   2.6     90.2     66.6       150        20    7.505   3770K   365K
  L2     33/0    1.40 GB   0.3      1.7     0.4      1.3       1.3      0.0       0.0   3.6     21.3     16.7        82         5   16.306   3956K    42K
 Sum     40/0    1.84 GB   0.0     14.9     4.2     10.8      14.8      4.1       0.0   4.0     60.2     59.8       254        65    3.908   7726K   407K
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0         0    0.000       0      0
Uptime(secs): 42960.6 total, 495.2 interval
Flush(GB): cumulative 3.732, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 14.83 GB write, 0.35 MB/s write, 14.92 GB read, 0.36 MB/s read, 254.0 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count

** File Read Latency Histogram By Level [3] **

** DB Stats **
Uptime(secs): 42960.6 total, 495.2 interval
Cumulative writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 GB, 0.00 MB/s
Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 GB, 0.00 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 MB, 0.00 MB/s
Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

** Compaction Stats [4] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0      0.4         0         7    0.007       0      0
  L1      1/0    9.31 MB   0.0      0.0     0.0      0.0       0.0      0.0       0.0 1710.3     90.8     90.8         0         4    0.103    100K    341
 Sum      1/0    9.31 MB   0.0      0.0     0.0      0.0       0.0      0.0       0.0 1935.8     81.4     81.4         0        11    0.042    100K    341
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0         0    0.000       0      0
Uptime(secs): 42960.6 total, 495.2 interval
Flush(GB): cumulative 0.000, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.04 GB write, 0.00 MB/s write, 0.04 GB read, 0.00 MB/s read, 0.5 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count

** File Read Latency Histogram By Level [4] **

** DB Stats **
Uptime(secs): 42960.6 total, 495.2 interval
Cumulative writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 GB, 0.00 MB/s
Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 GB, 0.00 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 MB, 0.00 MB/s
Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

# Clients
connected_clients:11
blocked_clients:0

# Persistence
loading:0
rdb_last_save_time:0
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_bgsave_in_progress:1
rdb_current_bgsave_time_sec:2987

# CPU
used_cpu_sys:1964.00
used_cpu_user:9258.05

# Replication
role:master
repl_dir: /opt/db/ardb/repl
repl_current_namespace:2
connected_slaves: 0
master_repl_offset: 74037807263962
repl_backlog_size: 2147483648
repl_backlog_cache_size: 3145728000
repl_backlog_first_byte_offset: 74035659780314
repl_backlog_histlen: 2147483649
repl_backlog_cksm: 6bb76b1432299806

# Memory
used_memory_rss:47218675712

# Stats
slave_sync_total_commands_processed:0
slave_sync_instantaneous_ops_per_sec:0
total_commands_processed:2212933
instantaneous_ops_per_sec:2
total_connections_received:613
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
pubsub_channels:0
pubsub_patterns:0
expire_scan_keys:0

# Keyspace
db1:keys=2196
db2:keys=432
db3:keys=5755796
db4:keys=25231
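
One way to read the numbers above is to compare the memory that RocksDB itself reports with the process RSS. The sketch below sums three counters from the `info` output and divides by `used_memory_rss`. It assumes these three counters do not overlap and that they cover RocksDB's main tracked allocations; neither assumption is confirmed by the output itself. The helper names are hypothetical.

```python
# Lines copied verbatim from the info output above.
INFO_SNIPPET = """\
rocksdb.block_table_usage:505663808
rocksdb_memtable_total:1269115184
rocksdb_table_readers_total:286148539
used_memory_rss:47218675712
"""

# ASSUMPTION: these counters are disjoint and represent RocksDB's tracked memory.
TRACKED_KEYS = (
    "rocksdb.block_table_usage",
    "rocksdb_memtable_total",
    "rocksdb_table_readers_total",
)


def parse_info(text):
    """Parse 'key:value' lines, keeping only fields whose value is a plain integer."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            fields[key.strip()] = int(value)
    return fields


def tracked_vs_rss(fields):
    """Return (tracked bytes, RSS bytes, tracked share of RSS)."""
    tracked = sum(fields[key] for key in TRACKED_KEYS)
    rss = fields["used_memory_rss"]
    return tracked, rss, tracked / rss


tracked, rss, share = tracked_vs_rss(parse_info(INFO_SNIPPET))
print(f"tracked: {tracked / 1024**3:.2f} GiB, rss: {rss / 1024**3:.2f} GiB, share: {share:.1%}")
```

If the share is small, most of the RSS lies outside what these counters track, so the allocator and untracked buffers are worth investigating alongside the RocksDB options.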

git-hulk commented 3 years ago

It may be caused by the RocksDB block cache or by memory fragmentation. I'm not sure whether ardb lets you set the block cache size (or cache_index_and_filter_blocks). https://github.com/bitleak/kvrocks suffered from the same issue before, but it was solved by using jemalloc and enabling cache_index_and_filter_blocks.

Gourds commented 3 years ago

@git-hulk Thanks for your reply. The problem disappeared after the upgrade.