Snapchat / KeyDB

A Multithreaded Fork of Redis
BSD 3-Clause "New" or "Revised" License

[BUG] Incorrect memory usage in multi-master mode #844

Open Danozavr opened 3 months ago

Danozavr commented 3 months ago

When KeyDB runs in multi-master mode, a failure occurs in which memory consumption and network traffic both increase significantly. With multi-master-no-forward yes enabled, traffic is lower during normal operation, but it still increases during the failure. I have noticed that the failure occurs more often after a replica has lost its connection for some time.

To reproduce

  1. Deploy replicas in multi-master mode (in my case, 6 replicas).
  2. Populate with approximately 4GB of data.
  3. Disconnect 1 replica and make data changes on the working replicas.
  4. Restore the disconnected replica.
  5. Wait.
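
While waiting in step 5, memory growth can be tracked by sampling the INFO reply on each replica. The following is a minimal Python sketch, not part of the original report: `parse_info` and `first_spike` are hypothetical helper names, the factor of 2.0 is an arbitrary assumption, and the only server field relied on is `used_memory` from the INFO memory section.

```python
def parse_info(text):
    """Parse the 'key:value' lines of an INFO reply into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def first_spike(samples, factor=2.0):
    """Return the index of the first sample exceeding factor * the first
    sample (taken as the baseline), or None if no sample does."""
    if not samples:
        return None
    baseline = samples[0]
    for i, value in enumerate(samples[1:], start=1):
        if value > baseline * factor:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical INFO excerpts collected at fixed intervals from one replica.
    replies = [
        "# Memory\r\nused_memory:1000\r\n",
        "# Memory\r\nused_memory:1100\r\n",
        "# Memory\r\nused_memory:2500\r\n",
    ]
    samples = [int(parse_info(r)["used_memory"]) for r in replies]
    print(first_spike(samples))
```

The INFO text could be collected with `keydb-cli INFO memory` on each replica; how often to sample and what growth factor counts as a failure are judgment calls that depend on the data set.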

Expected behavior

Expected: the data synchronizes once the disconnected replica is restored. Actual: what appears to be a race condition starts, with a significant increase in memory consumption.

Additional information

keydb.conf:

    protected-mode no
    port 6379
    tcp-backlog 511
    timeout 0
    tcp-keepalive 300
    supervised no
    pidfile /var/run/
    loglevel notice
    databases 16
    always-show-logo yes
    set-proc-title yes
    proc-title-template "{title} {listen-addr} {server-mode}"
    save 900 1
    save 300 10
    save 60 10000
    stop-writes-on-bgsave-error yes
    rdbcompression yes
    rdbchecksum yes
    dbfilename dump.rdb
    rdb-del-sync-files no
    dir /data
    replica-serve-stale-data yes
    replica-read-only yes
    repl-diskless-sync no
    repl-diskless-sync-delay 5
    repl-diskless-load disabled
    repl-disable-tcp-nodelay no
    replica-priority 100
    acllog-max-len 128
    lazyfree-lazy-eviction no
    lazyfree-lazy-expire no
    lazyfree-lazy-server-del no
    replica-lazy-flush no
    lazyfree-lazy-user-del no
    lazyfree-lazy-user-flush no
    oom-score-adj no
    oom-score-adj-values 0 200 800
    disable-thp yes
    appendonly no
    appendfilename "appendonly.aof"
    appendfsync everysec
    no-appendfsync-on-rewrite no
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb
    aof-load-truncated yes
    aof-use-rdb-preamble yes
    lua-time-limit 5000
    slowlog-log-slower-than 10000
    slowlog-max-len 128
    latency-monitor-threshold 0
    notify-keyspace-events ""
    hash-max-ziplist-entries 512
    hash-max-ziplist-value 64
    list-max-ziplist-size -2
    list-compress-depth 0
    set-max-intset-entries 512
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
    hll-sparse-max-bytes 3000
    stream-node-max-bytes 4096
    stream-node-max-entries 100
    activerehashing yes
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit replica 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60
    hz 10
    dynamic-hz yes
    aof-rewrite-incremental-fsync yes
    rdb-save-incremental-fsync yes
    jemalloc-bg-thread yes
    server-threads 2
    replica-weighting-factor 2
    active-client-balancing yes

I also override some parameters using launch arguments:

    --active-replica yes
    --maxmemory 4gb
    --maxmemory-policy allkeys-lru
    --multi-master-no-forward yes
    --client-output-buffer-limit normal 0 0 0
    --client-output-buffer-limit replica 0 0 0

[two attached images]
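
Because the launch arguments override the file, the effective replica output buffer limit in this setup is 0 0 0 rather than 256mb 64mb 60. In Redis, a limit of zero disables that check, so replica output buffers would be unbounded here, which may be relevant to the memory growth. The Python sketch below works out the effective limits. It assumes KeyDB applies command-line options after the config file, as Redis does, and the helper names (`parse_directives`, `parse_cli`, `output_buffer_limits`) are hypothetical.

```python
import shlex


def parse_directives(lines):
    """Return (name, [args]) tuples from keydb.conf-style lines."""
    out = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, *args = shlex.split(line)
        out.append((name.lower(), args))
    return out


def parse_cli(argv):
    """Group '--name v1 v2 ...' launch arguments into (name, [args]) tuples."""
    out = []
    for tok in argv:
        if tok.startswith("--"):
            out.append((tok[2:].lower(), []))
        elif out:
            out[-1][1].append(tok)
    return out


def output_buffer_limits(conf_lines, argv):
    """Effective client-output-buffer-limit per client class.

    Assumption: command-line options are processed after the file,
    so a later entry for the same class replaces an earlier one.
    """
    limits = {}
    for name, args in parse_directives(conf_lines) + parse_cli(argv):
        if name == "client-output-buffer-limit" and len(args) == 4:
            client_class, hard, soft, seconds = args
            limits[client_class] = (hard, soft, seconds)
    return limits


if __name__ == "__main__":
    conf = [
        "client-output-buffer-limit normal 0 0 0",
        "client-output-buffer-limit replica 256mb 64mb 60",
        "client-output-buffer-limit pubsub 32mb 8mb 60",
    ]
    argv = ("--maxmemory 4gb "
            "--client-output-buffer-limit normal 0 0 0 "
            "--client-output-buffer-limit replica 0 0 0").split()
    for client_class, limit in output_buffer_limits(conf, argv).items():
        print(client_class, *limit)
```

On a running instance, the values actually in effect can be checked with `CONFIG GET client-output-buffer-limit` instead of reconstructing them by hand.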