Snapchat / KeyDB

A Multithreaded Fork of Redis
https://keydb.dev
BSD 3-Clause "New" or "Revised" License
11.29k stars 570 forks source link

[BUG] keydb unixsocket is inaccesible when executed background save #410

Open rnz opened 2 years ago

rnz commented 2 years ago

Describe the bug

# redis-cli -s /proc/$(pgrep keydb-server)/root/tmp/keydb.sock info | grep ops
Could not connect to Redis at /proc/20715: Operation not permitted

To reproduce

enable unixsocket start keydb-server after start add workload - 90k ops run in one terminal session: $ watch -n0 "keydb-cli -s /proc/$(pgrep keydb-server)/root/tmp/keydb.sock info | grep ops" look at log in other terminal session: $ tail -f /var/log/keydb/keydb-server.log

Expected behavior

unixsocket is accesible when background save

Additional information

# apt-cache policy keydb-server
keydb-server:
  Installed: 6:6.2.2-1+deb10u1
  Candidate: 6:6.2.2-1+deb10u1
  Version table:
 *** 6:6.2.2-1+deb10u1 500
        500 https://download.keydb.dev/open-source-dist buster/main amd64 Packages
        100 /var/lib/dpkg/status

# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 10 (buster)
Release:    10
Codename:   buster

# df -h /tmp/
Filesystem               Size  Used Avail Use% Mounted on
pool1/subvol-336-disk-0   50G  3.8G   47G   8% /

# systemd-detect-virt
lxc

host info about zfs dataset:

# zfs get all pool1/subvol-336-disk-0 
NAME                     PROPERTY              VALUE                     SOURCE
pool1/subvol-336-disk-0  type                  filesystem                -
pool1/subvol-336-disk-0  creation              Tue Oct 13 16:30 2020     -
pool1/subvol-336-disk-0  used                  3.76G                     -
pool1/subvol-336-disk-0  available             46.2G                     -
pool1/subvol-336-disk-0  referenced            3.76G                     -
pool1/subvol-336-disk-0  compressratio         1.71x                     -
pool1/subvol-336-disk-0  mounted               yes                       -
pool1/subvol-336-disk-0  quota                 none                      default
pool1/subvol-336-disk-0  reservation           none                      default
pool1/subvol-336-disk-0  recordsize            128K                      default
pool1/subvol-336-disk-0  mountpoint            /pool1/subvol-336-disk-0  default
pool1/subvol-336-disk-0  sharenfs              off                       default
pool1/subvol-336-disk-0  checksum              on                        default
pool1/subvol-336-disk-0  compression           on                        inherited from pool1
pool1/subvol-336-disk-0  atime                 on                        default
pool1/subvol-336-disk-0  devices               on                        default
pool1/subvol-336-disk-0  exec                  on                        default
pool1/subvol-336-disk-0  setuid                on                        default
pool1/subvol-336-disk-0  readonly              off                       default
pool1/subvol-336-disk-0  zoned                 off                       default
pool1/subvol-336-disk-0  snapdir               hidden                    default
pool1/subvol-336-disk-0  aclinherit            restricted                default
pool1/subvol-336-disk-0  createtxg             3994796                   -
pool1/subvol-336-disk-0  canmount              on                        default
pool1/subvol-336-disk-0  xattr                 sa                        received
pool1/subvol-336-disk-0  copies                1                         default
pool1/subvol-336-disk-0  version               5                         -
pool1/subvol-336-disk-0  utf8only              off                       -
pool1/subvol-336-disk-0  normalization         none                      -
pool1/subvol-336-disk-0  casesensitivity       sensitive                 -
pool1/subvol-336-disk-0  vscan                 off                       default
pool1/subvol-336-disk-0  nbmand                off                       default
pool1/subvol-336-disk-0  sharesmb              off                       default
pool1/subvol-336-disk-0  refquota              50G                       local
pool1/subvol-336-disk-0  refreservation        none                      default
pool1/subvol-336-disk-0  guid                  1725799109246666176       -
pool1/subvol-336-disk-0  primarycache          all                       default
pool1/subvol-336-disk-0  secondarycache        all                       default
pool1/subvol-336-disk-0  usedbysnapshots       0B                        -
pool1/subvol-336-disk-0  usedbydataset         3.76G                     -
pool1/subvol-336-disk-0  usedbychildren        0B                        -
pool1/subvol-336-disk-0  usedbyrefreservation  0B                        -
pool1/subvol-336-disk-0  logbias               latency                   default
pool1/subvol-336-disk-0  dedup                 off                       default
pool1/subvol-336-disk-0  mlslabel              none                      default
pool1/subvol-336-disk-0  sync                  standard                  default
pool1/subvol-336-disk-0  dnodesize             legacy                    default
pool1/subvol-336-disk-0  refcompressratio      1.71x                     -
pool1/subvol-336-disk-0  written               3.76G                     -
pool1/subvol-336-disk-0  logicalused           6.43G                     -
pool1/subvol-336-disk-0  logicalreferenced     6.43G                     -
pool1/subvol-336-disk-0  volmode               default                   default
pool1/subvol-336-disk-0  filesystem_limit      none                      default
pool1/subvol-336-disk-0  snapshot_limit        none                      default
pool1/subvol-336-disk-0  filesystem_count      none                      default
pool1/subvol-336-disk-0  snapshot_count        none                      default
pool1/subvol-336-disk-0  snapdev               hidden                    default
pool1/subvol-336-disk-0  acltype               posixacl                  received
pool1/subvol-336-disk-0  context               none                      default
pool1/subvol-336-disk-0  fscontext             none                      default
pool1/subvol-336-disk-0  defcontext            none                      default
pool1/subvol-336-disk-0  rootcontext           none                      default
pool1/subvol-336-disk-0  relatime              off                       default
pool1/subvol-336-disk-0  redundant_metadata    all                       default
pool1/subvol-336-disk-0  overlay               off                       default
# keydb-cli info
# Server
redis_version:6.2.2
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:378dfe5fe410f83
redis_mode:standalone
os:Linux 4.15.18-30-pve x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:8.3.0
process_id:20715
process_supervised:no
run_id:78b218ea8e9bed96face2016e8014355a6e384ed
tcp_port:6379
server_time_usec:1649691797579483
uptime_in_seconds:7613
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:5524629
executable:/usr/bin/keydb-server
config_file:/etc/keydb/keydb.conf

# Clients
connected_clients:81
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:120
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
current_client_thread:4
thread_0_clients:20
thread_1_clients:20
thread_2_clients:20
thread_3_clients:20
thread_4_clients:1
thread_5_clients:0
thread_6_clients:0
thread_7_clients:0
thread_8_clients:0
thread_9_clients:0

# Memory
used_memory:681511512
used_memory_human:649.94M
used_memory_rss:763887616
used_memory_rss_human:728.50M
used_memory_peak:756577064
used_memory_peak_human:721.53M
used_memory_peak_perc:90.08%
used_memory_overhead:377523520
used_memory_startup:6475408
used_memory_dataset:303987992
used_memory_dataset_perc:45.03%
allocator_allocated:681959808
allocator_active:742588416
allocator_resident:782413824
total_system_memory:2160283832320
total_system_memory_human:1.96T
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.09
allocator_frag_bytes:60628608
allocator_rss_ratio:1.05
allocator_rss_bytes:39825408
rss_overhead_ratio:0.98
rss_overhead_bytes:-18526208
mem_fragmentation_ratio:1.12
mem_fragmentation_bytes:82543808
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:1641248
mem_aof_buffer:0
mem_allocator:jemalloc-5.2.1
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0

# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:889824
rdb_bgsave_in_progress:0
rdb_last_save_time:1649691787
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:17
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:388313088
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0

# Stats
total_connections_received:3699
total_commands_processed:379407833
instantaneous_ops_per_sec:44870
total_net_input_bytes:29492715297
total_net_output_bytes:1659930492
instantaneous_input_kbps:3414.79
instantaneous_output_kbps:226.61
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:121404234
expired_stale_perc:0.00
expired_time_cap_reached_count:46769
expire_cycle_cpu_milliseconds:0
evicted_keys:0
keyspace_hits:393002
keyspace_misses:126337161
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:36296
total_forks:101
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:1
dump_payload_sanitizations:0
total_reads_processed:379411457
total_writes_processed:379407834

# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:d21e55cdcbff978781e7c05406eeea1494fa647a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:10592.363177
used_cpu_user:6511.135160
used_cpu_sys_children:172.739708
used_cpu_user_children:1074.894260
server_threads:10
long_lock_waits:267804
used_cpu_sys_main_thread:2.146001
used_cpu_user_main_thread:91.540064

# Modules

# Errorstats
errorstat_ERR:count=1

# Cluster
cluster_enabled:0

# Keyspace
db0:keys=4932697,expires=4932697,avg_ttl=4
MalavanEQAlpha commented 2 years ago

Hi @rnz I could not reproduce this bug. Based on this error message you provided "Could not connect to Redis at /proc/20715: Operation not permitted", I believe it may be a permission issue, have you tried running "sudo redis-cli -s /proc/$(pgrep keydb-server)/root/tmp/keydb.sock info | grep ops"?

rnz commented 2 years ago

@MalavanEQAlpha may be load is not havy?

benschermel commented 2 years ago

@rnz have you tried looking at permissions per Malavan's suggestion under your specific load?

rnz commented 2 years ago

@MalavanEQAlpha, @benschermel i think that is not permission issue. That message "Could not connect to Redis at /proc/20715: Operation not permitted" appears beetwen loops the command INFO return result.