dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/

Snapshotting in Epoll blocks Dragonfly threads #3895

Closed boomballa closed 2 weeks ago

boomballa commented 2 weeks ago

Describe the bug

#2642 #2702

I saw that someone had mentioned a related issue before, and there was a related ticket, but it still seems to be an issue at this point.

To Reproduce

  1. Execute bgsave.
  2. All client commands are then blocked; for example, info and set do not return their results.
  3. The OK result is only returned once bgsave has finished (a timing sketch follows this list).
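
For reference, here is a minimal timing sketch of the same steps, assuming the redis-py client and the port 7379 from the report (the key name and value are arbitrary):

# Times a SET issued right after BGSAVE; on the affected setup the SET
# only returns once the snapshot has finished, so `elapsed` ends up in
# the tens of seconds instead of well under a millisecond.
import time
import redis

r = redis.Redis(host="localhost", port=7379)

r.execute_command("BGSAVE")   # returns OK immediately; the snapshot runs in the background

start = time.time()
r.set("dba", 1)               # same key as in the transcript below
elapsed = time.time() - start
print(f"SET returned after {elapsed:.2f}s")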

Expected behavior

  1. The bgsave process must not block client access, as this would interrupt business operations and may cause the monitoring system to misinterpret the state of the instance.
  2. Could it be done asynchronously? It looks like df doesn't support that yet.

Screenshots

> bgsave
OK
> set dba 1

(Here I hit Enter a couple of times to make the blocking interval visible)

OK
(14.42s)

> bgsave async
(error) ERR Unknown subcommand or wrong number of arguments for 'ASYNC'. Try SAVE HELP.

Environment (please complete the following information):

Additional context This situation is serious: it directly affects client access to any instance that has a backup schedule, because bgsave blocks all access, including metrics retrieval. I don't think this is reasonable in a production environment. As the amount of data grows, the blocking time will grow with it, affecting business data access for longer and longer. For a service that serves online business traffic, I think this is unacceptable, and I hope it can be fixed soon. Please let me know if I'm missing something or if I'm not doing it right, thank you.

romange commented 2 weeks ago

@boomballa Can you please run redis-cli -3 DEBUG OBJHIST and paste the output here?

boomballa commented 2 weeks ago

> @boomballa Can you please run redis-cli -3 DEBUG OBJHIST and paste the output here?

@romange Sure, no problem. The output message is as follows:

___begin object histogram___

OBJECT:string
________________________________________________________________
Key length histogram:
Count: 1864002 Average: 62.3986  StdDev: 401.35
Min: 3.0000  Median: 70.0000  Max: 70.0000
------------------------------------------------------
[       3,       4 ) 1   0.000%   0.000%
[       7,       8 ) 1   0.000%   0.000%
[      35,      40 ) 372972  20.009%  20.009% ####
[      60,      70 ) 111486   5.981%  25.990% #
[      70,      80 ) 1379542  74.010% 100.000% ###############

Value length histogram:
Count: 1864002 Average: 1322.8172  StdDev: 88451.13
Min: 0.0000  Median: 763.7133  Max: 24576.0000
------------------------------------------------------
[       0,       1 ) 2   0.000%   0.000%
[     350,     400 ) 5620   0.302%   0.302%
[     700,     800 ) 1453979  78.003%  78.305% ################
[    1400,    1600 ) 372901  20.005%  98.310% ####
[   20000,   25000 ) 31500   1.690% 100.000%

Cardinality histogram:
Count: 0 Average: 0.0000  StdDev: 0.00
Min: 0.0000  Median: 0.0000  Max: 0.0000
------------------------------------------------------

Entry length histogram:
Count: 0 Average: 0.0000  StdDev: 0.00
Min: 0.0000  Median: 0.0000  Max: 0.0000
------------------------------------------------------

___end object histogram___

romange commented 2 weeks ago

The values are not very big, so it should not be like this. How many threads do you use? Can you please paste the output of the info all command?

boomballa commented 2 weeks ago

Hi @romange, this is just a test environment for me, so there is not much data; I just used a script to load a bit of data, which currently looks to be around 3 GiB. I don't limit df's CPU usage with any parameter, and the physical machine currently has 40 cores.

The information for info all is as follows:

> info all
# Server
redis_version:6.2.11
dragonfly_version:df-v1.23.0
redis_mode:cluster
arch_bits:64
os:Linux 5.10.0-30-amd64 x86_64
multiplexing_api:epoll
tcp_port:7379
thread_count:40
uptime_in_seconds:112744
uptime_in_days:1

# Clients
connected_clients:82
max_clients:64000
client_read_buffer_bytes:22016
blocked_clients:0
pipeline_queue_length:0

# Memory
used_memory:3717324160
used_memory_human:3.46GiB
used_memory_peak:3717324160
used_memory_peak_human:3.46GiB
fibers_stack_vms:8122496
fibers_count:248
used_memory_rss:4296523776
used_memory_rss_human:4.00GiB
used_memory_peak_rss:4301910016
maxmemory:10737418240
maxmemory_human:10.00GiB
used_memory_lua:0
object_used_memory:3517223680
type_used_memory_string:3517223680
table_used_memory:195481600
num_buckets:307200
num_entries:2546802
inline_keys:2
listpack_blobs:0
listpack_bytes:0
small_string_bytes:144249472
pipeline_cache_bytes:0
dispatch_queue_bytes:0
dispatch_queue_subscriber_bytes:0
dispatch_queue_peak_bytes:5544
client_read_buffer_peak_bytes:54784
tls_bytes:5664
snapshot_serialization_bytes:0
cache_mode:cache
maxmemory_policy:eviction
replication_streaming_buffer_bytes:0
replication_full_sync_buffer_bytes:0

# Stats
total_connections_received:13271
total_commands_processed:5211731
instantaneous_ops_per_sec:40
total_pipelined_commands:682898
total_pipelined_squashed_commands:682865
pipelined_latency_usec:687240385
total_net_input_bytes:1091919453
connection_migrations:0
total_net_output_bytes:18427635
rdb_save_usec:131921485
rdb_save_count:347630
instantaneous_input_kbps:-1
instantaneous_output_kbps:-1
rejected_connections:-1
expired_keys:0
evicted_keys:0
hard_evictions:0
garbage_checked:7293
garbage_collected:0
bump_ups:10
stash_unloaded:0
oom_rejections:0
traverse_ttl_sec:31482
delete_ttl_sec:0
keyspace_hits:0
keyspace_misses:0
keyspace_mutations:3231488
total_reads_processed:4548359
total_writes_processed:79130
defrag_attempt_total:0
defrag_realloc_total:0
defrag_task_invocation_total:0
reply_count:79130
reply_latency_usec:0
blocked_on_interpreter:0
lua_interpreter_cnt:0
lua_blocked:0

# Tiered
tiered_entries:0
tiered_entries_bytes:0
tiered_total_stashes:0
tiered_total_fetches:0
tiered_total_cancels:0
tiered_total_deletes:0
tiered_total_uploads:0
tiered_total_stash_overflows:0
tiered_heap_buf_allocations:0
tiered_registered_buf_allocations:0
tiered_allocated_bytes:0
tiered_capacity_bytes:0
tiered_pending_read_cnt:0
tiered_pending_stash_cnt:0
tiered_small_bins_cnt:0
tiered_small_bins_entries_cnt:0
tiered_small_bins_filling_bytes:0
tiered_cold_storage_bytes:0
tiered_offloading_steps:0
tiered_offloading_stashes:0
tiered_ram_hits:0
tiered_ram_cool_hits:0
tiered_ram_misses:0

# Persistence
current_snapshot_perc:0
current_save_keys_processed:0
current_save_keys_total:0
last_success_save:1728462519
last_saved_file:/home/dba/dragonfly/dragonfly7379/dragonfly7379-dump-summary.dfs
last_success_save_duration_sec:19
loading:0
saving:0
current_save_duration_sec:0
rdb_string:1864002
rdb_changes_since_last_success_save:682925
last_failed_save:0
last_error:
last_failed_save_duration_sec:0

# Transaction
tx_shard_polls:360
tx_shard_optimistic_total:684687
tx_shard_ooo_total:0
tx_global_total:7
tx_normal_total:684687
tx_inline_runs_total:0
tx_schedule_cancel_total:0
tx_with_freq:684687,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7
tx_queue_len:0
eval_io_coordination_total:0
eval_shardlocal_coordination_total:0
eval_squashed_flushes:0
multi_squash_execution_total:6828
multi_squash_execution_hop_usec:3568325
multi_squash_execution_reply_usec:0

# Replication
role:master
connected_slaves:1
slave0:ip=xx.xx.xx.xx,port=6380,state=online,lag=0
master_replid:969a84fd104c72e6835e109d098337959107f377

# Commandstats
cmdstat_auth:calls=13271,usec=309022,usec_per_call=23.2855
cmdstat_bgsave:calls=7,usec=8523,usec_per_call=1217.57
cmdstat_client:calls=2,usec=18,usec_per_call=9
cmdstat_command:calls=7,usec=83561,usec_per_call=11937.3
cmdstat_config:calls=7556,usec=149375,usec_per_call=19.7691
cmdstat_debug:calls=1,usec=16885,usec_per_call=16885
cmdstat_dfly:calls=42,usec=12831,usec_per_call=305.5
cmdstat_hello:calls=1,usec=84,usec_per_call=84
cmdstat_info:calls=3607,usec=3662209,usec_per_call=1015.31
cmdstat_ping:calls=1,usec=7,usec_per_call=7
cmdstat_replconf:calls=4500662,usec=25199,usec_per_call=0.00559895
cmdstat_set:calls=684687,usec=17846986,usec_per_call=26.0659
cmdstat_slowlog:calls=1886,usec=555063,usec_per_call=294.307
unknown_BGSABE:1

# Modules
module:name=ReJSON,ver=20000,api=1,filters=0,usedby=[search],using=[],options=[handle-io-errors]
module:name=search,ver=20000,api=1,filters=0,usedby=[],using=[ReJSON],options=[handle-io-errors]

# Search
search_memory:0
search_num_indices:0
search_num_entries:0

# Errorstats
-LOADING Dragonfly is loading the dataset in memory:3
unknown_cmd:1
syntax_error:2

# Keyspace
db0:keys=2546802,expires=682800,avg_ttl=-1

# Cpu
used_cpu_sys:1420.246711
used_cpu_user:2209.15216
used_cpu_sys_children:0.0
used_cpu_user_children:0.0
used_cpu_sys_main_thread:34.131021
used_cpu_user_main_thread:54.546016

# Cluster
cluster_enabled:1

romange commented 2 weeks ago

Is this path /home/dba/dragonfly/dragonfly7379/ mounted on an SSD disk?

Is there a chance you could update your kernel to have iouring? I.e., use Debian 12 or Ubuntu 24.04, for example?

We have a totally different codepath when writing files with epoll; maybe this is related.
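
A quick way to confirm which backend a running instance reports (a sketch, assuming redis-py; the field is the same multiplexing_api shown in the INFO output above):

# Prints the I/O backend the server reports; the instance in this report
# shows "epoll", i.e. the non-io_uring file-writing codepath.
import redis

r = redis.Redis(host="localhost", port=7379)
server_info = r.info("server")
print(server_info.get("multiplexing_api"))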

boomballa commented 2 weeks ago

> Is there a chance you could update your kernel to have iouring?

/home/dba/dragonfly/dragonfly7379/ is currently not on an SSD but on a SAS disk, with a write capacity of roughly 30 to 50 MiB/s. Since this is a test environment, there was no intentional replacement of hardware; it was just a test of functionality.

> I.e., use Debian 12 or Ubuntu 24.04, for example?

Basically not in the short term, because every time the OS is replaced, many systems within the company have to be changed, which requires many people to cooperate in adapting those systems. Generally we only upgrade when a version needs stabilizing, or when an urgent upgrade is forced on us by security vulnerabilities or even bugs.

romange commented 2 weeks ago

Yeah, 30-50 MB/s is the reason; the server is blocked on I/O. Were you sending write traffic during the BGSAVE?

boomballa commented 2 weeks ago

> Yeah, 30-50 MB/s is the reason; the server is blocked on I/O. Were you sending write traffic during the BGSAVE?

@romange This is just a test environment; no traffic is actually written during bgsave. But in a normal production environment, write traffic is of course constant, and a full bgsave is needed to support data synchronization when building a slave, or whenever a slave goes down and needs to be rebuilt. From what you are saying, if the data directory I bgsave to is an SSD, does this problem not happen? A normal SSD RAID 10 should sustain 400-500 MB/s of writes, so would that solve the I/O blocking problem, or just shorten the blocking time? My understanding is that if the amount of data increases, say to 500 GB, then even with sufficient I/O capacity the blocking behavior of bgsave would still cause this problem. I'm not sure if I'm understanding this correctly, and would love to hear your insights.
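
To sanity-check whether the snapshot directory's sequential write rate really sits in the 30-50 MiB/s range, a rough measurement like this could help (a sketch only; the file name and sizes are placeholders, and it writes 512 MiB of throwaway data):

# Writes a temporary file with fsync and reports the sequential write
# throughput of the directory that receives the .dfs snapshot files.
import os
import time

path = "/home/dba/dragonfly/dragonfly7379/throughput_probe.tmp"  # hypothetical file name
chunk = b"\0" * (4 * 1024 * 1024)   # 4 MiB per write
total_mib = 512

start = time.time()
with open(path, "wb") as f:
    for _ in range(total_mib // 4):
        f.write(chunk)
    f.flush()
    os.fsync(f.fileno())            # make sure the data actually hits the disk
elapsed = time.time() - start

print(f"~{total_mib / elapsed:.0f} MiB/s sequential write")
os.remove(path)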

romange commented 2 weeks ago

@boomballa I would like to reproduce this scenario. Dragonfly should not block for 10 seconds in any case. Can you please DM me on Discord?

boomballa commented 2 weeks ago

> @boomballa I would like to reproduce this scenario. Dragonfly should not block for 10 seconds in any case. Can you please DM me on Discord?

OK 👌