OhadArzouan closed this issue 1 year ago.
I'm not aware of this issue or what might cause it. I'll need more specifics but I'm not even sure what to ask for.
I just reviewed the v1.5.1..v1.5.2 diff and don't see anything which would cause this. You don't say what you upgraded from.
Paste the contents of the /debug page, for a start.
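If you want to grab it non-interactively, something like this should work (a minimal sketch, assuming the Web UI is on its default port 7420 with no password configured):

```go
// Fetch the Faktory Web UI /debug page and print it.
// Adjust the host/port if your Web UI is bound elsewhere.
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	resp, err := http.Get("http://localhost:7420/debug")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```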
Sorry for the late response. We upgraded from 1.4.2. We are running Faktory on ECS with Fargate. As for the /debug info:
Redis Info
# Server
redis_version:6.0.14
redis_git_sha1:ecf4164e
redis_git_dirty:0
redis_build_id:bd2e1423b53357c
redis_mode:standalone
os:Linux 4.14.238-182.422.amzn2.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:10.2.1
process_id:27
run_id:fd849b01dea162356b305e752b51fbf0b93f52c1
tcp_port:0
uptime_in_seconds:99011
uptime_in_days:1
hz:10
configured_hz:10
lru_clock:2316464
executable:/usr/bin/redis-server
config_file:/tmp/redis.conf
io_threads_active:0
# Clients
connected_clients:251
client_recent_max_input_buffer:32774
client_recent_max_output_buffer:16424
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
# Memory
used_memory:423373026
used_memory_human:403.76M
used_memory_rss:578756608
used_memory_rss_human:551.95M
used_memory_peak:962746814
used_memory_peak_human:918.15M
used_memory_peak_perc:43.98%
used_memory_overhead:37320386
used_memory_startup:779576
used_memory_dataset:386052640
used_memory_dataset_perc:91.35%
allocator_allocated:423621248
allocator_active:578690048
allocator_resident:578690048
total_system_memory:32143994880
total_system_memory_human:29.94G
used_memory_lua:79872
used_memory_lua_human:78.00K
used_memory_scripts:2844
used_memory_scripts_human:2.78K
number_of_cached_scripts:4
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.37
allocator_frag_bytes:155068800
allocator_rss_ratio:1.00
allocator_rss_bytes:0
rss_overhead_ratio:1.00
rss_overhead_bytes:66560
mem_fragmentation_ratio:1.37
mem_fragmentation_bytes:155135360
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:4359374
mem_aof_buffer:0
mem_allocator:libc
active_defrag_running:0
lazyfree_pending_objects:0
# Persistence
loading:0
rdb_changes_since_last_save:161052
rdb_bgsave_in_progress:1
rdb_last_save_time:1629706377
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:19
rdb_current_bgsave_time_sec:8
rdb_last_cow_size:143691776
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:13854
total_commands_processed:3094324895
instantaneous_ops_per_sec:41481
total_net_input_bytes:1623262490429
total_net_output_bytes:109466234743
instantaneous_input_kbps:21848.88
instantaneous_output_kbps:1141.99
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:8305702
expired_stale_perc:12.43
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:49757
evicted_keys:0
keyspace_hits:102473087
keyspace_misses:1250248966
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:9353
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_reads_processed:1463976446
total_writes_processed:1462719079
io_threaded_reads_processed:0
io_threaded_writes_processed:0
# Replication
role:master
connected_slaves:0
master_replid:759c135676e96aa4afebea6686f758efa1e71f9f
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:22453.085994
used_cpu_user:48718.990022
used_cpu_sys_children:1357.486476
used_cpu_user_children:9929.667411
# Modules
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=240894,expires=240234,avg_ttl=126454244
Disk Usage
> df -h
Filesystem Size Used Available Use% Mounted on
overlay 29.4G 10.3G 17.5G 37% /
tmpfs 64.0M 0 64.0M 0% /dev
shm 15.0G 0 15.0G 0% /dev/shm
tmpfs 15.0G 0 15.0G 0% /sys/fs/cgroup
127.0.0.1:/ 8.0E 183.0M 8.0E 0% /var/lib/faktory
/dev/xvdcz 29.4G 10.3G 17.5G 37% /etc/hosts
/dev/xvdcz 29.4G 10.3G 17.5G 37% /etc/resolv.conf
/dev/xvdcz 29.4G 10.3G 17.5G 37% /etc/hostname
/dev/xvda1 4.9G 1.7G 3.1G 35% /managed-agents/execute-command
tmpfs 15.0G 0 15.0G 0% /proc/acpi
tmpfs 64.0M 0 64.0M 0% /proc/kcore
tmpfs 64.0M 0 64.0M 0% /proc/keys
tmpfs 64.0M 0 64.0M 0% /proc/latency_stats
tmpfs 64.0M 0 64.0M 0% /proc/timer_list
tmpfs 64.0M 0 64.0M 0% /proc/sched_debug
tmpfs 15.0G 0 15.0G 0% /sys/firmware
tmpfs 15.0G 0 15.0G 0% /proc/scsi
The only unusual metric I see is this:
instantaneous_input_kbps:21848.88
instantaneous_output_kbps:1141.99
That's a lot of input (~21 MB/sec) and not a lot of output (~1 MB/sec). I have to wonder why the network is so busy.
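Redis reports instantaneous_input_kbps and instantaneous_output_kbps in kilobytes per second, so the conversion is just kbps / 1024. A quick sketch of pulling those two fields out of the INFO dump (assuming the paste above is saved to a local info.txt):

```go
// Extract the two throughput fields from a saved Redis INFO dump
// and convert them to MB/sec (Redis reports them in KB/sec).
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

func main() {
	f, err := os.Open("info.txt") // the INFO output pasted above, saved locally
	if err != nil {
		panic(err)
	}
	defer f.Close()

	fields := []string{"instantaneous_input_kbps", "instantaneous_output_kbps"}
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		for _, field := range fields {
			if strings.HasPrefix(line, field+":") {
				// Sketch-level error handling: an unparsable value just prints as 0.
				kbps, _ := strconv.ParseFloat(strings.TrimPrefix(line, field+":"), 64)
				fmt.Printf("%s = %.1f MB/sec\n", field, kbps/1024)
			}
		}
	}
}
```

For the numbers above, that works out to roughly 21.3 MB/sec in versus 1.1 MB/sec out.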
I've seen this bug too, using the latest version of faktory-ent. It causes the container to shut down, and ECS spawns a new one, over and over. The setup is on Fargate, with Faktory running as a sidecar container in a Task whose main container is the API; both the API and Faktory containers share the EFS path /var/lib/faktory. Any insight into the cause of this?
Faktory server: 1.5.2

Are you using an old version? No
Have you checked the changelogs to see if your issue has been fixed in a later version?
https://github.com/contribsys/faktory/blob/master/Changes.md
https://github.com/contribsys/faktory/blob/master/Pro-Changes.md
https://github.com/contribsys/faktory/blob/master/Ent-Changes.md
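For a crash loop like that, one thing worth checking is whether the server is actually reachable from inside the task before ECS recycles it. A minimal liveness-probe sketch, assuming the Go client package github.com/contribsys/faktory/client and its Open()/Info() calls (verify the API against the client version you're running):

```go
// Probe the Faktory server: open a connection and issue INFO.
// Exit non-zero if either step fails, so a health check can act on it.
package main

import (
	"fmt"
	"os"

	faktory "github.com/contribsys/faktory/client"
)

func main() {
	// Open() typically honors FAKTORY_URL and defaults to localhost:7419.
	cl, err := faktory.Open()
	if err != nil {
		fmt.Fprintln(os.Stderr, "faktory unreachable:", err)
		os.Exit(1)
	}
	defer cl.Close()

	if _, err := cl.Info(); err != nil {
		fmt.Fprintln(os.Stderr, "INFO failed:", err)
		os.Exit(1)
	}
	fmt.Println("faktory alive")
}
```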