air3ijai opened this issue 1 year ago
We hit this issue today on the Dev environment, following a number of problems we experienced yesterday in the Kubernetes cluster.
Dumps
drwxr-xr-x. 2 root root 102 Jan 11 15:38 .
drwxr-xr-x. 1 root root 56 Jan 10 14:53 ..
-rw-r--r--. 1 root root 20553313533 Jan 10 07:13 dump.rdb
-rw-r--r--. 1 root root 2190929920 Jan 10 07:43 temp--1701050988.1.rdb
-rw-r--r--. 1 root root 1806061732 Jan 11 15:38 temp-324797-0.rdb
-rw-r--r--. 1 root root 2700424307 Jan 10 07:43 temp-652292-0.rdb
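The leftover `temp-*.rdb` files in the listing add up to several gigabytes on top of the 20 GB `dump.rdb`. A quick way to total them is shown below (the `/data` path is an assumption about where the listing above was taken):

```shell
# Total the size of leftover temporary RDB files in the data directory
# (the /data path is an assumption based on the listing above).
du -ch /data/temp-*.rdb 2>/dev/null | tail -n 1
```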
Save error loop
1:319:S 11 Jan 2023 15:35:47.933 * Replica 192.168.10.10:6379 asks for synchronization
1:319:S 11 Jan 2023 15:35:47.933 * Full resync requested by replica 192.168.10.10:6379
1:319:S 11 Jan 2023 15:35:47.933 * Starting BGSAVE for SYNC with target: disk
1:319:S 11 Jan 2023 15:35:48.105 * Background saving started by pid 324179
1:319:S 11 Jan 2023 15:35:48.105 * Background saving started
324179:319:C 11 Jan 2023 15:38:35.454 # Write error saving DB on disk: No space left on device
1:319:S 11 Jan 2023 15:38:36.601 # Background saving error
1:319:S 11 Jan 2023 15:38:36.601 # SYNC failed. BGSAVE child returned an error
1:319:S 11 Jan 2023 15:38:36.601 # Connection with replica 192.168.10.10:6379 lost.
1:319:S 11 Jan 2023 15:38:36.783 * Replica 192.168.10.10:6379 asks for synchronization
1:319:S 11 Jan 2023 15:38:36.783 * Full resync requested by replica 192.168.10.10:6379
1:319:S 11 Jan 2023 15:38:36.783 * Starting BGSAVE for SYNC with target: disk
1:319:S 11 Jan 2023 15:38:36.956 * Background saving started by pid 324797
1:319:S 11 Jan 2023 15:38:36.956 * Background saving started
324797:319:C 11 Jan 2023 15:41:19.609 # Write error saving DB on disk: No space left on device
1:319:S 11 Jan 2023 15:41:20.887 # Background saving error
1:319:S 11 Jan 2023 15:41:20.887 # SYNC failed. BGSAVE child returned an error
1:319:S 11 Jan 2023 15:41:20.887 # Connection with replica 192.168.10.10:6379 lost.
Hello,
We just ran a test of how the Pod handles multiple restarts during backups. Doing this in a loop, we can run out of disk space; it is certainly a corner case.
The current value of cleanupTempfiles.minutes is 60 minutes, so it will not delete temporary files left by crashes that happened just a few minutes earlier. What is the main reason for such a large value?
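For context, a time-based cleanup like `cleanupTempfiles.minutes` can be approximated with `find`'s `-mmin` filter. The sketch below is illustrative only, not the chart's actual implementation; the `DATA_DIR` path and the 60-minute default are assumptions. It shows why files from a crash a few minutes ago survive:

```shell
#!/bin/sh
# Sketch of a time-based temp-file cleanup in the spirit of
# cleanupTempfiles.minutes; DATA_DIR and the threshold are assumptions.
DATA_DIR="${DATA_DIR:-/data}"
MINUTES="${MINUTES:-60}"

# Delete only temp RDB files last modified more than $MINUTES ago;
# a file left by a crash a few minutes ago does not match and survives.
if [ -d "$DATA_DIR" ]; then
  find "$DATA_DIR" -maxdepth 1 -name 'temp-*.rdb' -mmin "+$MINUTES" -delete
fi
```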
For the Bitnami Redis chart we use the following:
This way, all temporary files are deleted right before Redis starts.
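An unconditional pre-start cleanup could look like the sketch below; the data directory path and the hook placement (e.g. an init container or a pre-exec command) are assumptions, not the exact configuration referenced above:

```shell
#!/bin/sh
# Illustrative unconditional cleanup run right before redis-server starts:
# remove ALL leftover temp RDB files regardless of age.
# The DATA_DIR path is an assumption.
DATA_DIR="${DATA_DIR:-/data}"
rm -f "$DATA_DIR"/temp-*.rdb
```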