projectsyn / component-cluster-backup

This is a Commodore Component for managing backups of a cluster
BSD 3-Clause "New" or "Revised" License

Make k8up CRD resources configurable #14

Open zugao opened 3 years ago

zugao commented 3 years ago

Context

Some clusters require more memory to run cluster backups. Here's an example of a backup running out of memory (exit code 137):

E0805 10:04:12.944025 1 pod_exec.go:76] wrestic/k8sExec "msg"="streaming data failed" "error"="command terminated with exit code 137" "namespace"="syn-cluster-backup" "pod"="object-dumper-847b96f5bb-bpc6c"

Alternatives

We could somehow force the backup pod to use less memory, but making the resources configurable would be a better solution.

simu commented 3 years ago

The prebackuppod is created without resource requests or limits (cf. https://github.com/projectsyn/component-backup-k8up/blob/ef37339b1267ed466b7ac90dd1f7fbec57e38c52/lib/backup-k8up.libjsonnet#L197-L228). Exit code 137 can be a sign of the process running out of memory, but it doesn't have to be: 137 is just the exit code reported when a process is terminated with SIGKILL. So if the object dumper is terminated with exit code 137, that isn't caused by the requests or limits of the prebackuppod, but by something else (e.g. the node running out of memory).
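
To illustrate how exit code 137 maps to SIGKILL, here is a minimal Python sketch of the common convention that exit codes above 128 mean "terminated by signal (code - 128)". The helper name `describe_exit_code` is hypothetical and not part of k8up or this component. The sketch assumes a POSIX system where SIGKILL is signal 9.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain an exit code using the 128 + signal-number convention.

    Hypothetical helper for illustration only.
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # The number does not correspond to a known signal here.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```

The exit code only identifies the signal, not who sent it or why. Whether the kill came from a memory limit, the node's OOM killer, or something else has to be determined from other evidence, such as the container's termination reason or the node's kernel log.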

Without an example to investigate, it's unlikely that we can identify the cause. At the moment, I suspect that the object dumper tool uses too much memory to perform the by-namespace splitting of the JSON files for the backup.