Closed tamalsaha closed 7 years ago
Does restic have restic.conf?
@sanmai, I meant restic.conf as a "placeholder". We need some way to take the config from Kubernetes and forward it to the restic cli.
There are a number of open issues regarding the TPR spec:
Should we use a single TPR for all 4 parts of the information?
Do we move the destination (restic repository info) to its own object so that multiple installations of an application in a cluster can share the rest?
Since we are associating the restic TPR with RCs, etc. via label, an N:1 association is possible between RCs and a restic TPR. Multiple applications should not use the same restic TPR, unless they are all part of the same Deployment. Do we use something like OwnerRef to detect these types of cases?
Will it be possible to back up different paths of a persistent volume on different schedules?
retention_policy:
  strategy: KEEP_HOURLY
  snapshot_count: 5
  retain_hostname: myhost
  retain_tag: ["abc", "xyz"]
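As a rough sketch of how such a retention policy could be forwarded to the restic cli, the snippet below turns the policy into arguments for `restic forget`. It assumes the strategy value is meant to be `KEEP_HOURLY` and corresponds to restic's `--keep-hourly` flag; the other strategy names in the table are hypothetical and not part of this proposal. `retain_hostname` is left unmapped here because its intended semantics are not settled in this thread.

```python
# Hypothetical mapping from strategy names in the TPR to `restic forget` flags.
# Only KEEP_HOURLY appears in the proposal; the other names are assumptions.
STRATEGY_FLAGS = {
    "KEEP_LAST": "--keep-last",
    "KEEP_HOURLY": "--keep-hourly",
    "KEEP_DAILY": "--keep-daily",
    "KEEP_WEEKLY": "--keep-weekly",
    "KEEP_MONTHLY": "--keep-monthly",
    "KEEP_YEARLY": "--keep-yearly",
}


def forget_args(policy):
    """Build the argument list for `restic forget` from a retention_policy dict."""
    flag = STRATEGY_FLAGS[policy["strategy"]]
    args = ["forget", flag, str(policy["snapshot_count"])]
    # Snapshots carrying any of these tags are always kept.
    for tag in policy.get("retain_tag", []):
        args += ["--keep-tag", tag]
    return args


policy = {
    "strategy": "KEEP_HOURLY",
    "snapshot_count": 5,
    "retain_hostname": "myhost",
    "retain_tag": ["abc", "xyz"],
}
print(forget_args(policy))
```

Keeping the mapping in one table means an unknown strategy fails loudly with a `KeyError` instead of silently running `forget` with no keep rule.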
Update operation will work like below:
restik controller. restik will add that as an annotation to the TPR when sidecar containers are added. If these annotations are modified, the RC sidecar will be modified. Later, we can add a reconcile flag to the TPR controller to force an upgrade for all TPRs (this is probably unsafe), as long as restic is backward compatible.
@sadlil @saumanbiswas we need to discuss the secret format. The secret will have the following things:
We should also decide whether to mount the secret / set ENV vars (which seems to be the format restic likes) vs. reading via the kube api in the sidecar container, though go exec processes don't see the env variables in the parent process. Mount / ENV variables will mean that we can only use secrets from the same namespace. In that case we can call it repositorySecret instead of repositorySecretName.
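To make the mount option more concrete, here is a minimal sketch of a sidecar turning a mounted secret into an explicit environment for the restic child process. It assumes the secret keys are named after the environment variables restic reads (such as `RESTIC_PASSWORD`); the mount directory is simulated with a temporary directory, and the function name is made up for illustration.

```python
import os
import tempfile


def env_from_secret_dir(secret_dir, base_env=None):
    """Return a copy of base_env with one variable per key file in secret_dir."""
    env = dict(os.environ if base_env is None else base_env)
    for name in sorted(os.listdir(secret_dir)):
        # Secret volumes contain bookkeeping entries that start with a dot.
        if name.startswith("."):
            continue
        path = os.path.join(secret_dir, name)
        if os.path.isfile(path):
            with open(path) as f:
                env[name] = f.read().strip()
    return env


# Simulate a mounted secret with a single key.
with tempfile.TemporaryDirectory() as secret_dir:
    with open(os.path.join(secret_dir, "RESTIC_PASSWORD"), "w") as f:
        f.write("s3cr3t\n")
    child_env = env_from_secret_dir(secret_dir, base_env={"PATH": "/usr/bin"})
    print(sorted(child_env))
    # The sidecar could then pass the environment explicitly, for example:
    # subprocess.run(["restic", "snapshots"], env=child_env, check=True)
```

Passing the environment explicitly to the child keeps the question of what the restic process can see independent of how the parent process was started.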
It's implemented now!
Motivation:
If you are running production workloads in Kubernetes, you want to take backups of your disks. Traditional tools like Bacula are too complex to set up and maintain manually in a dynamic compute environment like Kubernetes. So, we plan to implement a TPR controller based on restic to address these issues.
Goals:
Non-goals:
Why Restic
Design:
Taking Backups:
To take a backup, 4 pieces of information are required:
This is how a TPR controller can implement backups:
Filtering for a custom controller (one running its own, singular list/watch) will reduce deserialization costs. So, the restik TPR controller will watch for RCs, etc. with the label storage.appscode.com/backup: true.
I think the label based option has some advantages:
Once the TPR controller finds an RC, etc. that has enabled backup, it will add a sidecar container with restic. Currently there is no way to add a sidecar container to a running pod. So, restik will restart the pods the first time. This is not ideal, but this is the best we can do at this time.
If the RC and TPR association is later removed, the TPR controller will also remove the sidecar container.
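The add/remove behaviour described above can be sketched as a single reconcile step over a plain dict representation of an RC, Deployment, or similar object. This is only an illustration: the sidecar name and image are hypothetical, and a real controller would work with typed Kubernetes objects and then apply the update through the API.

```python
import copy

BACKUP_LABEL = "storage.appscode.com/backup"

# Hypothetical sidecar definition; name and image are placeholders.
SIDECAR = {"name": "restic-sidecar", "image": "example/restik:latest"}


def reconcile_sidecar(obj, sidecar=SIDECAR):
    """Return a copy of obj whose pod template has the sidecar only if backup is enabled."""
    out = copy.deepcopy(obj)
    labels = out.get("metadata", {}).get("labels", {})
    pod_spec = out["spec"]["template"]["spec"]
    # Drop any existing sidecar first so the step is idempotent.
    containers = [c for c in pod_spec["containers"] if c["name"] != sidecar["name"]]
    if labels.get(BACKUP_LABEL) == "true":
        containers.append(copy.deepcopy(sidecar))
    pod_spec["containers"] = containers
    return out


rc = {
    "kind": "ReplicationController",
    "metadata": {"name": "web", "labels": {BACKUP_LABEL: "true"}},
    "spec": {"template": {"spec": {"containers": [{"name": "app", "image": "nginx"}]}}},
}

with_sidecar = reconcile_sidecar(rc)
print([c["name"] for c in with_sidecar["spec"]["template"]["spec"]["containers"]])

# Removing the label and reconciling again removes the sidecar.
del with_sidecar["metadata"]["labels"][BACKUP_LABEL]
without_sidecar = reconcile_sidecar(with_sidecar)
print([c["name"] for c in without_sidecar["spec"]["template"]["spec"]["containers"]])
```

Because the function always rebuilds the container list from scratch, running it repeatedly on the same object does not add duplicate sidecars.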
Scheduling backups:
I think we can just use cron to run backups on a schedule inside the docker container. This sounds simple enough.
Entrypoint:
Since the restic process will be run on a schedule, some process will need to be running as the entrypoint. This will be a kloader type process that watches the restic TPR and translates that into a restic compatible config.
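One possible shape for that translation, as a sketch only: the entrypoint reads a few fields from the TPR and produces the restic command line, the environment restic expects, and a crontab line for the schedule. The field names `repository`, `path`, `tags`, and `schedule` are hypothetical; `RESTIC_REPOSITORY`, `RESTIC_PASSWORD`, and `restic backup --tag` are existing restic features.

```python
import shlex


def restic_invocation(spec, password):
    """Translate a (hypothetical) TPR spec dict into argv and env for `restic backup`."""
    env = {
        "RESTIC_REPOSITORY": spec["repository"],
        "RESTIC_PASSWORD": password,
    }
    argv = ["restic", "backup", spec["path"]]
    for tag in spec.get("tags", []):
        argv += ["--tag", tag]
    return argv, env


def cron_line(schedule, argv):
    """Render a crontab entry that runs argv on the given five-field schedule."""
    return f"{schedule} {shlex.join(argv)}"


spec = {
    "repository": "s3:s3.amazonaws.com/example-bucket",  # placeholder destination
    "path": "/source/data",
    "tags": ["nightly"],
    "schedule": "0 * * * *",
}
argv, env = restic_invocation(spec, password="s3cr3t")
print(cron_line(spec["schedule"], argv))
```

Whenever the watched TPR changes, the entrypoint would regenerate the crontab entry from the new spec.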
Essentially, we can create a new restik (with k) cli with 3 commands:
Supported Kubernetes Objects:
We should support any Kubernetes object types that have PodTemplate in their schema. These are:
Since new sidecar containers can't be added once a Pod is created, Pod types will not be supported.
Backup Nodes
Users might be interested in taking backups of host paths. This can be done by deploying a DaemonSet with a do-nothing busybox container. The restic TPR controller can use that as a vessel for running restic sidecar containers.
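A sketch of what such a vessel DaemonSet might look like, written as a plain manifest dict. The object name, mount path, and `app` label are made up for illustration, and `apps/v1` is an assumption about the API version in use; only the backup label comes from this proposal.

```python
def host_backup_daemonset(name, host_path):
    """Build a DaemonSet manifest with an idle busybox container and a hostPath volume."""
    return {
        "apiVersion": "apps/v1",  # assumption; depends on the cluster version
        "kind": "DaemonSet",
        "metadata": {
            "name": name,
            # Label that the TPR controller watches for.
            "labels": {"storage.appscode.com/backup": "true"},
        },
        "spec": {
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [
                        {
                            "name": "busybox",
                            "image": "busybox",
                            # Do nothing forever; the sidecar does the real work.
                            "command": ["sh", "-c", "while true; do sleep 3600; done"],
                            "volumeMounts": [
                                {"name": "host", "mountPath": "/host", "readOnly": True}
                            ],
                        }
                    ],
                    "volumes": [{"name": "host", "hostPath": {"path": host_path}}],
                },
            },
        },
    }


manifest = host_backup_daemonset("node-backup", "/var/lib/data")
print(manifest["spec"]["template"]["spec"]["volumes"])
```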
Restarting pods
As mentioned before, the first time sidecar containers are added, pods will be restarted by the controller. Who performs the restart will be decided on a case-by-case basis. For example, Kubernetes itself will restart pods behind a deployment. In such cases, the TPR controller will let Kubernetes do that.
Implementers should be aware of the controller-ref concept to properly identify the pods managed by an RC, etc.
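For illustration, the controller-ref check can be sketched over plain dicts: a pod is managed by an object when one of its `metadata.ownerReferences` entries has `controller: true` and a `uid` matching that object. The helper names below are hypothetical.

```python
def controller_uid(pod):
    """Return the uid of the pod's controlling owner, or None if it has none."""
    for ref in pod.get("metadata", {}).get("ownerReferences", []):
        if ref.get("controller"):
            return ref["uid"]
    return None


def pods_managed_by(owner, pods):
    """Select the pods whose controller reference points at owner."""
    uid = owner["metadata"]["uid"]
    return [p for p in pods if controller_uid(p) == uid]


rc = {"kind": "ReplicationController", "metadata": {"name": "web", "uid": "uid-1"}}
pods = [
    {"metadata": {"name": "web-a", "ownerReferences": [{"uid": "uid-1", "controller": True}]}},
    {"metadata": {"name": "web-b", "ownerReferences": [{"uid": "uid-2", "controller": True}]}},
    # Same labels as the RC's pods could match, but there is no controller reference.
    {"metadata": {"name": "orphan"}},
]
print([p["metadata"]["name"] for p in pods_managed_by(rc, pods)])
```

Matching on the controller reference rather than on labels alone avoids restarting pods that merely share labels with the RC.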
This situation can be improved once Kubernetes adds support for extensible admission controllers. Restic TPRC can become an admission controller and modify the PodTemplate accordingly. This will avoid the first restart issue. This will also allow us to support Pods directly. See tracking bug #5.
Repo initialization
For the initial implementation, users will be expected to initialize the repository. We can revisit this later. See tracking bug #6.
HTTP api
An HTTP api server can be implemented to expose read-only data for repositories. This can be used to build web dashboards. This is outside the scope of the initial implementation. Tracking bug #7.
Recovery process
The recovery process will be left to users for now. Later we can think about automation.
CLI
A cli can be created that can simplify repo initialization and the recovery process for a Kubernetes deployment. This is outside the scope of the initial implementation. Also, note that we are building a cli called osm for managing various cloud bucket operations. Tracking bug #8.