stashed / stash

🛅 Backup your Kubernetes Stateful Applications
https://stash.run

Backup/restore database (KubeDB integration) #388

Closed pierreozoux closed 5 years ago

pierreozoux commented 6 years ago

How do you back up such volumes? The volume itself is not really interesting; we need to dump the data first.

What about putting a label on such volumes and doing a dump instead?
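
For illustration only, such a marker might be a label on the claim that a controller could react to by running a dump instead of copying files; the label key and everything else below is hypothetical, not an existing Stash convention:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data                      # placeholder name
  labels:
    # hypothetical marker telling a backup controller to dump, not copy
    backup.example.com/dump: "postgres"
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi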

galexrt commented 6 years ago

Even though it is a pretty "dirty" hack, you can do something like this:

apiVersion: v1
data:
  run.sh: |+
    #!/bin/bash

    echo "*:$PG_PORT:*:$PG_USER:$PG_PASSWORD" > /root/.pgpass
    chmod 600 /root/.pgpass

    echo "Running loop ..."
    while true; do
        # Keep only the most recent dump in the emptyDir.
        rm -f /backups/*
        echo "Dumping database ..."
        /usr/bin/pg_dumpall \
            -U $PG_USER \
            -h $PG_HOST \
            -p $PG_PORT \
            --inserts \
            -f "/backups/$(date +"%s").sql"
        echo "Dump completed. Code: $?"
        echo "Sleeping 4h ..."
        sleep 4h
    done
kind: ConfigMap
metadata:
  labels:
    app: pgdump
  name: pgdump-cm
---
apiVersion: v1
data:
  PG_HOST: "your-postgres-host.com"
  PG_PORT: "5432"
  PG_USER: "postgres"
kind: ConfigMap
metadata:
  labels:
    app: pgdump
  name: pgdump-env
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgdump-sinfra
  labels:
    app: pgdump-sinfra
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pgdump-sinfra
  template:
    metadata:
      name: pgdump-sinfra
      labels:
        app: pgdump-sinfra
    spec:
      serviceAccountName: stash-sidecar
      volumes:
      - name: pgdump-backups
        emptyDir: {}
      - name: scripts
        projected:
          sources:
          - configMap:
              name: pgdump-cm
              items:
              - key: run.sh
                path: run.sh
                mode: 0755
      containers:
      - image: postgres:9.6.6
        name: backup
        envFrom:
        - configMapRef:
            name: pgdump-env
        - secretRef:
            # assumed to exist separately and to provide PG_PASSWORD
            name: pgdump-postgres-password
        volumeMounts:
        - name: pgdump-backups
          mountPath: /backups
        - name: scripts
          mountPath: /scripts
        command:
        - /scripts/run.sh

Using pghoard would also be a possibility; see https://github.com/ohmu/pghoard.

pierreozoux commented 6 years ago

Thanks @galexrt for your kind answer; maybe my question was not really clear. More than the implementation (there is also a PR open for mysql), I'm more interested in a systematic solution: how do we dump database data before the backup?

This is related to the question of how we do automatic backups. Since this repository solves that in an elegant manner, the next questions are: how do we do dumps before the backups, and how do we standardize the procedure?

This is highly hypothetical at this stage, but here is an example workflow:

Probably we'd need some more primitives to work on these topics :) Maybe a backup and a dump CRD to start with, then a backup controller, for which Stash could be one implementation. And finally, a dump/restore controller for mysql, shipped with the mysql chart.
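
To make that concrete, such a dump primitive might look roughly like the following; none of this exists today, and every group, kind, and field name here is invented purely for illustration:

apiVersion: backup.example.com/v1alpha1    # hypothetical API group
kind: Dump
metadata:
  name: wordpress-mysql-dump
spec:
  # The database workload to dump before the volume backup runs.
  targetRef:
    kind: StatefulSet
    name: wordpress-mysql
  # Commands a database chart could ship as defaults.
  dumpCommand: ["mysqldump", "--all-databases"]
  restoreCommand: ["mysql"]
  # Where the dump is written so a backup controller (e.g. Stash) can pick it up.
  outputPath: /backups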

All of this is of course highly hypothetical and would be nice in an ideal world!

But once we have that, we could replace mysql with postgres, mongo, mariadb, elasticsearch, and so on.

tamalsaha commented 6 years ago

@pierreozoux, for database backups we recommend using something like https://kubedb.com. Database backups usually require db-level tools for safe backup and restore. KubeDB has backup/restore support for common databases: https://kubedb.com/docs/0.8.0-beta.2/guides/mysql/snapshot/backup-and-restore/
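
For reference, a KubeDB Snapshot request from the 0.8.x line looked roughly like this; the field layout below is recalled from the linked docs and should be verified there, and the resource names are placeholders:

apiVersion: kubedb.com/v1alpha1
kind: Snapshot
metadata:
  name: mysql-snap-1
  labels:
    kubedb.com/kind: MySQL
spec:
  databaseName: my-mysql          # the MySQL object to snapshot
  storageSecretName: gcs-secret   # credentials for the storage backend
  gcs:
    bucket: my-backup-bucket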

Miouge1 commented 6 years ago

@pierreozoux I don't think that two volumes are needed. You can use spec.fileGroups to filter only the backup folder on your DB volume.

Also this can be solved with #351
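
For example, applied to the pgdump Deployment above, a v1alpha1 Restic resource could use spec.fileGroups to back up only the dump directory; the backend, schedule, and secret names here are placeholders:

apiVersion: stash.appscode.com/v1alpha1
kind: Restic
metadata:
  name: pgdump-backup
spec:
  selector:
    matchLabels:
      app: pgdump-sinfra            # targets the pgdump Deployment above
  fileGroups:
  - path: /backups                  # only the dump directory, not the whole volume
    retentionPolicyName: keep-last-5
  backend:
    local:
      mountPath: /safe/data
      hostPath:
        path: /data/stash-backup    # placeholder backend
    storageSecretName: stash-backend-secret
  schedule: '@every 6h'
  volumeMounts:
  - name: pgdump-backups            # the emptyDir that holds the dumps
    mountPath: /backups
  retentionPolicies:
  - name: keep-last-5
    keepLast: 5
    prune: true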

Miouge1 commented 6 years ago

As a feature proposal, Stash could pick up annotations like:

annotations:
  stash.appscode.com/backup-cmd-stdout: "mysqldump"
  stash.appscode.com/restore-cmd-stdin: "mysql"

That could trigger the mysqldump command (or whatever script you want) and back up its output (stdout). On the restore side, it would run the mysql command and pass the backup file to it on stdin.
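
For illustration, the proposed annotations would presumably sit on the workload's pod template, something like the sketch below; the annotation keys are the ones proposed above and do not exist in Stash today, and the rest of the manifest is a placeholder:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
      annotations:
        # proposed, not implemented: command whose stdout is backed up
        stash.appscode.com/backup-cmd-stdout: "mysqldump"
        # proposed, not implemented: command fed the backup file on stdin
        stash.appscode.com/restore-cmd-stdin: "mysql"
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-root      # placeholder secret
              key: password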

pierreozoux commented 6 years ago

@Miouge1 actually, AppsCode also has the KubeDB project, which does dumps and restores automatically.

It is just a matter of integrating the KubeDB dump features with Stash. I don't know yet what should be done, but for sure there is a better way than what we have now. Come to the KubeDB Slack channel if you have ideas on how we could improve the process!

Miouge1 commented 6 years ago

I plan to back up PVCs used for OpenLDAP, CockroachDB, Redis, and MySQL. AFAIK KubeDB covers only MySQL/Postgres/Redis, so a generic command mechanism would be really helpful.

tamalsaha commented 6 years ago

xref: https://github.com/kubedb/project/issues/168

giovannicandido commented 5 years ago

This would also solve Consul snapshots, not only databases. I like the idea of a central backup tool with a single place for management, monitoring, and alerts.

Stash could pipe the stream to restic and "tag" the backup with the date and time. When a restore is necessary, the restore command would receive the contents of a specific backup file on its stdin.
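
As a rough sketch of that piping idea, in the same style as the pgdump ConfigMap earlier in the thread: the restic flags used here (backup --stdin --stdin-filename, --tag, restic dump) are real, but the names, schedule, and the assumption that RESTIC_REPOSITORY/RESTIC_PASSWORD and the Postgres credentials are provided via the environment are mine:

apiVersion: v1
kind: ConfigMap
metadata:
  name: pgdump-restic-cm            # illustrative name
  labels:
    app: pgdump-restic
data:
  backup.sh: |
    #!/bin/bash
    # Stream the dump straight into the restic repository instead of
    # writing it to a volume first. Assumes RESTIC_REPOSITORY and
    # RESTIC_PASSWORD are set, and PGPASSWORD (or a .pgpass file) as above.
    pg_dumpall -U "$PG_USER" -h "$PG_HOST" -p "$PG_PORT" \
      | restic backup --stdin --stdin-filename pgdumpall.sql \
          --tag "pgdump-$(date +%F-%H%M)"
  restore.sh: |
    #!/bin/bash
    # Restore by piping the stored dump from a snapshot back into psql.
    restic dump latest pgdumpall.sql \
      | psql -U "$PG_USER" -h "$PG_HOST" -p "$PG_PORT" postgres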

tamalsaha commented 5 years ago

Design is finalized here: https://github.com/appscode/stash/issues/648

tamalsaha commented 5 years ago

Dup https://github.com/appscode/stash/issues/648