pravega / zookeeper-operator

Kubernetes Operator for Zookeeper
Apache License 2.0
366 stars, 203 forks

[Enhancement] Implement automatic backup #408

Open ibumarskov opened 2 years ago

ibumarskov commented 2 years ago

Description

Automatic backup of the ZooKeeper database is in demand in many real-world scenarios. It would be great to implement a separate controller responsible for scheduled and on-demand backups. As the simplest implementation, we can back up ZooKeeper by copying its transaction logs and snapshots. This is a reasonable method; there is a short article on the backup/restore procedure for ZooKeeper: Apache ZooKeeper Backup, a Treatise

Importance

Should-have

Location

Zookeeper operator.

Suggestions for an improvement

A separate controller can be added to watch for a new type of resource (zookeeper-backup, for example). The controller should take on the following responsibilities:

The backup script should contain the following steps:

  1. Optionally run some additional checks (for example, check that the script really landed on the pod with the leader, that the datadir folder exists, that there is enough free storage space, etc.)
  2. Create an archive of the datadir folder and put it in a backup folder (this should be a volume intended for storing backups)
  3. Check the number of backups and delete the oldest if necessary (the limit should be specified in the zookeeper-backup CR)
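
The steps above could be sketched roughly as follows. This is only a minimal illustration, not the operator's implementation: the function name `run_backup`, its parameters, and the `zk-backup-` file prefix are all hypothetical, and the leader check from step 1 is left out.

```python
import shutil
import tarfile
import time
from pathlib import Path


def run_backup(data_dir: str, backup_dir: str, max_backups: int = 5,
               min_free_bytes: int = 0) -> Path:
    """Archive data_dir into backup_dir, then prune the oldest archives.

    All names and defaults here are illustrative assumptions.
    """
    if max_backups < 1:
        raise ValueError("max_backups must be at least 1")
    src, dst = Path(data_dir), Path(backup_dir)

    # Step 1: basic checks (datadir exists, enough free space on the backup volume).
    if not src.is_dir():
        raise FileNotFoundError(f"data dir not found: {src}")
    dst.mkdir(parents=True, exist_ok=True)
    if shutil.disk_usage(dst).free < min_free_bytes:
        raise OSError(f"not enough free space in {dst}")

    # Step 2: create a compressed archive of the datadir in the backup folder.
    # The UTC timestamp plus nanosecond counter keeps names unique and sortable.
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime())
    archive = dst / f"zk-backup-{stamp}-{time.time_ns()}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=src.name)

    # Step 3: keep only the newest max_backups archives, delete the rest.
    archives = sorted(dst.glob("zk-backup-*.tar.gz"))
    for old in archives[:-max_backups]:
        old.unlink()
    return archive
```

In this sketch the retention limit is passed as `max_backups`; in the proposed design it would presumably come from the zookeeper-backup CR.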

anishakj commented 2 years ago

@ibumarskov Do you see the need for backups when we are using ephemeral storage? In the case of persistent volumes, data is not lost even if there is a crash.

ibumarskov commented 2 years ago

You can lose data in different ways. Data loss can be the result of hardware or software failure, data corruption, or a human-caused event, such as a malicious attack or accidental deletion of data. So, backing up data is good practice.

anishakj commented 2 years ago

@ibumarskov Would you like to contribute backup support by raising a PR?

ibumarskov commented 2 years ago

@anishakj Yes, I'm going to provide an implementation soon.