fort-nix / nix-bitcoin

A collection of Nix packages and NixOS modules for easily installing full-featured Bitcoin nodes with an emphasis on security.
https://nixbitcoin.org
MIT License

Backups while the services are running #409

Open seberm opened 2 years ago

seberm commented 2 years ago

Hello, maybe I am just missing something, but do the backups using the backups service make sense? Have you ever tried to recover some of the services from these duplicity backups?

My point is that using the backups service I can copy all of the important data dirs. But are the services (like clightning, bitcoind, electrs, ...) stopped during the backup?

It seems that copying files while the services are running can easily lead to corrupted backup files that cannot be recovered from.

For example, what is the currently recommended way to back up c-lightning's channel state? The channel state is saved in the /var/lib/clightning/bitcoin/lightningd.sqlite3 database. Without it, I am not able to recover the in-channel funds. It does not make sense to back up this file while clightning is running.

Another example is running a backup with the services.backups.with-bulk-data option enabled. Does it make sense to back up the chainstate/* or blocks/index/* files while bitcoind is still running? I am almost sure that the LevelDB files in the backup will be corrupted.

What are your thoughts on this? Shouldn't we stop these services during the backup or at least allow the user to parametrize this behaviour?

Please, this is not a criticism, I really appreciate your work!

Thanks, Ota

nixbitcoin commented 2 years ago

> Please, this is not a criticism, I really appreciate your work!

This is very good feedback. Even if it were criticism, we appreciate helpful comments.

Improving backups is one of our top priorities. The two lightning node implementations are going to get custom backup solutions in nix-bitcoin in the coming days/weeks.

The other files you mentioned should probably only be backed up when the services are down. What do you think @jonasnick?

jonasnick commented 2 years ago

Thanks for pointing this out @seberm.

> The other files you mentioned should probably only be backed up when the services are down.

At least if you're using the backup module as it is right now, I agree (it would work with running services if we could take a ZFS-like snapshot). This behavior should also be mentioned in the sample configuration. Making it configurable makes sense, I think, because there's a reasonable probability that the backup is not entirely corrupt. However, one should certainly not rely on the backup module to back up lightning channel states.

seberm commented 1 year ago

Hello everyone, what is the current status of clightning backups?

I am aware there is a new clightning-replication module which replicates the current state of clightning's DB to an sshfs mount or a local directory (optionally transparently encrypted using gocryptfs).

But still, these are live backups of clightning's sqlite DB, is that correct? Which means that these replicas may still result in a corrupted backup file that cannot be recovered from.

I was thinking about some kind of offline-backup module/service which could regularly (e.g. every week) stop all (or selected) services, back up the data (over sshfs or locally) and start the services again. Or do you have a better idea how to solve this issue?

I know it's not a clean approach, and I am basically fine with e.g. reindexing the LevelDB files in chainstate/*, blocks/index/* or the electrs index. But I am not OK with the possibility of losing the off-chain (channel) funds.
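The stop, copy, restart cycle described above could be sketched roughly as follows. This is only an illustration in Python, not an existing nix-bitcoin module: the function name `offline_backup`, the injectable `run` parameter, and the assumption that each service is a systemd unit controllable via `systemctl stop`/`systemctl start` are all hypothetical choices made for this sketch.

```python
import shutil
import subprocess
from pathlib import Path


def offline_backup(services, data_dirs, dest, run=subprocess.run):
    """Stop services, copy their data directories, then start them again.

    Hypothetical sketch: `services` are assumed to be systemd unit names,
    and `run` is injectable so the logic can be exercised without systemd.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    touched = []
    try:
        for unit in services:
            # Record the unit first so it is restarted even if stopping fails.
            touched.append(unit)
            run(["systemctl", "stop", unit], check=True)
        for src in map(Path, data_dirs):
            # dirs_exist_ok lets repeated runs overwrite the previous copy.
            shutil.copytree(src, dest / src.name, dirs_exist_ok=True)
    finally:
        # Restart in reverse order, whether or not the copy succeeded.
        for unit in reversed(touched):
            run(["systemctl", "start", unit], check=False)
```

A call might look like `offline_backup(["clightning", "bitcoind"], ["/var/lib/clightning"], "/mnt/backup")`, where the unit names and `/mnt/backup` are assumptions. Listing dependent services first means they are stopped before, and restarted after, the services they depend on. The `finally` block is the important design choice: a failed copy should not leave the node offline.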

Thank you!

jonasnick commented 1 year ago

Hi,

The existing clightning-replication module should not result in corrupted backups. It uses an sqlite3 feature that allows writing to a second database file. This second database should always be consistent with the first. Services other than clightning can still end up in a corrupted state when restoring a backup from the backups module.
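To illustrate the general idea of letting SQLite itself produce the second file, here is a minimal Python sketch. It is not the clightning-replication module's code and does not show how clightning replicates its database. It uses the standard library's `sqlite3.Connection.backup`, which takes a one-off snapshot rather than replicating continuously, and the function name `snapshot` is hypothetical.

```python
import sqlite3


def snapshot(live_path, backup_path):
    """Copy a live SQLite database into a second file via SQLite itself."""
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(backup_path)
    try:
        # Connection.backup() goes through SQLite's online backup API, so the
        # copy is taken at the database level rather than by copying raw
        # file bytes that another process may be modifying.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

The contrast with a plain file copy is the point: copying the raw database file while a writer is active can capture a half-written state, whereas a copy made through SQLite is a consistent database.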

seberm commented 1 year ago

Hello, this is great, I had overlooked the line https://github.com/fort-nix/nix-bitcoin/blob/master/modules/clightning-replication.nix#L138

Thanks for your feedback!