gilbertchen / duplicacy

A new generation cloud backup tool
https://duplicacy.com

Feature request: Yubikey support #534

Open carniz opened 5 years ago

carniz commented 5 years ago

In short, it would be awesome if backup sets could be encrypted with hardware keys (like the Yubikey).

This way there would be no need to keep clear-text passwords in .duplicacy/preferences or in environment variables for automated backups, as long as the Yubikey is plugged in.

vb0 commented 5 years ago

As far as I know the Yubikey is only for authentication, not symmetric encryption. So you could use it to log in to services like Google Drive, but not to do symmetric encryption. The power of such devices is that they never give their secrets to the PC: they take a challenge and compute a result based on the secret held inside, without ever revealing that secret. If you want the capability to store some small keys/configuration files on a small stick, there are plenty that brag about hardware encryption - but I would just use LUKS/BitLocker and any regular small USB drive.
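To illustrate the challenge-response idea in Go (duplicacy's own language): on a real YubiKey the HMAC secret is written into a device slot and never leaves the hardware, so the `device` type below is a purely hypothetical stand-in for the chip, just to show the shape of the exchange.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha1"
	"fmt"
)

// device models what the hardware does internally: it holds a secret
// and answers challenges, but never exposes the secret itself.
// (Hypothetical stand-in; a real YubiKey slot computes HMAC-SHA1 on-chip.)
type device struct {
	secret []byte
}

func (d *device) ChallengeResponse(challenge []byte) []byte {
	mac := hmac.New(sha1.New, d.secret)
	mac.Write(challenge)
	return mac.Sum(nil)
}

func main() {
	dev := &device{secret: []byte("never-leaves-the-hardware")}

	// The host only ever sees the challenge and the response.
	challenge := make([]byte, 32)
	rand.Read(challenge)

	response := dev.ChallengeResponse(challenge)
	fmt.Printf("response (usable as key material): %x\n", response)
}
```

Since the same challenge always yields the same response, a tool could store only the challenge on disk and re-derive the key material at backup time while the key is plugged in, which is what makes such a scheme attractive for unattended backups.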

carniz commented 5 years ago

At least the Yubikey 5 series can hold OpenPGP keys, which of course means asymmetric encryption (and not symmetric encryption, as duplicacy currently uses). I don't know how much work it would be to add support for asymmetric encryption, but I would feel much more relaxed with only the public key exposed in the preferences file for automated system backups (i.e. scheduled cron jobs that run from non-login shells). This is currently the only weakness of duplicacy, IMHO.
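To make that concrete: the usual pattern is hybrid encryption, where each backup (or chunk) gets a fresh symmetric key and only that key is wrapped with the public key. A rough Go sketch, using plain RSA-OAEP and AES-GCM from the standard library rather than OpenPGP; the function and parameter names are made up for illustration:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
)

// encryptChunk wraps a chunk so that only the holder of the private key
// (e.g. one kept on a Yubikey) can read it. Only pub needs to be on disk.
func encryptChunk(pub *rsa.PublicKey, chunk []byte) (wrappedKey, ciphertext []byte, err error) {
	// Fresh symmetric key per chunk; never stored in clear anywhere.
	aesKey := make([]byte, 32)
	if _, err = rand.Read(aesKey); err != nil {
		return nil, nil, err
	}

	block, err := aes.NewCipher(aesKey)
	if err != nil {
		return nil, nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err = rand.Read(nonce); err != nil {
		return nil, nil, err
	}
	// Prepend the nonce so the decryptor can find it later.
	ciphertext = gcm.Seal(nonce, nonce, chunk, nil)

	// Wrap the symmetric key with the public key; unwrapping requires
	// the private key, which can live on the hardware token.
	wrappedKey, err = rsa.EncryptOAEP(sha256.New(), rand.Reader, pub, aesKey, nil)
	return wrappedKey, ciphertext, err
}
```

The catch, as the next comment points out, is that a deduplicating backup tool also needs to read what it stored previously, which this scheme alone does not allow without the private key.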

vb0 commented 5 years ago

A backup program that has no access to previous backups would be extremely cumbersome and would lack lots of features. If you are really serious about this, not only can you do it easily yourself, but any way of doing it is equivalent to this: just make a PGP-encrypted archive against a stored public key (and keep the secret key in some other place). If that means a full backup each time, so be it. In practice, if people trust a process to read the originals, they also trust it to read the backups. There are ways to improve on this, but in a different direction - the "read both" requirement remains if you want any efficiency (at least to the point where you can tell what's new data and what's old data). For example, you can use a read-only share and back up from a second machine, so that malware on either machine cannot write anything to the other.
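For reference, the "PGP-encrypted archive against a stored public key" approach is a few lines with the (now frozen) golang.org/x/crypto/openpgp package; a minimal sketch, with all file names being assumptions:

```go
package main

import (
	"io"
	"log"
	"os"

	"golang.org/x/crypto/openpgp"
)

func main() {
	// Public key ring exported beforehand (e.g. with `gpg --export`);
	// the matching private key stays offline somewhere else.
	keyFile, err := os.Open("backup-public.gpg")
	if err != nil {
		log.Fatal(err)
	}
	recipients, err := openpgp.ReadKeyRing(keyFile)
	if err != nil {
		log.Fatal(err)
	}

	out, err := os.Create("backup.tar.gpg")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// Everything written to w is readable only with the private key.
	w, err := openpgp.Encrypt(out, recipients, nil, nil, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer w.Close()

	archive, err := os.Open("backup.tar") // a full archive each time
	if err != nil {
		log.Fatal(err)
	}
	if _, err := io.Copy(w, archive); err != nil {
		log.Fatal(err)
	}
}
```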

carniz commented 5 years ago

I can somewhat buy the argument that it makes sense to trust a process to read the backups if you trust it to read the originals - with the exception that the backups may contain secrets that have since been deleted from the source/original/live file system.

But the question is: how does Duplicati (which uses GPG, see https://github.com/gilbertchen/benchmarking) do incremental backups if it can't decrypt the previous backups? By comparing file fingerprints/checksums only? Caveat: I haven't tried Duplicati, so it might be that it requires the private key to be present at all times and not only during restores.

vb0 commented 5 years ago

Duplicati has a local SQL database that keeps the hashes of all blocks and, I presume, all the other metadata of the files (so it doesn't have to read the backup in order to know what changed). Even though in theory that should speed things up, it's the root cause of many of its performance issues.
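The core of that design is easy to sketch: a local index mapping block hashes to "already uploaded", consulted before touching the remote at all. A simplified Go illustration, with an in-memory map standing in for Duplicati's actual SQL schema (which I haven't inspected):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blockIndex stands in for the local database: it records which block
// hashes already exist in the backup, so deciding what to upload never
// requires reading (or decrypting) the remote backup itself.
type blockIndex map[string]bool

func (idx blockIndex) filterNew(blocks [][]byte) [][]byte {
	var toUpload [][]byte
	for _, b := range blocks {
		h := sha256.Sum256(b)
		key := hex.EncodeToString(h[:])
		if !idx[key] {
			toUpload = append(toUpload, b)
			idx[key] = true // remember for future runs
		}
	}
	return toUpload
}

func main() {
	idx := blockIndex{}
	run1 := idx.filterNew([][]byte{[]byte("aaa"), []byte("bbb")})
	run2 := idx.filterNew([][]byte{[]byte("bbb"), []byte("ccc")})
	fmt.Println(len(run1), len(run2)) // 2 1: only "ccc" is new on run 2
}
```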

I don't know that much about it either, but I'm pretty sure "encrypt database" here means "encrypt the backup" or "the backed-up database" or something similar: https://github.com/duplicati/duplicati/wiki/Re-encrypt-database - the actual local database (which also contains information about what's in the backup) will still be available to the process when making regular backups.