Improve qvm-backup key derivation/management

andrewdavidwong commented 9 years ago

See: https://groups.google.com/d/msg/qubes-devel/CZ7WRwLXcnk/u_rZPoVxL5IJ

andrewdavidwong commented 8 years ago

Just to summarize everything so far, it sounds like there are ~~two~~ three main options on the table:

1. Use GPG for encryption and OpenSSL for verification

Pros:

Uses existing tools
Requires no crypto expertise and no building; just using pre-installed binaries
These tools are ubiquitous and widely-used
These tools are well-tested and widely-respected (for the proposed functions)

Cons:

Does not result in the strongest possible KDF
- (It is debatable whether this is really a "con." As we established above, GPG's KDF is fine if you're using a decent passphrase, which you should be. And it's also not much of a con if you're skeptical of newfangled KDFs which haven't yet stood the test of time.)
Requires two different tools
- (This may not matter, since the two tools are so ubiquitous that there is unlikely to be a situation where you won't have access to both.)
Still relies on OpenSSL
- The problems with OpenSSL are with the enc function (poorly-written, little-used). If, by contrast, the HMAC-related functions are better-written and more widely-used, then this may not be a problem.

2. Use scrypt for everything

Pros:

Includes a very strong, modern KDF
Requires no crypto expertise
One tool for everything
Reputable author

Cons:

Significantly less ubiquitous than the other tools, making manual backup recovery less certain and harder for users
Uses a custom implementation of AES
- (Less tested, not widely-used, fewer eyes on the code to catch bugs)
Still not the strongest, most modern KDF
- (As mentioned above, this may not be a con.)
Introduces more complexity than simply using pre-installed tools
- (More chances for mistakes to be made)
[Marek] It isn't easy for scripted usage - reads password from /dev/tty. I can workaround this, but it will not be nice code...

3. Use the current system, but pass `-md sha256` to `openssl enc`

Pros:

This is the new default in OpenSSL 1.1.0
Requires the least amount of developer time and effort (and minimal code changes)
Allows us to keep using the same tool without introducing any new ones
sha256 is much less likely to reduce entropy of user's passphrase (compared to md5)

Cons:

Still hardcoded to one iteration
Still uses the same passphrase for verification and encryption
- As discussed below, this could be avoided by passing sha512(passphrase) to openssl dgst.
Still relies on OpenSSL
- But at least now we evidence that enc isn't totally neglected.

(As I mentioned above, when it comes to something like backup encryption, I'd prefer to see Qubes stay on the conservative side, ~~so, FWIW, I'm leaning toward option 1~~. Option 3 might be better for now, if it can be done immediately at little cost yet provide a pareto improvement over the current system.)

marmarek commented 8 years ago

Still relies on OpenSSL

The problems with OpenSSL are with the enc function (poorly-written, little-used). If, by contrast, the HMAC-related functions are better-written and more widely-used, then this may not be a problem.

If I understand correctly, using openssl for verification still requires some KDF...

One more scrypt cons:

it isn't easy for scripted usage - reads password from /dev/tty. I can workaround this, but it will not be nice code...

Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?

andrewdavidwong commented 8 years ago

If I understand correctly, using openssl for verification still requires some KDF...

I think that depends on what we consider the "requirements" to be. Currently, we use OpenSSL for verification without any kind of KDF. We just feed the user's passphrase directly to openssl dgst -hmac.

One option would be to keep doing this. Since GPG applies its own KDF (S2K) to the passphrase, the resultant verification and encryption keys would be significantly different. However, the passphrases fed to openssl and gpg would still be the same, and that's undesirable. (E.g., exploiting a flaw in OpenSSL's implementation of HMAC-SHA512 which allows the attacker to recover the passphrase would allow her to then immediately decrypt the GPG-encrypted data. Unlikely, perhaps, but easily avoided by simply by using two different passphrases.)

OpenSSL is really just there to protect GPG from unverified input. So, if all we really want is for the two passphrases to be different, a simple solution could be something like: feed sha512(passphrase) to openssl and feed passphrase to GPG (on which GPG will then apply its own KDF).

IMHO, our goal isn't to apply our own key-stretching (we trust the user to supply a reasonable passphrase). Rather, it's just to avoid "key-shortening" (which the openssl enc's single round of md5 currently does in many cases) and to avoid the "same passphrase" problem mentioned above.

Either way, it would still better than what we do right now. Plus, a solution like this ensures that disaster recovery remains possible without yet another external tool (i.e,. the user doesn't need a copy of exotic-kdf-tool, just the ability to compute sha512(their_passphrase)).

It may not be "ideal," but it seems like it would fix the main problems we currently have, which are: (1) the issues with openssl enc (entropy-reducing "KDF," code neglect) and (2) using the same passphrase for verification and encryption. I worry that any more "ideal" solution would require much more code and complexity (and therefore not really be ideal).

One more scrypt cons:

it isn't easy for scripted usage - reads password from /dev/tty. I can workaround this, but it will not be nice code...

Added.

andrewdavidwong commented 8 years ago

OpenSSL changes between 1.0.2g and 1.1.0:

  *) Changed default digest for the dgst and enc commands from MD5 to
     sha256
     [Rich Salz]

This reminds me that there's a third option: simply start passing -md sha256. I'll edit this back into the previous post, along with pros and cons.

marmarek commented 8 years ago

I think that depends on what we consider the "requirements" to be. Currently, we use OpenSSL for verification without any kind of KDF. We just feed the user's passphrase directly to openssl dgst -hmac.

Isn't that exactly what we're trying to solve in this ticket? Weak/no KDF usage for (any of) authentication or encryption. If I understand correctly, this makes attack easier because someone may launch attack on password by just attacking hmac, then use guessed password for decryption. So if any of those will be weak (cheap to launch dictionary/bruteforce attack), it will help with attacking the other part, even if decent KDF is used in that other part.

As for -md sha256 idea - we already pass -sha512 to openssl dgst. But not openssl enc - here it will indeed somehow improve the situation. At least it will not reduce passphrase to 128 bits. Also, indeed some idea may be using SHA512(passphrase) for both operations. Or even SHA512(passphrase + "hmac") and SHA512(passphrase + "enc"). This will produce different keys. This looks like solution for the original problem, but as I'm not a cryptographer, I don't know if generally a good idea... Similar idea was raised in https://github.com/QubesOS/qubes-issues/issues/971#issuecomment-151125927

andrewdavidwong commented 8 years ago

Isn't that exactly what we're trying to solve in this ticket? Weak/no KDF usage for (any of) authentication or encryption.

Well, it's one of the issues. The current three problems are:

md5(passphrase) is capping entropy at 128 bits.
Same passphrase is being fed to dgst and enc.
openssl enc seems shoddy (admittedly this one is debatable).

If I understand correctly, this makes attack easier because someone may launch attack on password by just attacking hmac, then use guessed password for decryption. So if any of those will be weak (cheap to launch dictionary/bruteforce attack), it will help with attacking the other part, even if decent KDF is used in that other part.

Yes, this is almost exactly the same situation I mentioned above.

But, again, there is a big middle-ground between (a) passing the same raw passphrase to dgst and enc, and (b) adding some full-blown KDF to the process. One example of such a middle-ground solution is using, e.g., sha512(passphrase) (discussed below).

As for -md sha256 idea - we already pass -sha512 to openssl dgst. But not openssl enc - here it will indeed somehow improve the situation. At least it will not reduce passphrase to 128 bits.

Yes, and it seems like there's no reason not to do this immediately. After all, it will happen by default if/when we upgrade to OpenSSL 1.1.0. Might as well start now and reap some benefit in the meantime.

Also, indeed some idea may be using SHA512(passphrase) for both operations. Or even SHA512(passphrase + "hmac") and SHA512(passphrase + "enc"). This will produce different keys. This looks like solution for the original problem, but as I'm not a cryptographer, I don't know if generally a good idea... Similar idea was raised in #971 (comment)

It may not be optimal from a cryptography standpoint, but surely it is better than what we do now (passing the same bare passphrase to both dgst and enc). If it would be trivial to implement, then it seems like we have nothing to lose and a fair amount to gain. Why not pick the low-hanging fruit?

ag4ve commented 8 years ago

Any reason not to just use a 3rd party OSS tool for this like duplicity? Alternatively, I believe rsync and LUKS containers would fit the bill as well. On Mar 17, 2016 6:22 PM, "Axon" notifications@github.com wrote:

Isn't that exactly what we're trying to solve in this ticket? Weak/no KDF usage for (any of) authentication or encryption.

Well, it's one of the issues. The current three problems are:

md5(passphrase) is capping entropy at 128 bits.

Same passphrase is being fed to dgst and enc.

openssl enc seems shoddy (admittedly this one is debatable).

If I understand correctly, this makes attack easier because someone may launch attack on password by just attacking hmac, then use guessed password for decryption. So if any of those will be weak (cheap to launch dictionary/bruteforce attack), it will help with attacking the other part, even if decent KDF is used in that other part.

Yes, this is almost exactly the same situation I mentioned above.

But, again, there is a big middle-ground between (a) passing the same raw passphrase to dgst and enc, and (b) adding some full-blown KDF to the process. One example of such a middle-ground solution is using, e.g., sha512(passphrase) (discussed below).

As for -md sha256 idea - we already pass -sha512 to openssl dgst. But not openssl enc - here it will indeed somehow improve the situation. At least it will not reduce passphrase to 128 bits.

Yes, and it seems like there's no reason not to do this immediately. After all, it will happen by default if/when we upgrade to OpenSSL 1.1.0. Might as well start now and reap some benefit in the meantime.

Also, indeed some idea may be using SHA512(passphrase) for both operations. Or even SHA512(passphrase + "hmac") and SHA512(passphrase + "enc"). This will produce different keys. This looks like solution for the original problem, but as I'm not a cryptographer, I don't know if generally a good idea... Similar idea was raised in #971 https://github.com/QubesOS/qubes-issues/issues/971 (comment)

It may not be optimal from a cryptography standpoint, but surely it is better than what we do now (passing the same bare passphrase to both dgst and enc). If it would be trivial to implement, then it seems like we have nothing to lose and a fair amount to gain. Why not pick the low-hanging fruit?

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/QubesOS/qubes-issues/issues/971#issuecomment-198107642

Rudd-O commented 8 years ago

If I may for a moment explain -- I do use qvm-backup but I am unhappy with it, because there is no support for encrypted incremental backups. This means I have to back up close to a gig of stuff EVERY TIME. This is slow, it prevents me from using the machine being backed up, and it's so fucking tedious, I rarely do it.

We need a better solution that will (a) support running the backup concurrently with the underlying VMs running (b) support scripting and sending off to external storage beyond just USB drives (c) support incremental backup in the right way, so that backups can be finished faster. I believe what this means is embracing some sort of WAFL / COW file system, and backing it up by replicating (send/recv) to a remote, encrypted disk or disk image. qvm-backup just doesn't cut the mustard.

marmarek commented 8 years ago

I agree that requirement to shutdown all the VMs for the backup is not convenient at least...

(c) support incremental backup in the right way, so that backups can be finished faster.

There was a discussion about it linked to #858

(a) support running the backup concurrently with the underlying VMs running

This is also considered as part of #858

(b) support scripting and sending off to external storage beyond just USB drives

This is already possible - you can enter some command instead of directory path and the backup will be streamed to its stdin. For example ssh somehost dd of=/path/on/remote/machine.

marmarek commented 8 years ago

It may not be optimal from a cryptography standpoint, but surely it is better than what we do now (passing the same bare passphrase to both dgst and enc). If it would be trivial to implement, then it seems like we have nothing to lose and a fair amount to gain. Why not pick the low-hanging fruit?

One reason: to not change the backup format too frequently (compatibility, number of combinations to test). There will be new backup format (version += 1) in Qubes 4.0, because qubes.xml format is changed. So we can bundle this change with fixing problem discussed in this ticket.

For now I'll ignore the first option from your list (gpg+openssl), as it looks to only introduce complexity while the same gains can be achieved by much simpler option "3" (openssl enc -md sha512 + sha512(passphrase + 'hmac') for openssl dgst).

So, we have two options:

use scrypt (option "2") - probably the best of those considered from cryptography POV, but with some practical drawbacks as described in https://github.com/QubesOS/qubes-issues/issues/971#issuecomment-197790115
Slightly modify the current system (option "3") - easiest to implement

The question is, whether option "3" is good enough? It is surely better than the current implementation...

andrewdavidwong commented 8 years ago

@Rudd-O: Thank you for sharing your experience! I agree with you and @marmarek about the inconvenience of the current system. I also think that an incremental backup system is desirable, but there would still be a need for the ability to create immediate full backups (e.g., for system migration).

One reason: to not change the backup format too frequently (compatibility, number of combinations to test). There will be new backup format (version += 1) in Qubes 4.0, because qubes.xml format is changed. So we can bundle this change with fixing problem discussed in this ticket.

Ah, that's a good point. Changing the format too frequently would be bad. Ok, so bundling with 4.0 sounds good.

The question is, whether option "3" is good enough? It is surely better than the current implementation...

IMHO, we need to be careful not to let the perfect be the enemy of the good. Option 3 makes the current system better without making anything worse. So, if option 3 is not good enough, then the current system falls even shorter of being good enough. In light of this, I can only think of a few reasons not to go with option 3 (but maybe there are more):

We'd rather put the development time into other things.
We expect a better solution will fall into our laps soon.
We don't want to change the backup format only to have to change it again soon after.

All can be legitimate reasons, of course, but IIUC, the development time would be pretty minimal, and we don't expect a better solution to fall into our laps anytime soon. The last two reasons are likely to be related. (If we thought a better solution was about to fall into our laps, we'd want to hold off so that we don't have to change the format again so quickly afterward.) But this attitude can also be taken to an extreme. It can paralyze us with the fear that something better is always over the horizon, so we never make the easy improvements that we can make.

andrewdavidwong commented 8 years ago

Ok, I thought of one more potential concern about option 3:

Even though option 3 seems simple, nothing in cryptography is ever really simple. Our proposal is to compute sha512(passphrase + 'hmac') and feed it to openssl dgst as a passphrase. There are various problems associated with using a short, static salt (e.g., the birthday attack) and using a standard digest algorithm to hash passwords (instead of a true KDF). Many of these problems are more relevant to protecting online databases than to file encryption at rest, but the point is that we don't know enough about cryptography to know whether a seemingly innocuous move (like adding 'hmac' as a salt) could somehow decrease security. For example, there are even tricky problems that arise with how you concatenate strings.

In other words, maybe even the simplest option (3) requires more crypto expertise than we have available to us.

The standard advice is to use an actual KDF rather than trying to roll your own. Our problem is that we need something commonly available that can be used from the command-line, and apparently nothing like that is available.

Except that's not entirely true. OpenSSL includes the command passwd. From the man page:

The passwd command computes the hash of a password typed at run-time or the hash of each password in a list. The password list is taken from the named file for option -in file, from stdin for option -stdin, or from the command line, or from the terminal otherwise. The Unix standard algorithm crypt and the MD5-based BSD password algorithm 1 and its Apache variant apr1 are available.

The problem is that it uses only very old and weak algorithms (md5 and crypt), which would defeat the purpose of passing -md sha256 to enc, since the initial step would cap user passphrase entropy at 128 bits (and use a weak set of algorithms, to boot).

marmarek commented 8 years ago

openssl passwd by default uses a random salt, so it isn't good for KDF. There is an option for using static salt (provided on command line), but then again - we'd need somehow to carry this salt (in backup header?). The same problem which is solved internally by scrypt (it uses own header in encrypted files), or in case of just scrypt KDF (connected with openssl enc) - by storing its parameters in backup header.

Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?

andrewdavidwong commented 8 years ago

@ag4ve:

Any reason not to just use a 3rd party OSS tool for this like duplicity? Alternatively, I believe rsync and LUKS containers would fit the bill as well.

IIUC, the main problems with those options are:

Dom0 has no network access (by design).
The backup shouldn't have to be stored in dom0 (in its entirety, for any length of time)(since there might not be enough space, among other reasons).
The backup must be encrypted before it leaves dom0.
Upon restore, the backup needs to be verified (for integrity and authenticity) before the data is significantly parsed (GPG fails here). (Otherwise, an attacker could swap your backup with a malicious payload, which then compromises dom0 when you attempt to restore from it.)

andrewdavidwong commented 8 years ago

In other words, maybe even the simplest option (3) requires more crypto expertise than we have available to us.

Clarification: This applies only to the passphrase hashing part. The part about passing -md sha256 to enc should still be done, if nothing else.

defuse commented 8 years ago

Hey! I use Qubes backups heavily (I'm restoring from a backup as I type this), so I'm willing to provide some free consulting on this issue. I just skimmed this thread, but I understand:

We're worried about a weakness in openssl enc's password-to-key KDF.
It's crucial to retain ease of decrypting with standard tools.

The main concern I see being brought up is OpenSSL's use of MD5 and that this might cap security at 128 bits. This isn't the right thing to be worrying about. I'm looking at the command line we use for encryption and the source code for openssl enc which calls EVP_BytesToKey. According to the algorithm described in this documentation, no entropy would actually be lost assuming MD5 is secure. I am fairly confident that the known weaknesses in MD5 do not significantly decrease security here either. In other words, option 3 of adding -md sha256 to the openssl enc command gains us negligible security benefits. In either case, a good 256-bit passphrase is still 256 bits of security, not 128. (But using -md sha256 is a good thing to do anyway, just in case.)

I'm concerned about the following things:

Is a random salt being used right now? openssl enc should spit out a salt unless you pass the nosalt option, and then that salt should be provided upon decryption. Could you point me to the code that does this? This is probably the most important thing Qubes needs to be doing. (Edit: Actually I may be misreading the OpenSSL code.)
Using separate systems (GnuPG and OpenSSL) for encryption and authentication, where one does key stretching and the other doesn't, negates the benefit of key stretching. Key stretching is really important, mandatory I would even say, when you're turning a passphrase into a key.
I'd like to verify how the HMAC is being computed/checked (please point me to code!).

I strongly recommend switching to something with decent key stretching (e.g. at least PBKDF2, or whatever the thing in GnuPG is). I'm willing to provide advice on this. Could someone point me to where the relevant code is (for creating a backup, and then restoring it)? Thanks!

defuse commented 8 years ago

To answer my own concern (1) above, the passphrase effectively isn't protected by a salt. According to these instructions for manual restoration of the backup, there's a file called backup-header whose contents are the same for all users having the same Qubes version. Inside a backup-header.hmac file there's the result of running openssl dgst -sha512 -hmac "your_passphrase" backup-header.

This enables precomputation attacks: An attacker could take the contents of backup-header for the most common Qubes version and build up a database (or rainbow tables) of what the backup-header.hmac contents would be under different passphrases. They could then use that database to quickly map the victim's backup-header.hmac contents back to their password, if it's in the database (or rainbow table).

Edit: use better wording

defuse commented 8 years ago

Strong +1 to option 2 of using scrypt. Speaking as a user, I'm very willing to download and build the scrypt tarball to get the increased security. It's even in the debian repositories.

andrewdavidwong commented 8 years ago

I use Qubes backups heavily (I'm restoring from a backup as I type this), so I'm willing to provide some free consulting on this issue.

Thank you!

The main concern I see being brought up is OpenSSL's use of MD5 and that this might cap security at 128 bits. This isn't the right thing to be worrying about. [...] In either case, a good 256-bit passphrase is still 256 bits of security, not 128.

Can you clarify that? Suppose my passphrase is a truly random 256-bit string. openssl enc takes md5(passphrase), resulting in a 128-bit string. How have I not lost entropy here?

Here's the backup code: https://github.com/QubesOS/qubes-core-admin/blob/master/core/backup.py

soatok commented 8 years ago

How have I not lost entropy here?

See:

This isn't the right thing to be worrying about.

andrewdavidwong commented 8 years ago

How have I not lost entropy here?

See:

This isn't the right thing to be worrying about.

These are two logically distinct claims:

md5(passphrase) doesn't cap entropy at 128 bits.
md5(passphrase) caps entropy at 128 bits, but that's not the right thing to be worrying about.

defuse commented 8 years ago

@andrewdavidwong: I'm making claim number (2). md5(passphrase) certainly caps entropy at 128 bits. openssl enc is doing something different. What openssl enc uses as the key is...

md5(passphrase || salt) || md5(md5(passphrase || salt) || passphrase || salt)

...according to the fact it uses EVP_BytesToKey and these EVP_BytesToKey docs. (The source code is here but it's pretty terrible to read.). There are probably other problems with this KDF, but it's not obvious that entropy is capped at 128 bits, since the passphrase is used once to make the first 128-bit half and then again in a different way to make the second 128-bit half. The first and second halves are both capped at 128 bits but the end result probably isn't.

andrewdavidwong commented 8 years ago

Ok, that helps clarify things for me. Thanks!

tomrittervg commented 8 years ago

Hi all. I was also pointed at this thread. I run NCC Group's Cryptography Services practice. (And actually see a reference to NCC earlier - but we never got connected apparently. That's a shame, I'd have been happy to help.)

@defuse is pretty smart, and his statements all seem reasonable to me, but I might reinforce/add to them:

1) Why are you using CBC+HMAC separately? Why not use GCM mode? Even if there's a reason to have the authenticity check separate, I would recommend encrypting with GCM mode anyway to avoid some sort of time-of-check-time-of-use bitflipping attack.

2) One passphrase is used to feed into both the HMAC and the encryption? If the key derivation steps are the same for both, you'd be using the same key for both. That's not ideal. The correct way to derive separate keys from identical input entropy is with HKDF (https://tools.ietf.org/html/rfc5869) which, haha, is not trivial to implement. At the bare minimum, one should do something like HMAC(entropy, "encryption key") and HMAC(entropy, "hmac key") and use the output of those as the respective keys.

That said - this type of academic analysis (why exactly HKDF is good, and HMAC(entropy, "encryption key") or SHA(entropy | "encryption key") are bad) has never been my strong suit. I don't believe there's practical attacks here but... well that's why we go with the safe stuff in crypto.

3) I agree that Key Stretching, and the use of a unique salt when using a passphrase, is crucial for security. Salt to avoid precomputation and stretching to make brute forcing much more painful. PGP's stretching algorithm (S2K Iterations) is an odd standard, but it's a standard. It's about on par with PBKDF2 in terms of security. Scrypt is better - you get (some) memory hardness. I'm not familiar enough with scrypt to recommend parameters though, but I bet you could find someone to help you tweak things to your needs. Maybe a judge or participant in https://password-hashing.net/

Omitting key stretching removes the safety net for users. With key stretching a user with an 'okay' passphrase might be safe, without there's a much higher chance of getting cracked. So perhaps you decide that operating without a safety net is okay for the time being, and you come back to it later - but I wouldn't choose to forgoe it forever.

andrewdavidwong commented 8 years ago

Thanks, @tomrittervg and @defuse! We fully agree that the system is flawed in the ways you two have helpfully pointed out. Our current problem is that we don't have anyone with the relevant expertise who's willing to help us implement a better system. Is that something that either of you, or anyone you know, would be willing to do?

defuse commented 8 years ago

I'm super busy at the moment, so I can't commit to being able to do anything right now. Using Colin Percival's scrypt tool should be pretty straightforward and hard to mess up, and I'd be happy to review any implementation that comes into existence.

andrewdavidwong commented 8 years ago

Using Colin Percival's scrypt tool should be pretty straightforward and hard to mess up, and I'd be happy to review any implementation that comes into existence.

@marmarek, what do you think?

marmarek commented 8 years ago

Last time I've checked it wasn't easy to provide passphrase from anything but terminal (it was reading it from /dev/tty). So not trivial to integrate. But probably doable.

Rudd-O commented 8 years ago

Duplicity does not need to have network access to work. It just needs to have a backend specific to inter-VM backup:

https://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/view/head:/duplicity/backends/ssh_pexpect_backend.py

Rudd-O commented 8 years ago

SRC RPM for tarsnap (containing scrypt) https://koji.rpmfusion.org/koji/buildinfo?buildID=688

marmarek commented 8 years ago

SRC RPM for tarsnap (containing scrypt) https://koji.rpmfusion.org/koji/buildinfo?buildID=688

There is also package for fc23, tagged with "f23-nonfree". I wonder what this means in practice... Will it be in some separate repository?

Rudd-O commented 8 years ago

Nonfree means things built outside Amerika because of intellectual monopoly bullshit.

To the extent that I know, that may just be a miscategorization, as tarsnap is bona fide open source which you can download from the tarsnap site.

marmarek commented 7 years ago

If going with scrypt (which looks like the best option according to above comments), I see it:

would replace having separate .hmac files, as scrypt handle this already
as a consequence will make impossible to create non-encrypted backup (not a problem at all)
not having backup-header.hmac will make slightly harder to handle both old and new backups with the same tool, but this is still doable.

This change makes security of scrypt tool very critical to security of Qubes backups. Both in terms of used encryption, and correctly handling (potentially malicious) data during decryption.

At the same time we can simplify backup format even more: get rid of inner tar layer. Currently it serves two purposes:

Efficiently handle sparse files. This can be replaced by a compression. Somehow less effective, but still - stream of compressed zeroes isn't that big. And when using pigz instead of just gzip it isn't that bad (but of course still much slower than simply not storing it at all)
Authenticate file name inside (inner tar archive is integrity protected, while outer one isn't). This can be replaced by prefixing password (given to scrypt for given file) by expected file name (and some separator, like \x01). scrypt authenticate file content using this password, so swapping files should be mitigated here.

Attack not mitigated by any of those, is replacing whole VMs with the same VM from older/newer backup (created with the same passphrase). This is not much different than replacing the whole backup archive. Can be mitigated by user by using different passphrases for different backups (like append a number, or a date).

Another reason to drop inner tar layer - it is no longer effective if source isn't sparse file, but LVM thin volume (which will be the case in Qubes 4.0). Actually I haven't found out yet how to create tar archive from block device content, without dumping it to a file first (using python tarfile module isn't any better, as it doesn't support sparse files).

Some benchmark about tar/gzip/pigz:

[user@testvm ~]$ truncate -s 1G sparse.file
[user@testvm ~]$ time tar cS sparse.file |wc -c
10240

real    0m0.041s
user    0m0.002s
sys 0m0.030s
[user@testvm ~]$ time gzip < sparse.file |wc -c
1042069

real    0m15.087s
user    0m14.211s
sys 0m0.881s
[user@testvm ~]$ time pigz < sparse.file |wc -c
1171473

real    0m9.444s
user    0m31.015s
sys 0m2.699s

andrewdavidwong commented 7 years ago

Attack not mitigated by any of those, is replacing whole VMs with the same VM from older/newer backup (created with the same passphrase). This is not much different than replacing the whole backup archive. Can be mitigated by user by using different passphrases for different backups (like append a number, or a date).

That's a good point (added to documentation).

IMHO, neither variety should be considered a critical attack, since in any case a backup authenticated by the user's passphrase is trusted insofar as the user has chosen to create the backup using that passphrase.

One thing to note is that it's possible to "DoS" someone who uses a different passphrase for each backup (e.g., date appended) by changing around the file names of their backups. This is a different kind of DoS from simply deleting all of their backups, since the victim won't realize what's happened unless/until they try to restore from one of the backups.

Some benchmark about tar/gzip/pigz: [...]

How should these results be read? For example, what's the difference between "real" and "user"?

marmarek commented 7 years ago

One thing to note is that it's possible to "DoS" someone who uses a different passphrase for each backup (e.g., date appended) by changing around the file names of their backups. This is a different kind of DoS from simply deleting all of their backups, since the victim won't realize what's happened unless/until they try to restore from one of the backups.

Mostly the same as filling the backup file with junk data...

Some benchmark about tar/gzip/pigz: [...]
How should these results be read? For example, what's the difference between "real" and "user"?

"real" is actual time spent on the operation (end_time-start_time), "user" is CPU time used (including multiple cores etc - 1s of fully utilizing 4 cores is 4s "user" time and 1s "real" time). "sys" is time spent on system calls (kernel code).

Rudd-O commented 7 years ago

@marmarek the gains from sparse file storage aren't so much on the read (backup) side (though, unlike your short benchmark, the are quite big if you have rotational disks like I do). They are mostly on the write side. When your system has to write an enormous file to disk, and allocate those zeroes that should be unallocated, you end up spending a MONSTROUS amount of disk space and disk activity just to store those zeroes. On a system like Qubes, which relies on thin provisioned storage, getting rid of sparse file storage is a bad idea.

Maybe a purpose-built, short C or Go program, that reads from device files and writes tar format to its output, is the right thing to use here. It avoids using tar directly, it can detect rows of zeroes and output them as sparse blocks, and it isn't needed during restore (as you can use tar directly in that case). Those are my thoughts. What do you think?

marmarek commented 7 years ago

@marmarek the gains from sparse file storage aren't so much on the read (backup) side (though, unlike your short benchmark, the are quite big if you have rotational disks like I do). They are mostly on the write side. When your system has to write an enormous file to disk, and allocate those zeroes that should be unallocated, you end up spending a MONSTROUS amount of disk space and disk activity just to store those zeroes. On a system like Qubes, which relies on thin provisioned storage, getting rid of sparse file storage is a bad idea.

This isn't a problem. dd conv=sparse in the middle does the trick.

Maybe a purpose-built, short C or Go program, that reads from device files and writes tar format to its output, is the right thing to use here. It avoids using tar directly, it can detect rows of zeroes and output them as sparse blocks, and it isn't needed during restore (as you can use tar directly in that case). Those are my thoughts. What do you think?

This is what I'm currently exploring, as I've failed to find any other method (tried many tar implementations, other archive formats etc).

Rudd-O commented 7 years ago

Maybe take a look at Go for that custom program. It's batteries-built-in, it's very efficient, it's a safe language. It's got what you need.

Look at what I wrote in it during the past few days: https://github.com/Rudd-O/curvetls .

Rudd-O commented 7 years ago

Come to think of it, the same crypto primitives I am using in the program above (Go's implementation of NaCL secretbox) can be used to seal disk image contents in tamper-resistant containers. You really should check it out — it doesn't have the cache / timing leaks that AES has, it's Salsa and Poly, very good stuff that has a number of implementations and is not known to be weak.

Presumably, the key you pass to secretbox.Seal would be the output of scrypt's hash function.

Nice, eh?

Rudd-O commented 7 years ago

I'm actually writing two demo programs to explain what I mean. Super simple, for you to read. Gimme 15 more minutes.

Rudd-O commented 7 years ago

There you go: brain dead simple:

https://github.com/Rudd-O/simple-nacl-crypto

The only remaining thing to do, is to write io.Reader and io.Writer that will "packetize" rows of zeroes (as sparse files are wont to contain) and package that data into the secret boxes. It's fairly easy to do, and the Go implementation of files allows seeking, thus it allows constructing sparse files on disk.

Rudd-O commented 7 years ago

Great news:

Though I have to go to sleep and I still need to put a few finishing touches on it, the encryptor and decryptor programs have evolved to pack zeroes (512-byte blocks to be accurate) in a run-length format (that should not be vulnerable to malicious manipulation, because verifiable encryption goes around it).

A file 1 GB in size reduces itself to about 21 bytes. And it's those 21 bytes that get encrypted. No need to do gzip or anything of the sort. Of course, this packing format can in principle be piped to gzip for compression as well.

Naturally, the decoder will use disk seeks to skip writing zeroes as it decodes. This will give us sparse files on decryption for free.

I will finish the decoder for this packing format later today. Right now I must sleep.

marmarek commented 7 years ago

Come to think of it, the same crypto primitives I am using in the program above (Go's implementation of NaCL secretbox) can be used to seal disk image contents in tamper-resistant containers. You really should check it out — it doesn't have the cache / timing leaks that AES has, it's Salsa and Poly, very good stuff that has a number of implementations and is not known to be weak.

@Rudd-O please don't. Writing our own program to handle crypto is the last thing we want to do, somewhere near "inventing our own crypto algorithm". Actually this very ticket is result of "being smart" and using openssl enc/openssl dgst directly, instead of some higher layer application, designed by experienced cryptographer.

Rudd-O commented 7 years ago

@marmarek I'm not "writing my own crypto". I'm merely writing a program that wraps well-tested cryptography. NaCL's secretbox is that higher layer crypto (a layer above enc + dgst, with proper authentication and handling of streams), designed by experienced cryptographers. My program only uses it.

Anyway, you should know that the point of these programs isn't to be used as full-fledged backup programs — they are meant as demos within a memory-safe language of (a) crypto box (b) sparse file handling. Much like scrypt enc and scrypt dec are meant to be demos of the scrypt hashing algorithm, and they are not meant to be fully-fledged encryption programs. I'm not going to expand them into backup programs.

marmarek commented 7 years ago

@Rudd-O But you are inventing own file format. The other problem is introducing new language into code base. While I see why you propose Go, we should stick to those currently used (Python, C). Otherwise maintaining and auditing it would be a nightmare (good luck finding skilled developer fluent in Go, Python, C, Ocaml, Ruby and whatnot). Please don't go offtopic here on advocating why Go is better (even when technically you may be right) - use mailing list for this.

Rudd-O commented 7 years ago

I understand. There seems to be a misunderstanding here.

I'm not saying "use this code as part of the backup for Qubes".

I'm saying that you should look at this a demo — demo, key word, I used it repeatedly — of how:

a crypto solution (not a primitive) like NaCL secretbox can be used to safely store files
a format for a backup can be made such that it is secure and not tamperable
a file can be packed as sparse almost effortlessly

Anyone is 100% free to look at how the code solves these three problems, and write a Python implementation of the same concepts. That can then be used in Qubes.

Note that you are free to use the code directly, if you later change your mind.

Edit: the demo project is done. It encrypts and decrypts files, packing and unpacking sparse files. The code won't let tampered files screw with the computer — security-critical properties are validated before data is parsed. I'm happy with how it turned out. See for yourself:

[user@machine simple-nacl-crypto]$ dd if=/dev/urandom bs=30M of=original count=1 seek=21+0 records in
1+0 records out
31457280 bytes (31 MB, 30 MiB) copied, 2.19849 s, 14.3 MB/s
[user@ring2-projects simple-nacl-crypto]$ encbufsize=1048576 ; decbufsize=1048576 ; make && (bin/nacl-crypt -s -b $encbufsize enc original encrypted abc ; (echo ---------- ; bin/nacl-crypt -b $decbufsize dec encrypted new abc ) ; echo ;  (md5sum original new ; ls -la original ; ls -la  encrypted ; ls -la new ; du original ; du encrypted ; du new ))
GOPATH=/home/user/Projects/simple-nacl-crypto go install github.com/Rudd-O/simple-nacl-crypto/cmd/`echo bin/nacl-crypt | sed 's|bin/||'`
----------

ae76a32ca5b75f4c4f276a2f08750bc7  original
ae76a32ca5b75f4c4f276a2f08750bc7  new
-rw-rw-r-- 1 user user 94371840 Sep 29 02:02 original
-rw-rw-r-- 1 user user 31458320 Sep 29 02:02 encrypted
-rw-rw-r-- 1 user user 94371840 Sep 29 02:02 new
30720   original
30724   encrypted
30724   new

marmarek commented 7 years ago

Here is draft of emergency backup restore v4, which is informal backup format specification. It uses tar for storing sparse files, but encrypted and integrity protected using scrypt utility.

Rudd-O commented 7 years ago

I dislike the use of tar, to be honest. tar takes forever when reading a file that has sparse sectors, because it has to read the entire file before actually beginning to spit out the data to the calling process. A utility that was written for the purpose, which doesn't have this problem, and made available on Github for emergency restore purposes, should be much better.

marmarek commented 7 years ago

I see. As for reading entire file - bsdtar does it better (for the same file format). But it works only for sparse files, not LVM. Not sure how it works on btrfs. In fact I think it is impossible to effectively get LVM thin volume content (without reading it all). But if possible, it can be implemented in our tool. And also - as tar tool can't get block device content, I've written simply python script for it: https://github.com/marmarek/qubes-core-admin/blob/core3-backup/qubes/tarwriter.py Extraction (either normal, or emergency) is still handled by standard tool. As discussed in this year+ long thread, the best compromise for encryption + integrity protection is to use scrypt tool, I don't want to reinvent anything here.

Rudd-O commented 7 years ago

On 10/11/2016 10:32 PM, Marek Marczykowski-Górecki wrote:

I see. As for reading entire file - |bsdtar| does it better (for the same file format). But it works only for sparse /files/, not LVM. Not sure how it works on btrfs. In fact I think it is impossible to effectively get LVM thin volume content (without reading it all). But if possible, it can be implemented in our tool. And also - as |tar| /tool/ can't get block device content, I've written simply python script for it: https://github.com/marmarek/qubes-core-admin/blob/core3-backup/qubes/tarwriter.py Extraction (either normal, or emergency) is still handled by standard tool. As discussed in this year+ long thread, the best compromise for encryption + integrity protection is to use |scrypt| tool, I don't want to reinvent anything here.

Qubes OS should also not be using LVM AT ALL. There are no data integrity guarantees with it.

If Qubes OS used btrfs, for example, efficient clones of VMs would be trivial, cp --reflink would work, and FIEMAP (discovery of holes in VM images) would also be implementable.

tar still sucks. The file needs to be read whole because the format requires it upfront.

Scrypt is fine. It's effectively the same thing I am doing with the demo program that I posted above, except it doesn't handle sparse files.

Rudd-O
http://rudd-o.com/

QubesOS / qubes-issues