openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

ZFS Crypto support #494

Closed FransUrbo closed 7 years ago

FransUrbo commented 12 years ago

As of ZFS Pool Version 30, there is support for encryption. This part is unfortunately closed source, so an open source implementation would be required. That means it would probably not be compatible with the Solaris version, 'but who cares' :).

Illumos is apparently working on this at https://www.illumos.org/projects/zfs-crypto. The source repository can be found at https://bitbucket.org/buffyg/illumos-zfs-crypto. Unfortunately there have been no changes since the fork from illumos-gate. Should ZoL start thinking about this or should we just take the back seat?

I don't know how big a problem this would be, but 'copying' the way that LUKS (Linux Unified Key Setup) does it seems to be a good place to start.

behlendorf commented 12 years ago

I'd like to hold off on this for the moment; we have enough other work on our plate and this is a huge change! If Illumos puts together an implementation we'll happily look at integrating it. We could even use the source from ZFS Pool Version 30 if Oracle decides to release the source 6-12 months from now (unlikely but possible).

FransUrbo commented 12 years ago

If you don't mind LUKS, I might have some time to look at this in a week or two.

behlendorf commented 12 years ago

I'm OK with making it easier to layer zfs on top of LUKS, that would be nice. It's just not what most people think of when they say zfs encryption support.

FransUrbo commented 12 years ago

I was rather thinking of 'cloning'/'copying' the way LUKS works. Or rather, use the LUKS API inside ZFS. LUKS is used by 'cryptsetup' (configures encrypted block devices) and 'dmsetup' (The Linux Kernel Device Mapper userspace library). So it seems LUKS is an API for device encryption.

Using ZFS 'on top of' something like that would probably be easier but, as you say, it's not the intention...

behlendorf commented 12 years ago

It would be interesting to investigate if what you're suggesting is possible. It would result in a second version of zfs encryption which isn't compatible with the pool v30 version, but that might not be a big deal. We should be integrating the new feature flag support early next year, so it could end up as a Linux-only feature.

FransUrbo commented 12 years ago

I don't think we're ever going to be compatible with v30... Not any of us, not unless Oracle all of a sudden 'sees the light', and I'm not holding my breath on that! :)

Best would be if we could come up with a solution, that would be portable to other OS'es. Don't know how much Linux the 'Linux Unified Key Setup' is, but it's worth a look. I'll start that once I have a workable sharesmb patch.

baryluk commented 12 years ago

How about at least reverse engineering v30 format?

FransUrbo commented 12 years ago

Be my guest! Reverse engineering something, especially a crypto algorithm, isn't anywhere near as simple as it sounds!

baryluk commented 12 years ago

We know it is using SHA-256, and AES-128 with Incremental mode probably, so actually there is nothing complicated; only some on-disk metadata needs to be reverse engineered, like which bit is what, where the salt is, and where the information is stored that it is AES-128 and not 192 or 256. It should be easy. Unfortunately I do not have access to Solaris right now to test it.

FransUrbo commented 12 years ago

That DOES sound easy :). Unfortunately, we probably have to... I've spent the day looking into LUKS, but it does not seem to fit the purpose :(.

It is intended to be placed between the device and the FS. Which means it needs one device (either one physical disk or multiple disks presented as one through raid/md) where it can store data linearly... Kind of. But since ZFS is both a FS and a ... 'device mapper' (?) which has multiple devices, I doubt it will be possible to have LUKS split the data and its key storage partitions over multiple physical disks. I haven't looked at the code yet, just the specs, but that's what it looks like so far.

patrykk commented 12 years ago

Hi, "Oracle Solaris 11 Kernel Source-Code Leaked" more information:

http://www.phoronix.com/scan.php?page=news_item&px=MTAzMDE

akorn commented 12 years ago

Of course, you shouldn't look at the leaked source if you work on ZFS lest Oracle accuse you of copyright infringement.

patrykk commented 12 years ago

Yes, you are right.

baryluk commented 12 years ago

LUKS is not an option. ZFS performs encryption on a per-dataset/volume/file basis, while LUKS works at the device level. We already have crypto primitives available in the kernel, and we already have an on-disk format designed; we just need to reverse engineer it (it should be slightly easier than designing it, which in the case of crypto stuff is hard to do properly/securely). Probably the ZIL will be the hardest part.

Of course looking at leaked source code is not an option at all. I wasn't thinking about it even for a second.

dajhorn commented 12 years ago

An interim solution is ecryptfs, which can be installed on top of ZFS.

Most RPM and DEB systems have built-in management for ecryptfs, which makes it easy to configure.

For maximum performance, dedup and compression should be disabled on any ZFS dataset that hosts a crypto layer.

atoponce commented 12 years ago

http://src.opensolaris.org/source/xref/zfs-crypto

pyavdr commented 12 years ago

This ( http://src.opensolaris.org/source/xref/zfs-crypto ) looks very nice, CDDL and lots of zfs crypto stuff. Maybe we should try to cooperate with Illumos for a common port to Linux. In any case, processors with AES-NI should be supported to gain optimal performance.

maxximino commented 12 years ago

I would like to point out that the code linked above has ZPL_VERSION = 3 and SPA_VERSION=15. That's quite old!! (source: http://src.opensolaris.org/source/xref/zfs-crypto/gate/usr/src/uts/common/sys/fs/zfs.h#318) Oracle didn't merge that code until version 30 (http://hub.opensolaris.org/bin/view/Community+Group+zfs/30).

behlendorf commented 12 years ago

We should certainly work with the other ZFS implementations when any crypto work is being considered. Also, it's my understanding that the link you're referencing is to some of the early crypto work, and it was significantly reworked before being included in v30. That said, it's still probably a reasonable place to get familiar with the basic design decisions.

ryao commented 12 years ago

Here is Sun's design document for ZFS encryption support:

http://hub.opensolaris.org/bin/download/Project+zfs-crypto/files/zfs-crypto-design.pdf

We can check out the early code by doing hg clone ssh://anon@hg.opensolaris.org//hg/zfs-crypto/gate.

mcr-ksh commented 11 years ago

I would love to see that as well. crypto is an amazing feature.

FlorianFranzen commented 11 years ago

The last post was 5 months ago. Did you guys decide on anything? What is the current state?

gua1 commented 11 years ago

@FloFra This is marked "Milestone: 1.0.0" and I think the zfsonlinux developers hope that illumos would have implemented crypto support by then or I suppose if illumos hasn't then they would work on it. My interpretation is "to be done in the distant future".

grahamperrin commented 11 years ago

In the ZFS on Linux area

https://groups.google.com/a/zfsonlinux.org/forum/?fromgroups#!searchin/zfs-discuss/crypto

https://groups.google.com/a/zfsonlinux.org/forum/?fromgroups#!searchin/zfs-devel/crypto leads to pool version 33, zfs-crypto (2011-12-22).

In the illumos area

Whilst https://www.illumos.org/projects/zfs-crypto is not recently updated, there's http://wiki.illumos.org/display/illumos/Project+Ideas (2012-07-18)

Device drivers

Niagra Crypto … Re-implement the crypto acceleration drivers for the SPARC sun4v cpus.

File systems

ZFS encryption … Import and update the work started by Darren Moffat to provide cryptographic support for ZFS.

I'll align myself with the latter.

Elsewhere

In irc://irc.freenode.net/#zfs on 2013-11-09, someone drew attention to code on GitHub. We acknowledged the need for someone to audit that code, so I didn't follow the link.

ryao commented 11 years ago

@grahamperrin mentioned some encryption code on github. It was determined on the mailing list that it includes code from the Solaris 11 leak and is therefore encumbered. We will not be using it.

sempervictus commented 10 years ago

I believe the encryption code referred to is located at https://github.com/zfsrogue/zfs-crypto. I've been able to merge and build both the SPL changes (https://github.com/zfsrogue/spl-crypto) and the ZFS branch. It doesn't merge well with the pending ARC cache changes though, so it's not being tested on my current labrats. I'll post results once I free up a test cycle.

FransUrbo commented 10 years ago

@sempervictus I would be very, very careful using that code (IF you can get it to merge). There's a very, very (yes, yes! :) high risk that that code is the source of the loss of my pool (16TB, almost full).... We have not been able to absolutely pinpoint exactly why, but I've been running that code on my live server for almost a year, upgrading as I went, and somehow, somewhere it (the crypto code) messed up the metadata so I/we might be unable to fix it...

FransUrbo commented 10 years ago

Also, Rogue isn't really maintaining the code any more (ZoL has gone through a lot of changes since he created the repository).

lundman commented 10 years ago

Actually he is, updating the osx one when I ask etc. He's most likely waiting on 0.6.3 to tag and release.

You can ask him to do a merge anytime if there is something you want sooner.

FransUrbo commented 10 years ago

@lundman All I've seen is that he has 'come in' once a month (or 'every now and then') accepting patches, sometimes without any review. There were a couple of pulls I wanted to discuss before merge (they required other ZoL pulls I did to be accepted first, which they weren't/haven't been yet - and might not ever be). And after the core update of ZoL (some Illumos merge), the update wasn't really up to snuff. Considering how fast ZoL is moving right now, I think it's important that he keeps a much closer eye on what's happening, not just accepting others' pulls willy-nilly. And basically, that's all he does now... Don't misunderstand me, I want his code as much as the next guy, but the core issue is that that code somehow, somewhere messed up my pool. My take on all that is that it's because he isn't maintaining it properly. I don't mean that the code shouldn't be considered, just that care should be taken before using it...

lundman commented 10 years ago

You are always running a risk pulling from master, in any repo. The master is the edge of development; he tags releases that are considered stable. It is appreciated that you are brave enough to help debug master, of course. But even main ZOL made pool incompatibilities, so you run the risk there too :) He only pulls from ZOL, not really "patches willy nilly". You and I might be the only pull requests there are.

Have you considered that it is up to us to maintain it? He's just hosting it to ensure the "sights" are off us open source developers.

FransUrbo commented 10 years ago

Fair enough, and I'm not blaming him (or you) in any way for my current predicament - I'm well aware that I take a chance every time I use code that isn't fully tested. I'm just saying that anyone should be very, very careful using it, because I (might - ohhh, I really hope it's "might" and not "have" :) have lost my pool and you haven't. I blame myself for not being careful enough. You and I might also be the only ones actually using and testing it thoroughly :). I don't know enough about the core code (of either ZoL or ZFS-Crypto) to be able to help in that part. All I can do is submit issues and some documentation and simpler pull requests. Rogue might want to wait for 0.6.3, but considering how fast ZoL is moving, a much closer eye is needed. And if he's not maintaining it, will you? I just don't have the know-how to do it... Someone needs to keep it up to date with ZoL, not just around every tagging of ZoL... And 'someone' also needs to figure out why exactly I lost my pool, so that it can't happen again to someone else. Might be too late for me (I really hope not), but if it can happen to me, it is very possible that it can happen again...

sempervictus commented 10 years ago

Having done some basic digging around the issue, it looks like there are many people offering suggestions for how to get going on this, but everyone's punting to see who jumps off this cliff first.

@FransUrbo: agreed 100% for any production/client-oriented use. I've left a comment in the issues section about merging in current code + the ABD patches; it looks like we're plain missing sha-mac here. I've been doing my own merges much like you, getting a decent sense of the logic flow...

@devs: a basic crypto implementation, especially one compatible with Solaris, would be very useful from a "marketing" standpoint - to get traction we need better corporate penetration, and most companies don't want to go into new storage ventures without the crypto box checked off. I've been giving folks ZFS/dm-crypt, but it's not ideal to work with an intermediate block layer. If a slow-as-hell but stable compatibility layer is implemented, illumos, here, wherever, it would at least give clients seeking to use specific platform features of non-Solaris OSes a way to maintain consistent data stores.

@all: has anyone asked Oracle directly what their litigious stance on this code-leak mess is? Companies (Google) have been known to swear off present/future litigation for derived works, and if there's any chance of that here, it would be of significant help in terms of maintaining an inter-operable data format.

FransUrbo commented 10 years ago

Having done some basic digging around the issue, it looks like there are many people offering suggestions for how to get going on this, but everyone's punting to see who jumps off this cliff first.

I think it's more the fact that there's really not anyone competent enough to start it.

Dabbling with encryption is hard and difficult enough, but designing it!? Very few people are knowledgeable enough to dare to do this. And if there is such a person, he/she would need extensive knowledge of ZFS as well...

I seriously doubt that there is such a person, and if there is, this person is obviously way too busy at the moment with other things (like making sure the open source version of ZFS - any os/dist - is stable and functioning properly).

I've been doing my own merges much like you, getting a decent sense of the logic flow...

Do realize that you've just cut yourself off from the 'True Open Source' version of crypto in ZFS.

@all: has anyone asked Oracle directly what their litigious stance on this code-leak mess is?

Doubt it. Feel free to offer yourself up on their altar :)

Asking doesn't cost anything but time. I'm just too convinced of the answer to bother...

Companies (Google) have been known to swear off present/future litigation for derived works

Yes, but they on the other hand have a company statement that literally says 'Do No Evil'. They also have quite a good and long track record of working with/for Open Source.

Oracle doesn't... :). They on the other hand have proved that their 'secret' company statement must be 'Do As Much Evil As Possible'... (no smiley on that!)

lundman commented 10 years ago

A large part of the Sun ZFS-Crypto work is in the somewhat advanced kernel key store, where you can rekey your dataset at any time and it handles that work for you. I would advocate not being Solaris compatible; you won't be able to import pool v30 without handling hybrid v29 first anyway.

I had an informal chat with a ZFS dev at the last conference, and the idea was that having DMU layer encryption would be a good start. Something you could probably knock out in a couple of days, like at a hackathon. Perhaps that could be suggested at the next Open ZFS Day. Although I wrote the Solaris kernel crypto API layer for Linux SPL, I don't think I have the skill to design a new DMU layer.

The ABD work is a bit of a porting hassle, mainly as it is a linux only feature and really should be in SPL layer (if it can) as it diverges greatly. It is also strange they add new scatterlist, when ZFS's built in UIOs already are, and are supported in SPL. Even the OS X is stuck on this work, the future is uncertain for using ZOL as upstream. But it's not like we even asked for permission before using them as upstream ;)

behlendorf commented 10 years ago

@lundman It's very likely some form of the ABD changes will be going back upstream to OpenZFS. The idea is to do this in as compatible a way as possible, so if you have concerns please post them in the pull request.

lundman commented 10 years ago

Ah if they are, I will get on another merge here. It was actually nothing difficult (as I assumed), bar the unexpected smp_rmb() smp_wmb() and strdup() (not spa_strdup). I withdraw my rant :)

lundman commented 10 years ago

Looks like rogue did another merge two days back, no surprises there, looks like it went normally. As for ABD specifically, I will wait for it to be in master before we tackle the merge.

ryao commented 9 years ago

Here are some links that might be useful for this:

https://hg.openindiana.org/upstream/oracle/onnv-gate-zfscrypto/

https://blogs.oracle.com/darren/entry/zfs_encryption_what_is_on

http://www.oracle.com/technetwork/articles/servers-storage-admin/manage-zfs-encryption-1715034.html

https://docs.oracle.com/cd/E26502_01/html/E29007/gkkih.html

http://www.snia.org/sites/default/files2/sdc_archives/2008_presentations/wednesday/DarrenMoffat_ZFSEncryption.pdf

https://www.usenix.org/legacy/events/fast09/wips_posters/moffat_wip.pdf

In particular, the GRUB2 source code has a comment saying that its encryption support was implemented mostly using the above blog post as a reference. In addition, the GRUB2 source code shows that bit 62 is used to indicate a block pointer to an encrypted block and that the hidden dataset property is called "salt".
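To make the bit-62 flag concrete, here is a minimal Python sketch. It assumes the flag sits in some 64-bit word of the block pointer; which word that is, and the names `BP_ENCRYPTED_BIT`, `bp_is_encrypted` and `bp_mark_encrypted`, are my own placeholders and are not taken from GRUB2 or ZFS.

```python
# Hypothetical sketch: test and set bit 62 of a 64-bit block-pointer word.
# Which word of the block pointer carries the flag is an assumption here.
BP_ENCRYPTED_BIT = 62            # bit position reported from the GRUB2 source
MASK_64 = (1 << 64) - 1          # keep results within 64 bits


def bp_is_encrypted(word: int) -> bool:
    """Return True if bit 62 is set in the 64-bit word."""
    return bool((word >> BP_ENCRYPTED_BIT) & 1)


def bp_mark_encrypted(word: int) -> int:
    """Return the word with bit 62 set, truncated to 64 bits."""
    return (word | (1 << BP_ENCRYPTED_BIT)) & MASK_64
```

A reader of the on-disk format would run a check like `bp_is_encrypted()` before deciding whether a block needs decrypting at all.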

Upon some poking around, it seems that the encryption property is inheritable, but not editable. Send streams of encrypted filesystems are not encrypted, but receiving into a non-encrypted filesystem is not permitted. Also, the buffers from encrypted datasets are not written to L2ARC. Each dataset has a key chain in ZAP that stores the actual keys used for encryption along with information on which algorithm was used, which txgs were in effect at the time, etcetera. Decryption requires figuring out which key was used for a given txg, while encryption requires using the latest. You can read more details at that blog post.
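The key chain lookup described above could be sketched roughly like this in Python. This is only an illustration under my own assumptions: each entry records the first txg at which its key took effect, and the names `KeyEntry` and `Keychain` are hypothetical. It does not reflect the actual ZAP layout.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEntry:
    first_txg: int   # assumed: first transaction group this key applies to
    algorithm: str   # label only, e.g. "aes-128-ccm"
    key: bytes       # key material (illustrative)


class Keychain:
    """Per-dataset list of keys, ordered by the txg at which each took effect."""

    def __init__(self) -> None:
        self._entries: list[KeyEntry] = []

    def add_key(self, entry: KeyEntry) -> None:
        # A rekey only ever appends a key for a later txg.
        if self._entries and entry.first_txg <= self._entries[-1].first_txg:
            raise ValueError("keys must be added in increasing txg order")
        self._entries.append(entry)

    def key_for_encrypt(self) -> KeyEntry:
        # Encryption always uses the latest key.
        if not self._entries:
            raise LookupError("keychain is empty")
        return self._entries[-1]

    def key_for_decrypt(self, txg: int) -> KeyEntry:
        # Decryption picks the newest key whose first_txg is <= the block's txg.
        txgs = [e.first_txg for e in self._entries]
        i = bisect.bisect_right(txgs, txg) - 1
        if i < 0:
            raise LookupError(f"no key in effect at txg {txg}")
        return self._entries[i]
```

With keys added at txg 10 and txg 50, a block born in txg 49 would be decrypted with the first key, while new writes would use the second.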

ryao commented 9 years ago

Here are some more documents that might be useful:

http://docs.oracle.com/cd/E26502_01/html/E29007/gkkih.html

http://www.oracle.com/technetwork/articles/servers-storage-admin/solaris-zfs-encryption-2242161.html

http://docs.oracle.com/cd/E23824_01/html/821-1462/zfs-encrypt-1m.html

http://wenku.baidu.com/view/52a44c24a5e9856a5612609b.html

The last two are the Solaris man page and original OpenSolaris design document respectively.

ryao commented 9 years ago

Some discussion of full disk encryption in #gentoo-dev on freenode led me to poke around. I wrote an outline of the disk format changes, mostly via header changes, here:

https://github.com/ryao/zfs/tree/crypto

ryao commented 9 years ago

The Linux encryption API is GPL-exported, so we cannot use it without having a fairly unpleasant conversation about whether the GPL-exported symbol restriction is correct. The OpenBSD/FreeBSD AES code is "public domain". We do not have the rights to public domain code in jurisdictions that do not recognize public domain, such as France, so we cannot use that either. The OpenSSL AES code is under the "OpenSSL License", which is a 6-clause variant of the 4-clause BSD license, which is problematic.

That said, the Illumos AES code is under the CDDL, so anyone implementing encryption support should have no issue reusing it in ZoL:

https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/crypto/io/aes.c

ryao commented 9 years ago

To elaborate on my previous comment, implementing this would require porting the Illumos kernel cryptographic library to a Linux kernel module so that we can include the sys/crypto/api.h header:

https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/sys/crypto/api.h

That code is also able to be built in userland, which we would need to preserve so that zdb and ztest would be able to link to it.

Also, some of the code in the following files could likely be reused with modifications:

https://hg.openindiana.org/upstream/oracle/onnv-gate-zfscrypto/file/00e02583c3c3/usr/src/uts/common/fs/zfs/sys/zio_crypt.h

https://hg.openindiana.org/upstream/oracle/onnv-gate-zfscrypto/file/00e02583c3c3/usr/src/uts/common/fs/zfs/zio_crypt.c

For instance, CBC mode was replaced with GCM mode and keydatalen == 24 (192-bit) is supported. Also, the on-disk per-dataset keychain was added after OpenSolaris was killed.
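As a small illustration of the key length point, a check like the following maps a key data length in bytes to the AES variant. The helper name `aes_bits_for_keydatalen` is made up for this sketch; only the 16/24/32-byte to 128/192/256-bit relationship is standard AES.

```python
# Hypothetical helper: map a key length in bytes to the AES key size in bits.
AES_KEY_BITS = {16: 128, 24: 192, 32: 256}


def aes_bits_for_keydatalen(keydatalen: int) -> int:
    """Return the AES key size in bits, or raise for an unsupported length."""
    try:
        return AES_KEY_BITS[keydatalen]
    except KeyError:
        raise ValueError(
            f"unsupported AES key length: {keydatalen} bytes"
        ) from None
```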

lundman commented 9 years ago

The code I did to help rogue, which is only in SPL, does the first step towards implementing the IllumOS crypto API in the Linux kernel. It can be whatever license you prefer, since I wrote it. Although I remember at the time the Linux crypto call for scatterwalk was wrong, so I had to copy in the Linux source files just to comment out a couple of calls. The ZFS_COPYDST comments can be ignored, as having our own copy avoids all that. This should possibly be dusted off again to see if Linux has fixed it.

All the sources in SPL are in no way tainted by the Solaris sources rogue used.

In the SPL code, I convert the IllumOS UIO list of buffers (1 or 2 buffers are used in crypto) and create an equivalent Linux Scatterlist, to be used with the Linux crypto API calls. (crypto_map_buffers())

I also ported ZOL spl-crypto to FreeBSD, but since FreeBSD is missing a crypto API, I had to include the AES sources as well. Similarly, OSX's spl-crypto is a copy from FreeBSD. Those two could be improved by using optimised code (or, on OSX, the new kernel crypto API).

But if we are dusting this off and doing our own crypto (and we should), we might consider doing our own thing and not being compatible with Oracle pool v30. Just a thought.

https://github.com/zfsrogue/spl-crypto/tree/master/module/spl

https://github.com/zfsrogue/spl-crypto/blob/master/module/spl/spl-crypto.c

https://github.com/zfsrogue/freebsd-crypto/blob/master/sys/cddl/compat/opensolaris/kern/opensolaris_crypto.c

https://github.com/zfsrogue/osx-spl-crypto/blob/master/module/spl/spl-crypto.c

ryao commented 9 years ago

@lundman The Gentoo licensing team has instructed me to avoid touching that code. It cannot be reused without killing the first-party Gentoo ZoL packaging.

This is not on my list of things to do for work, so I make no promises of implementing it. I am casually researching this because the prospect of evil-maid attacks made me interested in it.

That said, I have an idea (from a user request in #zfsonlinux on freenode) for a step beyond v30 encryption. Specifically, trusted boot. The idea would be to PGP sign each uberblock entry and ZIL block with public/private key cryptography. The key pair would be stored encrypted within the pool while the public key would be stored inside the kernel/initramfs. The boot process would then verify the uberblock entries and ZIL keys using its copy of the public key. Doing this would allow us to verify that the entire pool has not been subject to tampering. This is technically a separate idea, but I would prefer it to come after we can ensure confidentiality.

lundman commented 9 years ago

I wrote spl-crypto.c, so it can have whatever license you, or Gentoo, want. As for ccm.c, gcm.c and ctr.c, those were copies from Linux at the time. A test should be done to see if Linux fixed the problems I encountered, but even if it has not, fresh files can be copied. Those exact files are not needed. The header files are from IllumOS github.

ryao commented 9 years ago

@lundman The Gentoo licensing team explicitly told me that we cannot use spl-crypto.c because it was written against the leaked encryption code. I am not even allowed to look at it according to them. It violates the notion of doing clean room reverse-engineering.

lundman commented 9 years ago

No, spl-crypto has nothing to do with leaked encryption code. I wanted to be able to call IllumOS crypto API from ZOL. That required implementing crypto_mech2id() and so on. But ultimately, Linux will do whatever Linux wants, even if it doesn't make sense. Good luck.

ryao commented 9 years ago

@lundman Does that code use GPL-exported symbols?

Also, WRT encryption support, we should probably zero memory containing encryption keys during the shutdown process to provide some protection against cold boot attacks.
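The zero-on-release pattern might look like the following Python sketch. It is only meant to show the shape of the idea: the names `wipe_key` and `loaded_key` are hypothetical, and in Python the immutable `raw` bytes passed in may still linger in memory, so real kernel code would need a memset that the compiler cannot optimise away.

```python
from contextlib import contextmanager


def wipe_key(buf: bytearray) -> None:
    """Overwrite the key bytes in place with zeros."""
    for i in range(len(buf)):
        buf[i] = 0


@contextmanager
def loaded_key(raw: bytes):
    """Hold key material in a mutable buffer and zero it on exit."""
    buf = bytearray(raw)   # mutable copy that we are able to overwrite
    try:
        yield buf
    finally:
        wipe_key(buf)      # runs on normal exit and on exceptions
```

The same hook could be called at pool export or at shutdown, whichever point ends up owning the key lifetime.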

sempervictus commented 9 years ago

@ryao: given the information @lundman provided on the provenance of the SPL components, is it feasible to push that back to the Gentoo licensing team for re-evaluation? Given their job description, if they believe the code to be tainted, it's not too likely they themselves have read through it, or done the requisite validations to ascertain whether it was or was not built atop "tainted" code.

100% with you on preventing cold boot and other memory forensic methods of key extraction, and I would probably go a step further and suggest we look at something like Tresor to keep clear-text keys out of main memory altogether. If that's infeasible for performance/compatibility reasons, we might look at binary encoding techniques similar to those used by red teams for payload delivery to hide out in system memory, with appropriate jumps to decoders and key segments built into the encoded memory space (this would need to be in an executable region or bypass NX). If the key stubs are instantiated during pool import, then we should end up with a reasonably high entropy of possible encoded key+decoder blobs for the same keys by utilizing polymorphic encoders/decoders, making attacks designed to extract and implement those keys much harder. If we were to stick to the zero-out approach, we'd want to do it at export, and probably still use transient keys during the kernel's interactions with a pool.

The ship for v30 compatible encryption has likely sailed, and sunk to the depths of the Marianas Trench - current production workloads using crypto have largely migrated into Oracle's codebase (at least the ones we see), and the extra effort for backward compatibility may help only a small percentage of the hardcore OSolaris users still running v30 pools in their workshops (prying them away from opensolaris is like trying to take a fresh caribou from a hungry lion: they resist, with vigor).

OpenZFS approaches however will need to be cross platform, and what @lundman suggests sounds (to me anyway) like a massive move toward providing the foundation to implement an encryption scheme atop OpenZFS. Since Illumos is driving development of ZFS, we should at least create interfaces compatible with theirs and returning the same (or compatible) data types for cryptographic functionality. Sounds like that's what @lundman has already done, and that would not touch CDDL labeled code released by Oracle as it comes from Illumos.
