xcp-ng / xcp

Entry point for issues and wiki. Also contains some scripts and sources.
https://xcp-ng.org

Support full system encryption / encrypted storage #463

Open fetzerms opened 3 years ago

fetzerms commented 3 years ago

Vision: In sensitive environments where encryption of all data is required, it would be very handy to have a fully disk-encrypted xcp-ng install, preferably with remote-unlocking capabilities.

This idea requires two kinds of encryption:

Current state:

Desired state:

Ideas:

What do you guys think?

stormi commented 3 years ago

I don't know all the technical details, but I think this wouldn't be trivial to do.

I think an encrypted storage repository would be something doable, probably at a significant performance cost. What additional security would it bring over encrypting VM disks themselves?

About system encryption, I'm not entirely sure what sensitive data dom0 would contain, since all it manages is VMs and resources. Actual data is in the VMs.

fetzerms commented 3 years ago

You might be right about dom0. I was thinking about access logs, bash history and other things that might contain sensitive info. Often an "everything is encrypted to be safe" approach is preferred.

About encryption of storage: This would allow encrypting VMs whose OS does not offer any kind of encryption, be it some old DOS VM or some custom OS. Furthermore, the storage repository only needs to be unlocked once, and all the VMs can then boot/reboot automatically without worrying about encryption anymore. So it's completely transparent for the guest OS.

The performance penalty is there for sure. I am using cryptsetup to do this and it works pretty nicely.

stormi commented 3 years ago

As long as you can create an encrypted filesystem, you should be able to use it as a storage repository using the file SR type, or any more appropriate SR type if one exists (such as zfs for... ZFS obviously). I don't know if it has already been tested, or what the performance would be.

If you're using a shared storage such as NFS, I suppose you could very well encrypt everything directly on the file server, too.

nagilum99 commented 3 years ago

@stormi: The costs should be minimal these days. It adds a bit of latency, but throughput is above that of all common storage devices. You just need to make sure the CPU has AES-NI or similar support from Intel/AMD: such CPUs encrypt and decrypt several GB/s without tying up too many CPU resources.
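One way to check for this on a host is to look for the `aes` CPU flag. The snippet below is only a sketch: the function name `has_aes_flag` is made up for this example, and it assumes an x86 Linux dom0 whose /proc/cpuinfo contains a `flags` line.

```shell
#!/bin/sh
# Hypothetical helper (not part of XCP-ng): succeeds if a cpuinfo-style
# file advertises the AES instruction set.
has_aes_flag() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Only look at "flags" lines and match "aes" as a whole word,
    # so that flags such as "vaes" do not count.
    grep -E '^flags[[:space:]]*:' "$cpuinfo" 2>/dev/null | grep -qw aes
}

if has_aes_flag; then
    echo "AES instructions available"
else
    echo "no AES flag found: expect a larger encryption overhead"
fi
```

The flag only says that the hardware instructions exist; actual throughput still has to be measured on the host itself.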

stormi commented 3 years ago

Thanks for the insight.

Update: However I always more or less expect to find out something we hadn't foreseen when used in the context of virtualization :)

fetzerms commented 3 years ago

As long as you can create an encrypted filesystem, you should be able to use it as a storage repository using the file SR type, or any more appropriate SR type if one exists (such as zfs for... ZFS obviously). I don't know if it has already been tested, or what the performance would be.

If you're using a shared storage such as NFS, I suppose you could very well encrypt everything directly on the file server, too.

Currently, we can already do this by hand. I encrypt my local drive with cryptsetup and set up an LVM storage repo on top of it. The performance looks fine to me, but I did not run any benchmarks. However, as this is not supported, I fear that one day it might stop working.

It would be handy to be able to set up and manage encrypted SRs directly with xcp-ng (and through xoa / xcp-ng center). One step further would be to have some sort of KMS support, like VMware does. But that is something for the future.

About system encryption: I am not sure what needs to be done to "transform" a CentOS install into xcp-ng, but having a fully encrypted CentOS install is quite straightforward. With xcp-ng it gets more complicated because of the update ISO etc. For yum-style upgrades it shouldn't be too complicated.

olivierlambert commented 3 years ago

A good recap: it might be easy to set up once, but then it's really hard to manage everything around it (keys, ISO upgrade etc.)

If I wanted to go that route, I would:

  1. Modify the installer to allow encrypted XCP-ng during install
  2. Modify the upgrader and enter the key to be able to make the upgrade
  3. Key management in XAPI exposed in XO
  4. How to deal with decrypt on boot?

That's a lot of work, but fortunately it's a community project, contributions are really welcome!

rjt commented 3 years ago

Would XOSAN have mechanisms that make key management across the xen hosts any easier?

Does CEPH?

olivierlambert commented 3 years ago

I can't see how it's connected to our questions here, but maybe I missed something?

fetzerms commented 3 years ago

Thank you @olivierlambert for summarizing what needs to be done. I think encrypted storage could be added relatively easily, as I am currently doing it by hand using cryptsetup. Encrypted storage does not really interfere with updates/upgrades etc., but it (currently) needs to be unlocked remotely via ssh.

olivierlambert commented 3 years ago

@fetzerms so at each boot, you need to connect to your host, unlock the SR, and "reconnect" it (since it couldn't be mounted without the passwd). Is that right? Otherwise, feel free to explain your current process :+1:

nagilum99 commented 3 years ago

Ideally you would work with some auth service on the hosts that could unlock the server that is coming up, as long as it belongs to the pool.

fetzerms commented 3 years ago

@fetzerms so at each boot, you need to connect to your host, unlock the SR, and "reconnect" it (since it couldn't be mounted without the passwd). Is that right? Otherwise, feel free to explain your current process

Yes exactly. Actually my setup is as follows:

The steps from the key server could also be done manually.

*) just some server which stores the keys and has ssh keys to connect to the xcp-ng instances.

DSJ2 commented 3 years ago

Have you looked at clevis and tang?

fetzerms commented 3 years ago

@DSJ2 thanks, that's a very good idea. I had actually never heard of clevis and tang before.

olivierlambert commented 3 years ago

@fetzerms feel free to share the results of your experiments! If your work can be streamlined/automated/integrated in XCP-ng, we'll be happy to assist :+1:

TylerDurden2019 commented 3 years ago

@fetzerms Do you have step by step instructions to encrypt local disks with cryptsetup and set up LVM storage on top of cryptsetup? I'm interested in trying this on XCP-ng 8.2. Thanks.

fetzerms commented 3 years ago

@TylerDurden2019: Sorry for my somewhat late response. I intended to do a proper write-up, but I am currently really lacking time and/or motivation. Hence, the following steps give a brief walkthrough but do not explain anything in depth.

First time setup:

1. Make sure that your local drive is not hosting an SR. Deactivate and delete the SR from xcp-ng. I suggest also using wipefs on the drive.
2. yum install cryptsetup # I think it's now pre-installed, I just checked my old scripts...
3. cryptsetup luksFormat /dev/your/local/disk
4. cryptsetup luksOpen /dev/your/local/disk data
5. xe sr-create host-uuid=<uuid> content-type=user device-config:device=/dev/mapper/data name-label="Encrypted_SR" shared=false type=lvm

Then you are done and should see the SR in xcp-ng.
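The first-time steps above could be wrapped in a small script. This is only a sketch: the `run` and `create_encrypted_sr` names and the `DRY_RUN` switch are made up for illustration, and the `-a` flag passed to wipefs is an assumption (the steps above do not say how wipefs was invoked). By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute.
# WARNING: wipefs and luksFormat DESTROY all data on the given device.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: create_encrypted_sr DISK MAPPER_NAME HOST_UUID
create_encrypted_sr() {
    disk="$1"; name="$2"; host_uuid="$3"
    run wipefs -a "$disk" || return 1              # assumption: wipe all signatures
    run cryptsetup luksFormat "$disk" || return 1  # prompts for a passphrase
    run cryptsetup luksOpen "$disk" "$name" || return 1
    run xe sr-create host-uuid="$host_uuid" content-type=user \
        device-config:device="/dev/mapper/$name" \
        name-label="Encrypted_SR" shared=false type=lvm
}

# Dry-run example: replace the placeholders with real values.
create_encrypted_sr /dev/your/local/disk data "<uuid>"
```

Keeping the dry-run default makes it harder to format the wrong disk by accident; review the printed commands before setting DRY_RUN=0.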

After reboot:

1. cryptsetup luksOpen /dev/your/local/disk data
2. pvscan && lvscan && vgscan && vgchange -ay --config global{metadata_read_only=0}
3. xe-toolstack-restart

In addition to this, I also create a LV after creating the SR and mount it as /var/log after rebooting.
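The after-reboot steps could likewise be collected in one script. Again this is a sketch under assumptions: `run`, `unlock_sr` and `DRY_RUN` are hypothetical names, and the check for an existing /dev/mapper entry is an addition that is not part of the steps above.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: unlock_sr DISK MAPPER_NAME
unlock_sr() {
    disk="$1"; name="$2"
    # Added convenience: skip luksOpen if the mapping already exists.
    if [ -e "/dev/mapper/$name" ]; then
        echo "/dev/mapper/$name is already open"
    else
        run cryptsetup luksOpen "$disk" "$name" || return 1
    fi
    # Rescan LVM, activate the volume group, then let XAPI pick up the SR.
    run pvscan && run lvscan && run vgscan &&
        run vgchange -ay --config 'global{metadata_read_only=0}' &&
        run xe-toolstack-restart
}

unlock_sr /dev/your/local/disk data
```

Run interactively over ssh, luksOpen will prompt for the passphrase; nothing here stores the key on the host.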

TylerDurden2019 commented 3 years ago

@fetzerms Thanks for writing that up. It's helpful, appreciated.

@DSJ2 Have you used clevis and tang with XCP-ng? Would you have any step by step instructions to set that up? I'm currently following the instructions here https://wiki.dev0.sh/books/homelab/page/encrypted-sr which use a USB key to automatically unlock the LUKS volume, but that defeats the purpose of encrypting it when the key is easily accessible, so I want to try clevis and tang or other methods.

TylerDurden2019 commented 3 years ago

I've tested using Clevis and a Tang server to automatically unlock the encrypted local SR on XCP-ng 8.2. Following on from @fetzerms's post above, here is what I did.

Tang Server Setup

Install more than one Tang server on multiple VMs for redundancy if needed.

Install Ubuntu and get latest updates

sudo apt update
sudo apt upgrade

Install Tang

sudo apt install tang

Set Tang to auto start on boot

sudo systemctl enable tangd.socket --now

NOTE: The service doesn't start automatically due to a bug that is supposed to be fixed in tang-7-5. The workaround is to comment out the lines that start with "After=" in /usr/lib/systemd/system/tangd.socket, as suggested in https://bugzilla.redhat.com/show_bug.cgi?id=1745177

Show the Tang server keys

tang-show-keys

Show Tang logs on Ubuntu

tail -f /var/log/syslog

Clevis Client Setup on XCP-ng 8.2

Install Clevis

yum --enablerepo=base install clevis-dracut

Enable Clevis to automatically unlock a non-root crypttab partition at boot time using a Tang server.

systemctl enable clevis-luks-askpass.path

Get the luks UUID for the encrypted device

cryptsetup luksUUID /dev/your/local/disk
xxxxxxx-0fa-4fba-a274-XXXXXXXXXXX

Edit and add to /etc/crypttab

vi /etc/crypttab

crypt0 UUID=xxxxxxx-0fa-4fba-a274-XXXXXXXXXXX none _netdev
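Instead of editing the file by hand, the crypttab entry could be generated and appended only when it is missing. This is a minimal sketch; `crypttab_line` and `add_crypttab_entry` are hypothetical helpers, not tools shipped with XCP-ng or Clevis.

```shell
#!/bin/sh
# Hypothetical helper: prints a crypttab entry with no key file ("none")
# that waits for the network ("_netdev").
crypttab_line() {
    name="$1"; luks_uuid="$2"
    printf '%s UUID=%s none _netdev\n' "$name" "$luks_uuid"
}

# Hypothetical helper: appends the entry to FILE unless an entry for the
# same mapping name is already there.
# usage: add_crypttab_entry FILE NAME LUKS_UUID
add_crypttab_entry() {
    file="$1"; name="$2"; luks_uuid="$3"
    if grep -q "^$name[[:space:]]" "$file" 2>/dev/null; then
        return 0
    fi
    crypttab_line "$name" "$luks_uuid" >> "$file"
}

# Print only; on a real host the UUID comes from "cryptsetup luksUUID".
crypttab_line crypt0 "xxxxxxx-0fa-4fba-a274-XXXXXXXXXXX"
```

On a host one would call something like `add_crypttab_entry /etc/crypttab crypt0 "$(cryptsetup luksUUID /dev/your/local/disk)"`.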

Example 1

Add the Tang servers using SSS for the LUKS device with a threshold of 1, which means at least one of the listed Tang servers must be online to unlock the volume.

clevis luks bind -d /dev/your/local/disk sss '{"t": 1, "pins": {"tang": [{"url": "http://10.0.1.2"}, {"url": "http://10.0.1.3"}]}}'

Example 2

Add the Tang servers using SSS for the device in the specific LUKS slot 2 with a threshold of 2, which means two of the listed Tang servers must be online to unlock the volume.

clevis luks bind -s 2 -d /dev/your/local/disk sss '{"t": 2, "pins": {"tang": [{"url": "http://10.0.1.2"}, {"url": "http://10.0.1.3"}]}}'
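The JSON in the two examples is easy to mistype, so it may help to generate it. The helper below is a hypothetical sketch (`sss_config` is not a clevis command); it builds the same structure from a threshold and a list of Tang URLs and refuses a threshold larger than the number of servers.

```shell
#!/bin/sh
# Hypothetical helper: builds the clevis "sss" pin configuration.
# usage: sss_config THRESHOLD URL [URL...]
sss_config() {
    threshold="$1"; shift
    # A threshold above the number of servers could never be satisfied.
    if [ "$threshold" -gt "$#" ]; then
        echo "threshold exceeds number of Tang servers" >&2
        return 1
    fi
    pins=""
    for url in "$@"; do
        # Join the per-server objects with ", ".
        pins="${pins:+$pins, }{\"url\": \"$url\"}"
    done
    printf '{"t": %s, "pins": {"tang": [%s]}}\n' "$threshold" "$pins"
}

cfg=$(sss_config 1 http://10.0.1.2 http://10.0.1.3)
# Print the bind command rather than running it.
echo "clevis luks bind -d /dev/your/local/disk sss '$cfg'"
```

With two servers, a threshold of 1 favours availability (either server can unlock), while a threshold of 2 requires both to be reachable.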

Other useful commands:

Check Luks metadata and information

luksmeta show -d /dev/your/local/disk
cryptsetup luksDump /dev/your/local/disk

Remove Luks metadata in slot 2

cryptsetup luksKillSlot /dev/your/local/disk 2
luksmeta wipe -d /dev/your/local/disk -s 2

sonoracomm commented 2 years ago

No criticism of any sort is being implied here. I promise. I'm just stating a desire.

While all the information above is really helpful, and I may test @TylerDurden2019's howto, I really think this situation needs to be implemented from above...built into XCP-ng...for enterprise reliability.

As a user, interested in reliability first, I am truly not interested in customizing XCP-ng. It just doesn't sound like a good idea to me.

Is there a 'bounty' for this possible new feature in XCP-ng? I can't afford much, but I would pledge a few bucks.

For systems with shared storage, I doubt this is much of an issue.

However, small shops with few XCP-ng servers and no shared storage could REALLY benefit from this functionality. I need this for a couple of SMB clients who have Windows Server VMs and a regulatory burden requiring encrypted storage.

Microsoft is not overly helpful in this situation either.

I feel there is a definite use case for local, encrypted VM storage.

I think that some sort of network unlock would be very important. If the XCP-ng server gets stolen, we need to make sure the data is unreadable, so (as previously mentioned) a USB key or floppy or VHD is not sufficient.

Thank you so much to Olivier, Vates and all contributors for this fantastic XCP-ng project!

G

olivierlambert commented 2 years ago

Hello @sonoracomm and thanks for your feedback :+1:

This sounds reasonable for a new driver on top of SMAPIv3. However, this really must come after SMAPIv3, because anything implemented on the legacy storage stack will never be merged upstream anyway.

So SMAPIv3 + one encryption driver represents something like 5 man-years, so we are easily around half a million euros. It's not that I want to refuse any money, but this is only possible with companies/industries pushing for it. We'll do SMAPIv3 anyway, but the pace will also depend on commercial success and priorities (as we are fully independent from big vendors).

sonoracomm commented 2 years ago

Thanks much for the status update and explanation, @olivierlambert.

I did not understand the complications or scope!

G

olivierlambert commented 2 years ago

Note that SMAPIv3 is a big priority for us this year. If we succeed (at least with partial features) we might try to make an encrypted driver (but we probably won't have snapshots, backup, live migration and so on).

fefe79 commented 2 years ago

Does anyone know how to use tang/clevis to provide not just the key but also the detached header file, or has anyone already succeeded in doing so in any way, shape or form? Or perhaps how to provide the detached header from a local workstation over ssh?

I tried the following in both zsh and bash, attempting to provide the detached header through process substitution ("<(cat) ..."):

# ssh -t my.xcpng.localdomain cryptsetup luksOpen /dev/nvme0n1p7 Encrypted_SR --header=<(cat) SR_luks_header.img < ~/SR_luks_header.img

however cryptsetup doesn't like it as it expect either a device or file, so for now it just gives an error:

Device /dev/fd/11 doesn't exist or access denied.
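Since cryptsetup wants a regular file or a block device for --header, one possible workaround is to copy the header to a RAM-backed location on the host first. This is an untested sketch under assumptions: `run`, `open_with_remote_header` and `DRY_RUN` are hypothetical names, /dev/shm is assumed to be a tmpfs on the host, and it is assumed that the header copy can be removed once the mapping is active (worth verifying, especially for later operations that need the header again).

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: open_with_remote_header HOST DEVICE MAPPER_NAME LOCAL_HEADER
open_with_remote_header() {
    host="$1"; dev="$2"; name="$3"; header="$4"
    # Assumption: /dev/shm on the host is RAM-backed, so the copy never
    # touches the disk.
    remote_header="/dev/shm/$name.header"
    run scp "$header" "$host:$remote_header" || return 1
    # -t allocates a terminal so cryptsetup can prompt for the passphrase;
    # the header copy is removed whether or not the open succeeds.
    run ssh -t "$host" \
        "cryptsetup luksOpen --header $remote_header $dev $name; rm -f $remote_header"
}

open_with_remote_header my.xcpng.localdomain /dev/nvme0n1p7 Encrypted_SR \
    "$HOME/SR_luks_header.img"
```

The header still exists on the host for a short time, so this trades some of the benefit of a detached header for convenience.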

dngray commented 2 years ago

SMAPIv3 is a big priority for us this year. If we succeed (at least with partial features) we might try to make an encrypted driver (but we probably won't have snapshots, backup, live migration and so on).

ZFS is a big part of what I use currently. From what I can tell the only way to have encryption is to use the file storage method on a zfs dataset. From what I can tell not using the zfs storage driver has drawbacks.

We already provided zfs packages in our repositories before, but there was no dedicated SR driver. Users would use the file driver, which has a major drawback: if the zpool is not active, that driver may believe that the SR suddenly became empty, and drop all VDI metadata.

I subscribed to this issue and dropped a few bucks in bug bounty. For now I'll probably stay with ProxMox. IT-Gateway mentioned a few ZFS things that I am likely to use.

ProxMox natively supports the clone, destroy, snapshot and replicate features of ZFS. It can be installed on top of a ZFS pool, which makes it easy to roll back a bad update or misconfiguration, etc. On the contrary, XCP-ng has only started its ZFS journey and it has a lot of rough edges. There is no ability to install it on top of ZFS, no support for encryption, and no snapshot/destroy/replicate features included. It treats ZFS as a regular file system, and not as the CoW file system it is.

– Storage encryption. None of the projects natively support REST encrypted volumes. ProxMox is a little better, because you can use encrypted ZFS datasets, but only on a secondary zpool due to compatibility issues with GRUB.

I seem to remember this being brought up in the Lawrence Systems videos.

I'll be keeping an eye on XCP-ng though, I really like the interface of Xen-Orchestra. I especially like that it can run in a VM and be decoupled from the node.

olivierlambert commented 2 years ago

@MatiasVara started to work on a ZFS driver for SMAPIv3, as a good way to explore it and push its limits :)

pebenito commented 1 year ago

I definitely hope for progress on this. I have a KVM system with mdadm raid -> luks -> lvm that I'd like to migrate to XCP-ng, though I don't need the network unlock.

rjt commented 1 year ago

ZFSBootMenu : Works with both native ZFS encryption and LUKS.

ZFSBootMenu was designed around native ZFS encryption. ZFSBootMenu will prompt for passphrases as necessary to unlock pools or filesystems needing to support user interaction or the standard boot process.

ZFSBootMenu also works with the standard LUKS dracut module to allow booting from ZFS pools on a LUKS encrypted device.

snapshots of your boot environment

ssh from within initrd!

Recv a initrd and kernel over the network and start it up.

https://zfsbootmenu.org/

https://github.com/zbm-dev/zfsbootmenu/

0x1F680 commented 7 months ago

Is it possible to encrypt the root drive with cryptsetup (full disk encryption)? Currently not worried about network unlock.

ydirson commented 7 months ago

Is it possible to encrypt the root drive with cryptsetup (full disk encryption)?

No, that's not supported. Since we're essentially targeting server setups, unlock would really be something to be solved first.

0x1F680 commented 6 months ago

How about loading a couple of systemd hooks and network driver modules into initramfs (with mkinitcpio) with the necessary configuration files to allow for remote/autonomous luks-cryptdev unlocking?

fefe79 commented 1 month ago

Is it possible to encrypt the root drive with cryptsetup (full disk encryption)?

No, that's not supported. Since we're essentially targeting server setups, unlock would really be something to be solved first.

Are you referring to Tang here, or to dracut? Isn't that the de facto standard on server setups? I don't get how server setups manage data at rest and secure boot if XCP-ng does not support it. What kind of physical security is that?

How about this (I am just brainstorming here): Red Hat's NBDE (Tang/Clevis) setup, or optionally something similarly simple offered at install time, using LUKS, perhaps set up automatically, with dracut or similar and a local or remote Tang server if the router/firewall is set up to forward things properly over VPN? Wouldn't this be, as you put it, a "targeting server setups" scenario?

rwjack commented 1 month ago

Encryption at rest is a literal must for compliance. It's quite odd how there hasn't been much news regarding this issue in XCP-ng.

olivierlambert commented 1 month ago

Hi,

We are interested in getting that, but so far no customer has pushed for it, despite some having very sensitive installations. It's also probably because you can have your shared storage encrypted at rest or do encryption in your VM directly.

Anyway, I'm not against it at all, it's just not a top priority for now, but this can change depending on the demand and our progress on other more urgent requests.

rwjack commented 1 month ago

but so far no customer pushed for it

Understandable.

have your shared storage encrypted at rest

This is really not a proper solution. Take a TrueNAS VM for example, yes the storage is encrypted at rest, but the VM itself with keys to that storage is not, making the whole encryption effectively useless.

or do encryption in your VM directly.

This is basically a workaround, which is not scalable at all, one or two VMs, sure, but more than that, becomes impossible to manage.

Anyway, I'm not against it at all, it's just not a top priority for now, but this can change depending on the demand and our progress on other more urgent requests.

Got it, thanks for clarifying. Hope we get some updates on this soon!

olivierlambert commented 1 month ago

This is really not a proper solution. Take a TrueNAS VM for example, yes the storage is encrypted at rest, but the VM itself with keys to that storage is not, making the whole encryption effectively useless.

I'm not sure I understand. First, using a VM as a shared SR isn't a good practice outside a home lab, so you'll never see this in production. Also, the first goal is to avoid getting the drives physically stolen (and only the drives). For example, if you unlock with the local TPM and the entire machine is stolen, the thief can access the data. Sensitive installations are air-gapped, so we cannot rely on using an external resource to automatically unlock the drive. Having a pwd on boot is also not acceptable for a server device.

This is basically a workaround, which is not scalable at all, one or two VMs, sure, but more than that, becomes impossible to manage.

It's a viable solution, eg you have Packer/Terraform or similar solutions (IaC) to generate your VMs automatically and have your templates with the right configuration. We've seen that for users/customers at scale.

Got it, thanks for clarifying. Hope we get some updates on this soon!

As you can see, there's not only one solution for this, and it mostly depends on the use case (air gap in sensitive context, home lab, size of the infrastructure). That's why it's not "odd" because it's not a simple/single thing to solve.

rwjack commented 1 month ago

using a VM as a shared SR isn't a good practice

I know, I'm not using it as a shared SR, it's used as NAS. The disks are passed through directly to the VM, so they are encrypted by the VM, but nothing encrypts the VM which holds the encryption keys for the disks.


Also, the first goal is to avoid getting the drives physically stolen (and only the drives).

Exactly, so what happens when someone takes the drive with VMs on it? Not good, since they're not encrypted.


Having a pwd on boot is also not acceptable for a server device.

Not acceptable through a console, I agree, but:

Are all valid options for admins to choose from, covering most use cases: air-gapped / sensitive, or just compliance-wise.


or do encryption in your VM directly.

This is basically a workaround, which is not scalable at all, one or two VMs, sure, but more than that, becomes impossible to manage.

It's a viable solution, eg you have Packer/Terraform or similar solutions (IaC) to generate your VMs automatically and have your templates with the right configuration. We've seen that for users/customers at scale.

Right, but we're back to square one. Someone takes the entire host or even just the VM disk (in my single host non-shared-SR case). That opens a pathway to the encrypted storage disks. And even in the case of Shared SR storage, when someone takes the disk of the Shared SR storage master, then they have access to the Shared SR storage master decryption keys.


So what we really need is one master/top level encryption lock to rule them all. If you can imagine an infrastructure pyramid regarding proper encryption configuration, XCP-ng would be at the top there.

A blunt example, you usually lock your main door when you're at home, not just your work room and bedroom, right? Locking the main door would be complete VM disk, or even XCP-ng encryption, locking just the work room and bedroom would be per VM encryption.

olivierlambert commented 1 month ago

I see your point and since XCP-ng is fully open source, we'll be happy to assist on merging code providing a solution, knowing the constraints (maintenance, updates, dealing with pools where you can lose the master, backups and so on).

fefe79 commented 1 month ago

I see your point and since XCP-ng is fully open source, we'll be happy to assist on merging code providing a solution, knowing the constraints (maintenance, updates, dealing with pools where you can lose the master, backups and so on).

What do you guys think:

olivierlambert commented 1 month ago

Hi,

Sadly it's far from being that simple. Host secure boot needs some work done in Xen (or around Xen). There are people working on it as we speak, but this first requirement is not finished yet (and then you need to integrate it in XCP-ng and make it stable/updated etc.)

LUKS on install: so you will encrypt system disks right? And then password on boot? Aren't we looking for VM encryption? Because that is done in a different place, in tapdisk (and this kills the performance probably).

earzur commented 1 month ago

what we do here for encrypted storage (Debian 12):

Issues:

So far, this is as secure as we could go, but yet, the lack of proper emergency recovery procedure prevents making it to production...

earzur commented 1 month ago

Finally, before deciding, make sure to check cryptsetup benchmark on your dom0... we're seeing a 4x difference on our older Xeon systems compared to modern CPUs (even laptops).
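A benchmark restricted to one cipher might look like the sketch below. The `run` and `bench_cipher` names and the `DRY_RUN` switch are hypothetical, and the default cipher and key size (aes-xts-plain64, 512 bits) are an assumption about what the SR uses; `cryptsetup luksDump` on the device shows the actual values.

```shell
#!/bin/sh
# Sketch only: prints the command by default; set DRY_RUN=0 to run it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: bench_cipher [CIPHER] [KEY_BITS]
bench_cipher() {
    cipher="${1:-aes-xts-plain64}"   # assumed default, check luksDump
    bits="${2:-512}"
    run cryptsetup benchmark --cipher "$cipher" --key-size "$bits"
}

bench_cipher
```

The benchmark runs in memory only, so it gives an upper bound for the cipher on that CPU rather than real SR throughput.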