QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

Forensics-proof DisposableVMs #1819

Closed Rudd-O closed 5 years ago

Rudd-O commented 8 years ago

This is a feature request.

User requests a DisposableVM via UX interaction.

The dom0 script in charge of DisposableVM setup sets up the root file system as a device-mapper device. Then it sets up the swap device and the home directory device in the following manner:
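A minimal sketch of one way such a setup could look (this is not the original poster's exact recipe: device names are invented, and it uses a plain dm-crypt mapping keyed from /dev/urandom instead of LUKS, which gives the same key-only-in-RAM property without leaving a LUKS header behind):

```python
#!/usr/bin/env python3
"""Illustrative sketch only: map a DisposableVM's swap and home volumes
as dm-crypt devices keyed from /dev/urandom, so the key never exists
outside kernel memory. Device and VM names below are made up."""
import subprocess

def setup_ephemeral_volume(backing_dev: str, name: str) -> str:
    """Map backing_dev under a random throwaway key; return the mapper path."""
    subprocess.run(
        ["cryptsetup", "open", "--type", "plain",
         "--cipher", "aes-xts-plain64", "--key-size", "512",
         "--key-file", "/dev/urandom", backing_dev, name],
        check=True)
    return f"/dev/mapper/{name}"

def teardown_ephemeral_volume(name: str, backing_dev: str) -> None:
    """Close the mapping (the only copy of the key dies with it), then discard."""
    subprocess.run(["cryptsetup", "close", name], check=True)
    # blkdiscard fails harmlessly on media without discard support.
    subprocess.run(["blkdiscard", backing_dev], check=False)

# Hypothetical usage for a DisposableVM named disp1234:
#   swap = setup_ephemeral_volume("/dev/qubes_dom0/vm-disp1234-swap", "disp1234-swap")
#   home = setup_ephemeral_volume("/dev/qubes_dom0/vm-disp1234-home", "disp1234-home")
```

With a plain mapping there is nothing left to wipe at teardown: closing the device destroys the only copy of the key.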

(To be honest, the swap devices of all VMs should be made atop that).

Teardown of devices is the exact opposite -- once the VM is dead, the devices must be luksClosed and then luksWiped.

Presto correcto mundo — unrecoverable devices associated with DisposableVMs, so long as the user does not write to anything other than /home.

This should not be too much of a complication compared to DisposableVM setup today.

bnvk commented 8 years ago

This seems like a duplicate of #904

Rudd-O commented 8 years ago

Yes, close as dupe please.

andrewdavidwong commented 5 years ago

Reopening and unmarking as a duplicate at the request of @brendanhoar in #1293 (comment):

> 1819, closed as duplicate of #904, though 1819 is asking for encryption, and 904 is asking for in-memory execution, both also for anti-forensics.
>
> 904, still open, could use a re-title (I think) if #1819 is to be considered a duplicate. The core ask for both is that dispVMs not leak data into other domUs or dom0. The proposed in-memory-only execution + in-memory volatile storage is one solution. However, RAM still seems to be one of Qubes' areas of difficulty. I will also note the word "currently" here is perhaps indicating something that is not optimal: "Reported by joanna on 25 Sep 2014 20:26 UTC: 'Currently volatile.img is being backed up on the fs.'" An alternative to in-memory storage would be ephemeral encryption keys for swap/volatile storage or all storage for dispVMs (as I suggested in my earlier post as well as #1819 suggests) with additional anti-leak guarantees or mitigations, such as dom0 swap also having an ephemeral key generated on each boot, Xen's built-in memory scrubbing, etc.

After thinking about this some more, I'm inclined to say that our policy should be to call issues duplicates when they have the same goal, even if they request two different means by which to achieve that shared goal. If the Qubes devs agree with the goal, we should leave it to them to decide (after consulting with the community) on the best means by which to achieve that goal.

For example, we now have several different open issues all requesting anti-forensic DisposableVMs. The only reason they're all (re)open is that they ask for the same thing to be implemented in different ways. However, this is very similar to an XY problem. We shouldn't endorse that problematic practice by intentionally leaving all such issues open. Rather, it is more organized to discuss the possible implementations on a single issue. Therefore, we should identify such issues as duplicates because they all share the same goal, then leave it to the Qubes devs to decide which implementation is best.

What do you think, @brendanhoar?

brendanhoar commented 5 years ago

Thanks for the thoughtful writing above, @andrewdavidwong. I learned a new term, too! Though I am quite familiar with the scenario... :)

I am ok with you closing out my requests as duplicates. I do think it's worth taking a step back and separating out the high-level asks to see where they overlap and perhaps obtain some clarity that might help with a) deciding what should be implemented and b) how the relations between these enhancements might drive the design. That might inform which qubes-issues to leave open and/or help decisions on milestones.

Below are just my opinions, of course, hopefully not excessively redundant with what I have written before.

Generally, there are at least five distinct "major" goals that overlap:

  1. Storage Provisioning assistance.
  2. Anti-forensics and/or "amnesiac" behavior for Disposable VMs.
  3. Anti-forensics and/or "amnesiac" behavior for Deleted VMs.
  4. Optional per-VM encryption layers.
  5. Encrypted templates. [But why? Here just for completeness so that it can be explicitly ruled out.]

Solutions to any one of the goals should be designed to keep the other possible goals in mind or to incorporate them.

For No. 1, the default R4 Qubes storage stack design utilizes 'discards' to notify thin provisioning that space can be released back to the pool. Users can make configuration adjustments to pass them all the way down to the hardware. I have argued (perhaps poorly) that discards should always be sent down the stack as far as possible. It is becoming more common, in this "cloud-based" computing world, to ensure discards are routed all the way down the stack to the lowest level your environment has control of. That "receiving" lower level should have been designed to accept them and pass them along as well, etc., until they hit the storage medium itself, if applicable, so that the system and hardware maintain performance. [As a side effect, 'discards' at the hardware level have partial anti-forensics/amnesiac results, particularly on contemporary SED SSDs from reputable manufacturers where a) the discard of large blocks puts the data "out-of-scope" of the entire dom0; b) the logical block number is fed into the encryption mode and that input is lost on discard when the blocks are put on the free and/or erase lists; and c) the free/erase lists are managed well.]

As to No. 2, I'm a belt-and-suspenders kind of guy. Defense-in-depth...but not ad absurdum. E.g., for Qubes I recommend combining a SED SSD, properly provisioned to require user authentication, with the optional LUKS layer that Qubes provides, which does a very similar thing. This addresses "storage at rest" anti-forensics. The ask here is: have Qubes also address live-system anti-forensics.

If the design/development team intends to add anti-forensics/amnesiac capability to Disposable VMs, I will be happy with whatever implementation method is used, as long as it is difficult for a user to mess up. :) @Rudd-O's suggestion above sounds pretty straightforward, adding an additional mapping layer with LUKS on top of the thin-provisioned LVs, keys in RAM only. I also like that key-management in dom0, not the VMs themselves, can reduce leakage/compromise of keys. My one area of concern is that I don't know if the lvm snapshots/volatile implementation in Qubes would allow for insertion of a luks layer. I'd like to hear @tasket's thoughts on the subject.

For No. 3: again, this is live-system anti-forensics. The current way to address a data handling mistake in any VM is, unfortunately: delete the VM(s) in question, back up the remaining Qubes VMs, wipe the drive, reinstall Qubes OS and restore the VMs that do not have this issue. Why? Because deletion of a VM in the pool leaves all the data there until the blocks are needed by another VM. Sure, the data will be wiped before the blocks are provisioned to other VMs, but the data is still on the system. Wiping unused space in a pool is...difficult. I have been unable to find a tool that wipes/scrubs/discards the unallocated space in an LVM pool, which is disappointing.

Possible solutions to the deleted-VM issue could include an optional wipe before deletion, where the tool/GUI asks the user during a deletion request whether they would like the LVs wiped before they are removed from the pool (works best with HDDs), and, regardless of the answer, automatically runs blkdiscard before deletion (works best with SED SSDs configured to flow the discards down the stack). [Wipe plus blkdiscard could also work for Disposable VMs, but transparent ephemeral encryption via a RAM-only key plus blkdiscard would give a better user experience in most cases, esp. if the user is handling large data sets.] Notably, Proxmox includes a "saferemove" feature (with an optional throughput limiter) in its wrapper for LVM.
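A rough sketch of what that wipe-then-discard-then-delete flow could look like (VG/LV names are hypothetical, and a real version would hook into qvm-remove or the storage pool driver rather than shelling out like this):

```python
#!/usr/bin/env python3
"""Sketch of an optional wipe plus unconditional blkdiscard before LV removal."""
import subprocess

def remove_lv_safely(vg: str, lv: str, wipe_first: bool = False) -> None:
    dev = f"/dev/{vg}/{lv}"
    if wipe_first:
        # Zero-fill pass, mainly useful on HDDs. dd exits non-zero when the
        # device is full, so don't treat that as a failure.
        subprocess.run(["dd", "if=/dev/zero", f"of={dev}",
                        "bs=4M", "oflag=direct", "status=none"], check=False)
    # Always discard, so SSDs / thin pools can drop the blocks immediately.
    # Fails harmlessly if the device does not support discards.
    subprocess.run(["blkdiscard", dev], check=False)
    subprocess.run(["lvremove", "-f", f"{vg}/{lv}"], check=True)

# Hypothetical usage:
#   remove_lv_safely("qubes_dom0", "vm-untrusted-private", wipe_first=True)
```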

No. 4 - Most likely Nos. 2 and 4 should be designed together. This would require adding a local encrypted key store: one or more key slots associated with each VM, unlocking keys in other slots for each of the VM's /rw volumes, volatile volumes treated similarly to Disposable VM volumes (key in RAM), etc.
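As one possible shape for that, here is a very rough sketch of a per-VM encrypted /rw volume whose key lives in a dom0-only keystore; the keystore path, file layout and names are invented purely for illustration:

```python
#!/usr/bin/env python3
"""Sketch: per-VM LUKS-encrypted private volume, key held in a dom0 keystore."""
import os
import subprocess

KEYSTORE = "/var/lib/qubes/vm-keys"   # hypothetical dom0-only directory

def create_encrypted_private(vm: str, backing_dev: str) -> str:
    os.makedirs(KEYSTORE, mode=0o700, exist_ok=True)
    key_path = os.path.join(KEYSTORE, f"{vm}.key")
    with open(key_path, "wb") as f:
        os.fchmod(f.fileno(), 0o600)
        f.write(os.urandom(64))          # 512-bit volume key material
    subprocess.run(["cryptsetup", "luksFormat", "--type", "luks2",
                    "--batch-mode", "--key-file", key_path, backing_dev],
                   check=True)
    subprocess.run(["cryptsetup", "open", "--key-file", key_path,
                    backing_dev, f"{vm}-private"], check=True)
    return f"/dev/mapper/{vm}-private"

# Hypothetical usage:
#   create_encrypted_private("work", "/dev/qubes_dom0/vm-work-private")
```

A volatile volume for the same VM would instead get a throwaway key, as in the Disposable VM case above.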

No. 5 - I can't think of a use case for this, but just putting it out there so that it can be explicitly ruled out.

Brendan

tasket commented 5 years ago

Perhaps the biggest obstacle here is that Qubes expects a "writable" or write-then-forget model of the template root filesystem, and this currently requires a snapshot-capable layer like LVM or Btrfs. I don't know if you can insert an additional LUKS layer and still have snapshots work (I would guess not). Another possibility is that some other block driver can do a cross-device COW mode – isn't dm-snapshot used for this?

Without a writable root fs, a template must be tweaked to expect a read-only root, but I suspect it wouldn't be very functional.

Discards are almost a minor detail here. Encryption will take care of security concerns and wholesale discard can be performed once when the VM is destroyed.

marmarek commented 5 years ago

> Perhaps the biggest obstacle here is that Qubes expects a "writable" or write-then-forget model of the template root filesystem, and this currently requires a snapshot-capable layer like LVM or Btrfs.

This is merely an optimization we have in R4.0, since we can do it thanks to LVM. Previously (R3.x and before) we used dm-snapshot directly, and the COW layer of the root fs was constructed within the VM. This code is still in place and is used automatically whenever the root device (/dev/xvda) is read-only. Details here (this page is not updated for R4.0 yet): https://www.qubes-os.org/doc/template-implementation/

So, technically it isn't a problem to implement a DispVM which has all disks either read-only or encrypted with an ephemeral key stored in dom0 RAM only. In practice, it means:

tasket commented 5 years ago

Thanks, Marek. So the key to implementation is getting familiar with this non-LVM COW code in Qubes and making the dispVM rootfs work with it. If the target COW area is on disk (i.e. a temporary thin LVM volume), it will have a dm-crypt layer with an ephemeral key on top. Swap and tmp are trivial additions once root is configured to work this way.

One more note on discards: It may still be desirable to have dom0 examine discard support down the chain of storage layers, as @brendanhoar suggests. Doing so could increase (to some degree) the privacy of dispVMs that (for whatever reason/use case) are running for extended periods of time. Good to have for certain threat models.
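As an illustration of the kind of check being suggested, here is a small sketch that walks a block device's stack through sysfs and reports which layers advertise discard support (standard Linux sysfs layout assumed; the starting device name is just an example):

```python
#!/usr/bin/env python3
"""Sketch: report discard support for a block device and everything below it."""
import os

SYSFS = "/sys/class/block"

def discard_supported(name: str) -> bool:
    """True if the device (or, for a partition, its parent disk) advertises discard."""
    queue = os.path.join(SYSFS, name, "queue", "discard_max_bytes")
    if not os.path.exists(queue):
        # Partitions have no queue/ directory; fall back to the parent disk.
        parent = os.path.dirname(os.path.realpath(os.path.join(SYSFS, name)))
        queue = os.path.join(parent, "queue", "discard_max_bytes")
    try:
        with open(queue) as f:
            return int(f.read().strip()) > 0
    except OSError:
        return False

def walk_stack(name: str, depth: int = 0) -> None:
    """Recurse through the 'slaves' links that device-mapper/md expose in sysfs."""
    status = "supported" if discard_supported(name) else "NOT supported"
    print("  " * depth + f"{name}: discard {status}")
    slaves = os.path.join(SYSFS, name, "slaves")
    if os.path.isdir(slaves):
        for slave in sorted(os.listdir(slaves)):
            walk_stack(slave, depth + 1)

# Example: walk_stack("dm-5")   # e.g. the thin LV backing a dispVM's volatile volume
```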

marmarek commented 5 years ago

> Thanks, Marek. So the key to implementation is getting familiar with this non-LVM COW code in Qubes and making the dispVM rootfs work with it. If the target COW area is on disk (i.e. a temporary thin LVM volume), it will have a dm-crypt layer with an ephemeral key on top. Swap and tmp are trivial additions once root is configured to work this way.

It's even simpler: you "just" need to make the "volatile" volume with a dm-crypt layer. Once the "root" volume is read-only, the VM will use "volatile" for root COW and swap (and implicitly tmp, as it is a tmpfs).
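A minimal sketch of that approach, assuming the R4.0 LVM thin pool layout (pool and volume names are examples; real code would live in the storage pool driver rather than in a standalone script):

```python
#!/usr/bin/env python3
"""Sketch: create a dispVM's volatile volume and wrap it in dm-crypt with a
key that exists only in kernel memory. Root stays read-only and untouched."""
import subprocess

def prepare_volatile(vg_pool: str, lv_name: str, size: str) -> str:
    # Create the thin volatile volume, as the pool driver normally would.
    subprocess.run(["lvcreate", "-T", vg_pool, "-V", size, "-n", lv_name],
                   check=True)
    vg = vg_pool.split("/")[0]
    backing = f"/dev/{vg}/{lv_name}"
    # Wrap it with a throwaway key; the VM is given only the mapper device,
    # which it uses for root COW, swap and tmp.
    subprocess.run(["cryptsetup", "open", "--type", "plain",
                    "--cipher", "aes-xts-plain64", "--key-size", "512",
                    "--key-file", "/dev/urandom", backing, f"{lv_name}-crypt"],
                   check=True)
    return f"/dev/mapper/{lv_name}-crypt"

# Example: prepare_volatile("qubes_dom0/pool00", "vm-disp1234-volatile", "10G")
```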

andrewdavidwong commented 5 years ago

> I am ok with you closing out my requests as duplicates. I do think it's worth taking a step back and separating out the high-level asks to see where they overlap and perhaps obtain some clarity that might help with a) deciding what should be implemented and b) how the relations between these enhancements might drive the design. That might inform which qubes-issues to leave open and/or help decisions on milestones.

Agreed.

> Below are just my opinions, of course, hopefully not excessively redundant with what I have written before.
>
> Generally, there are at least five distinct "major" goals that overlap:
>
>   1. Storage Provisioning assistance.
>   2. Anti-forensics and/or "amnesiac" behavior for Disposable VMs.
>   3. Anti-forensics and/or "amnesiac" behavior for Deleted VMs.
>   4. Optional per-VM encryption layers.

I'm not quite sure I agree with this characterization. #1293 is asking for per-VM encryption, which is fundamentally about VM data confidentiality. Anti-forensics and/or "amnesiac" behavior seems to only partially cover the space of possible applications of VM data confidentiality.

@marmarek, do you have an opinion about what we should do with these possible duplicate issues?

marmarek commented 5 years ago

This looks like a duplicate of #904. Even though the title may be slightly different, both have the same goal, and #904 also mentions using encryption. As for the other issues, those are related, but not really duplicates (#4408 may be solved as a side effect of one or another, but that really depends on the final implementation).

andrewdavidwong commented 5 years ago

> This looks like a duplicate of #904. Even though the title may be slightly different, both have the same goal, and #904 also mentions using encryption. As for the other issues, those are related, but not really duplicates (#4408 may be solved as a side effect of one or another, but that really depends on the final implementation).

Ok, closing as a duplicate of #904.