QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

[Feature Request] Generalized automatic unidirectional data transmission API between VMs for endpoint secure communication #6102

Closed: maqp closed this issue 4 years ago

maqp commented 4 years ago

Hi, this became quite a long post, I hope it's ok.

The current situation of secure comms on Qubes

The available solutions for secure communication on Qubes have so far been


Quick overview of TFC's endpoint security

A TFC endpoint on Qubes consists of

Only the Networker VM is connected to the Internet. The Source VM talks to the Networker VM, and the Networker VM talks to the Destination VM, unidirectionally. It's the method of enforcing unidirectionality that's of interest here, as the reasoning behind the endpoint security is that malware needs bidirectional access to exfiltrate keys:

Since data transits to the TCB halves unidirectionally

TFC was originally designed around three separate computers (your daily laptop + 2 cheap netbooks): this layout makes use of a free-hardware-design data diode, and as long as you manage to avoid HW implants and eliminate covert exfiltration channels, it's pretty much safe from zero-days.

TFC's Qubes configuration is a relatively new addition: it's an intermediate solution that sits between Qubes' split-gpg and the HW data diode isolated setup. The security of the Qubes-isolated configuration depends on the robustness of the isolation mechanism.

What this ticket is about: I'm not sure the current approach TFC AppVMs use to communicate unidirectionally is the best possible, and additional feature(s) might be needed to change that.


TFC's current Qubes configuration

The current Qubes configuration uses an iptables firewall to enforce the following rules:

So the theory is that sys-firewall is harder for the attacker to reach, and that data passing through it can't edit the firewall rules (it wouldn't be much of a firewall if it could).


Qubes already has non-LAN based (unidirectional) data transfer

So to get back to the PGP emails: split-gpg uses something called the qrexec protocol (qubes.GPG). This protocol apparently exposes an API similar to the gpg one that Thunderbird uses, but I suspect qubes-gpg-client and qubes-gpg-server use something other than the internal LAN to exchange data, especially since the documentation states qubes-gpg-server runs in a "more trusted, network-isolated, domain".
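
For reference, access to that service is governed by a dom0 policy file of the same form used later in this thread; a minimal example (the VM names work-email and gpg-vault are illustrative assumptions) would be a line like:

work-email gpg-vault allow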

It's also the case that network-isolated VMs can still exchange files via Nautilus with right-click -> Move To Other AppVM..., so apparently there's benefit in having dom0 perform the unidirectional data transfer between AppVMs.

In theory, TFC's Transmitter and Relay Programs could write the ciphertext into a file, and once the user has manually sent the ciphertext from Source to Networker VM, or from Networker to Destination VM, the Relay and Receiver Programs could automatically poll the QubesIncoming/<sourceVM> directory for the ciphertext file, process the data in the file, and remove the file (to allow the next file with an identical file name to be sent). But this isn't what instant messaging is about.
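
A minimal sketch of that polling approach (the directory, file name, and handler are assumptions for illustration):

#!/usr/bin/env python3
import os
import time

INCOMING_DIR = os.path.expanduser('~/QubesIncoming/NetworkerVM')  # assumed source VM name
CIPHERTEXT_FILE = 'ciphertext'                                    # assumed fixed file name

def poll_for_ciphertext(handler, interval=0.1):
    """Poll for a manually transferred ciphertext file, hand its contents
    to the caller-supplied handler, then delete the file so the next file
    with an identical name can be received."""
    path = os.path.join(INCOMING_DIR, CIPHERTEXT_FILE)
    while True:
        if os.path.isfile(path):
            with open(path, 'rb') as f:
                handler(f.read())
            os.remove(path)
        time.sleep(interval)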


The feature request: Application for automated data transfer via dom0

So would it be possible to create some application that could then be called with some parameters, e.g. from Python 3.6+:

subprocess.Popen(f'qubes-send-unidirectional --target-vm destination --port 1234 --data {ciphertext_content}', shell=True).wait()

A server application running as a service would be needed to listen on the target VM; upon receiving data, it could then broadcast the data over e.g. UDP (to localhost:1234), which the Relay/Receiver Program could then listen to.
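
To make the idea concrete, a rough sketch of such a server-side hand-off (the delivery of the payload on stdin and the port number are assumptions, since the application doesn't exist):

#!/usr/bin/env python3
import socket
import sys

def main(port=1234):
    # Read the payload delivered by the hypothetical transfer mechanism.
    data = sys.stdin.buffer.read()
    # Broadcast it over UDP to the Relay/Receiver program on localhost.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(data, ('127.0.0.1', port))
    sock.close()

if __name__ == '__main__':
    main()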

This would require the dom0-side AppVM management to have some simple configuration (see the next section below) that would make it harder to misconfigure (as there's no iptables).

Again, I'm not asking to implement this only for TFC. Generalized unidirectional transfer would make Qubes a lot more flexible. The server application could also have some configuration file that determines what commands (if any) get run when data is received on some port. E.g. the received data could be piped to a gpg command for automatic decryption; this way there's no need to trust TFC (although its forward secrecy etc. are an improvement over PGP's stateless encryption), and the PGP environment is more secure compared to the current split-GPG, as plaintext is also viewed and/or composed on a safe isolated VM. The configuration file parsing would of course require careful design to prevent shell injection.
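
As a sketch of that careful design: the server could parse the configured command with shlex and run it without a shell, so metacharacters in the received data can never reach a shell (the config format is an assumption):

#!/usr/bin/env python3
import shlex
import subprocess

def dispatch(data, command_line):
    """Pipe received data into the command configured for this port.
    Splitting with shlex and passing an argv list avoids shell=True,
    which prevents shell injection."""
    argv = shlex.split(command_line)  # e.g. "gpg --decrypt"
    result = subprocess.run(argv, input=data,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.stdout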


Making misuse hard

I realize that automatic processing of incoming data is, in a way, dangerous to the receiving AppVM, but that's something that's already possible (via UDP packets). If the unidirectionality can be enforced more robustly with dom0, I think it's definitely worth doing. Users who would use the feature would probably only use it for dedicated, isolated AppVMs anyway, and the point of Qubes is that an AppVM, even if dedicated to a single purpose, doesn't really cost anything.

(Also, it's a good thing to recap the current situation: I'd argue having an always-online-when-running work-email VM show plaintexts is even more dangerous, as it can directly affect the security of the user.)

To make misuse hard, I think it's best to offer a dropdown menu called Unidirectional AppVM type, below Template and above Networking when creating a new qube, that has five options:

  1. (none): This is the default, and it allows setting any network interface. Both checkboxes are hidden when (none) is selected.

    This configuration must not allow unidirectional transfers to any AppVM; only Relay (explained below) must be able to do that.

  2. Source: This will set networking to (none) and prevent changing it. It will then expose

    ☐ Allow automatic incoming data to this VM from following VMs: n/a (I'm using strikethrough to represent a greyed-out option)
    ☑ Allow automatic outgoing data from this VM to following VMs: networker, destination (the in-line code is what the user can type here; basically it's a comma-separated list that allows output to, or reception from, one or more AppVMs)

  3. Relay: This configuration will always be online, so selecting it should prevent setting networking to (none), but allow free selection of network interface. It will then expose both

    ☑ Allow automatic incoming data to this VM from following VMs: source
    ☑ Allow automatic outgoing data from this VM to following VMs: destination

    Selecting this configuration should display a warning message along the lines of

    "WARNING: This AppVM is ALWAYS connected to the network. It has no additional security from (unidirectional) network-isolation. Make sure to only use it to transfer encrypted data between network and other types of AppVMs!"

  4. Destination: Again, similar to the Source configuration, the networking interface should be locked to (none), but this time the other option should be unselectable:

    ☑ Allow automatic incoming data to this VM from following VMs: source, networker
    ☐ Allow automatic outgoing data from this VM to following VMs: n/a

    Selecting this configuration should display a warning message along the lines of

    "WARNING: Automatic processing of received data can result in altering, or loss of stored data. Make sure you have a proper backup of all valuable data stored on this AppVM!"

  5. Guard: A Guard is an optional AppVM. It is similar to Relay in that it can relay unidirectional data from one AppVM to another. Where it differs is that the Guard AppVM is always isolated from the network, so the networking interface must be locked to (none). The way this would work is one would set

    source AppVM with settings

    ☐ Allow automatic incoming data to this VM from following VMs: n/a
    ☑ Allow automatic outgoing data from this VM to following VMs: source-networker-guard

    source-networker-guard AppVM with settings

    ☑ Allow automatic incoming data to this VM from following VMs: source
    ☑ Allow automatic outgoing data from this VM to following VMs: networker

    networker AppVM with settings

    ☑ Allow automatic incoming data to this VM from following VMs: source-networker-guard
    ☑ Allow automatic outgoing data from this VM to following VMs: networker-destination-guard

    networker-destination-guard AppVM with settings

    ☑ Allow automatic incoming data to this VM from following VMs: networker
    ☑ Allow automatic outgoing data from this VM to following VMs: destination

    destination AppVM with settings

    ☑ Allow automatic incoming data to this VM from following VMs: networker-destination-guard
    ☐ Allow automatic outgoing data from this VM to following VMs: n/a

    For the system to work as a whole, the Guard must relay the content of the data. However, inspecting/logging the data should be up to the user. The idea here is that it can act as an audit/IDS/IPS system for transmitted data (especially plaintext data), and analyze the metadata of encrypted transmissions; e.g. it could detect covert exfiltration of keys during hours when no data was sent by the user (see the sketch after this list). The Guard benefits especially from using another type of OS (e.g. a BSD variant when everything else is Linux), as it's unlikely to be compromisable by the same exploit, which could otherwise alter what's in the Guard's logfile.

    To avoid the need for complex network topology analysis to prevent the user from shooting themselves in the foot by creating unintended exfiltration pathways, the Guard VMs should not contain duplicates of decryption keys for friendly-MITM purposes, and they should always show a warning about this:

    "WARNING: Do not store cryptographic keys for deep packet inspection on this Guard AppVM!"
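
As a sketch of the Guard's job (using the stdin/stdout plumbing of the qrexec approach discussed later in this thread; the log path is an assumption), it relays the payload untouched while recording only metadata, which is already enough to notice transmissions at hours when the user sent nothing:

#!/usr/bin/env python3
import sys
import time

LOG_FILE = '/home/user/guard_metadata.log'  # assumed location

def relay_and_log():
    data = sys.stdin.buffer.read()
    # Log only metadata: when the transmission happened and how large it was.
    with open(LOG_FILE, 'a') as log:
        log.write('%.3f %d bytes\n' % (time.time(), len(data)))
    # Forward the content unmodified to the next hop.
    sys.stdout.buffer.write(data)
    sys.stdout.buffer.flush()

if __name__ == '__main__':
    relay_and_log()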


How complex is this for the end user?

I'd say not complex at all. The software vendor can make everything else automatic except

Also, naturally the user has to manually configure the AppVMs via dom0, i.e.

Installation guides can walk them through these steps.

iamahuman commented 4 years ago

It's worth mentioning that there are a few existing covert channel issues in Qubes that need to be addressed. In fact, they are severe enough to completely void the data-diode model (specifically the confidentiality aspect, or the "Destination Computer" in TFC):

Both of these allow two cooperating VMs to feasibly communicate with each other.

iamahuman commented 4 years ago

Redirecting stdout and stderr to /dev/null may prove sufficient for lax requirements (and is an improvement over iptables blocking); otherwise, Qubes RPC still has the following limitations:

marmarek commented 4 years ago

The thing you're looking for is qrexec. Make Source VM and Destination VM not connected to the network and use qrexec services for communication. Bonus point: you remove the whole TCP/IP stack from the TCB.

In general it provides a bi-directional transport layer with a strict policy on who can talk to whom, and what services can be requested (analogous to port numbers). Currently the dom0 policy cannot enforce a uni-directional connection, but it can be implemented by the actual service receiving the connection (like - the first thing is to close stdin/stdout and don't read/write anything from there).
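
For example, the receiving service could start like this (a minimal sketch; the output path is only an example):

#!/usr/bin/env python3
import os
import sys

# First thing: point stdout and stderr at /dev/null, so nothing can be
# sent back over the established qrexec connection.
devnull = os.open(os.devnull, os.O_WRONLY)
os.dup2(devnull, sys.stdout.fileno())
os.dup2(devnull, sys.stderr.fileno())
os.close(devnull)

# From here on, the service only receives.
payload = sys.stdin.buffer.read()
with open('/home/user/received.bin', 'wb') as f:  # example path
    f.write(payload)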

And finally, yes, @iamahuman is right - with the current state of things, it is unrealistic to assume a compromised VM will not be able to exfiltrate things. It is somewhat harder than on a traditional system (you may need another cooperating VM for that), but the general assumption is that if a VM is compromised, it is no longer safe to assume the data inside stays private.

andrewdavidwong commented 4 years ago

Based on @marmarek's response, it sounds like there is no need for a new feature in Qubes, since qrexec already exists and fits the bill. Therefore, I'm closing this feature request as "won't do" (perhaps it's arguably "already done"). If anyone thinks that this issue should be reopened, please leave a comment briefly explaining why, and we'll be happy to take another look. Thank you.

maqp commented 4 years ago

Hi again! Thank you so much for your time and thoughtful replies!

@marmarek

The thing you're looking for is qrexec.

Thanks! I managed to get it working, though I'm not sure if the new implementation (below) is correct and/or ideal.

the first thing is to close stdin/stdout and don't read/write anything from [the actual service receiving the connection]

I closed stdout and stderr on the receiving side, but I'm puzzled by what you mean about closing stdin: how can I pass a long ciphertext to the RPC service? I tried passing data as an argument, but the input was limited to 47 chars.


@iamahuman

#4216 may serve as a possible mitigation for the CPU part, though.

Thanks! I'll have to keep track of the ticket and see if (once implemented) it requires manual setup (like picking dedicated CPU cores) from the end user.


@marmarek

Source VM -> Network VM: should be good enough here - the Source VM is the trusted entity here and it can enforce the uni-directional connection (prevent reading anything from Network VM).

If dom0 only has one policy file, tfc.SourceNetworker, for these two VMs, with the content--

SourceVM NetworkerVM allow

--doesn't this lock the direction of qrexec requests and prevent any connection attempts from NetworkerVM? Or does it require additional blocking procedures on the dom0/SourceVM side? I redirected stdout and stderr from Networker to /dev/null on the SourceVM side below, just in case.


@iamahuman

In fact, they are severe enough to the extent of completely voiding the data-diode model (specifically the confidentality aspect, or the "Destination Computer" in TFC):

Both of these allow two cooperating VMs to communicate feasibly with each other.

@marmarek

But in practice enforcing the uni-directional connection on both sides here forces the attacker to compromise two VMs to exfiltrate things, which may be a barrier high enough

I agree these are the limitations. It's an improvement over the networked TCBs in that it forces the attacker to squeeze the malware through TFC's data transport channel. I'm hoping qrexec + TFC's authentication mechanisms make this hole too small for most adversaries.

I'll do my best to expand the threat model to discuss the attacks you mentioned: side channels inside the CPU, as well as flow control timing and the issues with Xen shared memory. The details of these attacks are way beyond my comfort zone so it's going to take a while to digest all this.


@iamahuman

Redirecting stdout and stderr to /dev/null may prove sufficient for lax requirements

I tried implementing this with a shell script that calls the actual utility, while suppressing the stdout/stderr of said utility from the NetworkerVM side (see below).

Even if the communication may be unidirectional, the receiving end can still signal the sender when the buffer is full.

IIUC all this happens way before anything that my code (below) on the receiving side does, so if, despite best efforts, DestinationVM is compromised, malware could leak data without even touching TFC's files?

In contrast, the TFC hardware implementation appears to have no clock driven by the receiver.

(That's correct, the serial transmissions are asynchronous, so the first bit is the start bit that starts the clock on the receiving side; the baud rate is pre-established and has so far stayed in sync well enough.)


The new implementation

I tried really hard to get this right:

Relay Program on NetworkerVM calls

subprocess.Popen(f"echo {pipes.quote(message)} |qrexec-client-vm DestinationVM tfc.NetworkerDestination", shell=True).wait()

The dom0 /etc/qubes-rpc/policy/tfc.NetworkerDestination policy file allows automated forwarding:

NetworkerVM DestinationVM allow

The Destination VM's (receiver-side) /etc/qubes-rpc/tfc.NetworkerDestination is a symlink to /opt/tfc/supressor.sh, which has the following content:

#!/bin/sh
read arg1
echo $arg1 |python3 /opt/tfc/writer.py 1>/dev/null 2>/dev/null

This should remove all output on the NetworkerVM side that the RPC call would produce, so it should meet the lax requirements @iamahuman mentioned. At least, when I introduce syntax errors into writer.py, I get no error messages on the NetworkerVM side, which isn't the case when directly symlinking writer.py.

(On a side note, and since it's related to stdio: to do what can be done, the Transmitter Program on SourceVM also redirects stdout and stderr to /dev/null as part of its shell command, to get some additional protection from the NetworkerVM:

subprocess.Popen(f"echo {pipes.quote(message)} |qrexec-client-vm NetworkerVM tfc.SourceNetworker 1>/dev/null 2>/dev/null", shell=True).wait()

)

The final utility is the /opt/tfc/writer.py:

#!/usr/bin/env python3
import base64, os, sys
with open('/home/user/.tfc_from_relay_<timestamp>', 'w+') as f:
    data = sys.stdin.read()[:-1]  # Remove LF
    f.write(base64.b64encode(data.encode()).decode())
    f.flush()
    os.fsync(f.fileno())

The Base64 encoding should ensure that any raw binary data -- sent by an attacker who has taken over NetworkerVM -- that is stored in the temp file cannot be accidentally executed. TFC's main program can then read, decode, validate, and authenticate+decrypt the data before trusting it. The filenames with timestamps ensure data is processed in order.
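
A sketch of the reading side (the handler is a placeholder, and this assumes fixed-width timestamps so that lexicographic order matches chronological order):

#!/usr/bin/env python3
import base64
import glob
import os

def process_drop_files(handler, pattern='/home/user/.tfc_from_relay_*'):
    """Pick up drop files in timestamp order, decode them back to raw
    bytes, and remove each file once it has been processed."""
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            data = base64.b64decode(f.read())
        handler(data)  # validate, authenticate and decrypt before trusting
        os.remove(path)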


@andrewdavidwong

You're right, the feature I asked for already exists and works to the extent it can!

marmarek commented 4 years ago

I closed stdout and stderr on the receiving side, but I'm puzzled by what you mean about closing stdin: how can I pass a long ciphertext to the RPC service? I tried passing data as an argument, but the input was limited to 47 chars.

I mean: close stdin on the side where you want to prevent sending data, and close stdout on the side where you want to prevent reading data. Not both at the same time.

SourceVM NetworkerVM allow

--doesn't this lock the direction of qrexec requests and prevent any connection attempts from NetworkerVM?

Yes it does. But once a connection is established, data can flow there in both directions.

I redirected stdout and stderr from Networker to /dev/null on the SourceVM side below, just in case.

Yes, that's exactly what you can do.

subprocess.Popen(f"echo {pipes.quote(message)} |qrexec-client-vm DestinationVM tfc.NetworkerDestination", shell=True).wait()

You can avoid echo and shell: subprocess.Popen(['qrexec-client-vm', 'DestinationVM', 'tfc.NetworkerDestination'], stdin=subprocess.PIPE).communicate(message.encode())

You can also add stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL to redirect output to /dev/null.
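
Combined, the whole call becomes (stdin=subprocess.PIPE is needed so that communicate() actually delivers the message):

subprocess.Popen(['qrexec-client-vm', 'DestinationVM', 'tfc.NetworkerDestination'],
                 stdin=subprocess.PIPE,
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL).communicate(message.encode())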

#!/bin/sh
read arg1
echo $arg1 |python3 /opt/tfc/writer.py 1>/dev/null 2>/dev/null

Right direction, but better avoid shell processing the message. You can simplify with:

#!/bin/sh
head -n 1 | python3 /opt/tfc/writer.py 1>/dev/null 2>/dev/null

(if you don't really require getting just one line, drop head -n 1)

#!/usr/bin/env python3
import base64, os, sys
with open('/home/user/.tfc_from_relay_<timestamp>', 'w+') as f:
    data = sys.stdin.read()[:-1]  # Remove LF
    f.write(base64.b64encode(data.encode()).decode())
    f.flush()
    os.fsync(f.fileno())

You can simplify to avoid converting unicode<->bytes:

#!/usr/bin/env python3
import base64, os, sys
with open('/home/user/.tfc_from_relay_<timestamp>', 'wb+') as f:
    data = sys.stdin.buffer.read()[:-1]  # Remove LF
    f.write(base64.b64encode(data))
    f.flush()
    os.fsync(f.fileno())

iamahuman commented 4 years ago

To copy from the mailing list archive due to #6008:

On 04/11/14 19:33, Joanna Rutkowska wrote:

On 04/11/14 19:16, Marek Marczykowski-Górecki wrote:

[offlist] Assuming two cooperating VMs, IMO it is doable to establish not only a low-bandwidth covert channel, but simple Xen shared memory, which would be a high-bandwidth channel. You need for this:

  1. the other domain ID (on both sides) - rather easy to guess, as it is an int < 100 in most cases (*),
  2. the grant ref (on the "receiving" side), also some small int, looks like <10^6 (**).

The whole operation would be:

  1. VM1 allocates some memory page and shares it to all reasonable domain IDs (100 grants or so)
  2. VM2 tries to bruteforce the domain ID of VM1 and the grant reference - about 100*10^6 tries, perhaps even fewer if some hypercall error code allows validating the domain ID (distinguishing an invalid domain ID from an invalid grant reference).

(*) If one of these domains is the netvm, it is very likely it has XID 1. (**) AFAIR that grant ref is assigned by the sharing VM's kernel. Maybe if the attacker controls the VM1 kernel, he/she has influence on the grant reference number, so can use some known value.

Ah, that's true. I wonder whether Xen XSM allows setting up a policy on which VMs could share pages with which VMs? If it did (and I would expect so), it would be the first actually useful thing in practice that XSM provides. Perhaps we should investigate this for Qubes R3?

joanna.

[Sorry for screwing up the CC -- now copying to qubes-users, as it should.]

And then there is also xenstore, which would have to be audited to ensure it does not allow one VM to publish anything readable by (select) other VMs.

So, these are probably doable things (whether via XSM or via custom patches on Xen); the question is, however, is it really worth it? To answer that question we should first estimate the real-world bandwidth through cache-based and other hard-to-eliminate channels, because if these alone provide for large communication, then it's not worth tinkering with Xen IMHO.

joanna.

iamahuman commented 4 years ago

@iamahuman

#4216 may serve as a possible mitigation for the CPU part, though.

Thanks! I'll have to keep track of the ticket and see if (once implemented) it requires manual setup (like picking dedicated CPU cores) from the end user.

The manual route is, in fact, already available: virsh vcpupin <domain> <vcpu> <cpulist> -- only this does not persist across reboots or VM shutdown/startup. That said, a true mitigation would require at least two physical CPU cores, but at that level I believe the user could just prepare dedicated hardware for themselves as well.

iamahuman commented 4 years ago

Even if the communication may be unidirectional, the receiving end can still signal the sender when the buffer is full.

IIUC all this happens way before anything that my code (below) on the receiving side does, so if, despite best efforts, DestinationVM is compromised, malware could leak data without even touching TFC's files?

Yes, it could. Exfiltration can happen anytime after dom0 approves the qrexec connection. (That is, if we ignore other side channel issues that will allow data exfiltration even without dom0 intervention anyway.)

maqp commented 4 years ago

@marmarek

You can avoid echo and shell: subprocess.Popen(['qrexec-client-vm', 'DestinationVM', 'tfc.NetworkerDestination'], stdin=subprocess.PIPE).communicate(message.encode())

This worked great! I set the executable path as absolute, and added an additional layer of encoding to the output data to prevent the receiving application from misinterpreting line feed bytes present in the random data that makes up ciphertexts/public keys.

I followed the rest of your advice too, and I'm so glad to have a more secure Qubes configuration in production now.

@iamahuman I linked to the message with Joanna's email in the threat model and listed the issues discussed in this thread.

The manual route is, in fact, already available--

I tried playing with the virsh command but it just returns error: no connection driver available for <null>. But maybe this can be left to another ticket, another time.

Again, a million thanks to both of you!

iamahuman commented 4 years ago

I tried playing with the virsh command but it just returns error: no connection driver available for <null>.

I forgot to mention you should use sudo. Bites me all the time, since most of the commands work just fine in Qubes w/o root until you dig that deep into the system...