apyrgio commented 2 years ago

Primer

Dangerzone currently uses two containers (Docker containers on MacOS/Windows, Podman containers on Linux) for the conversion process:

The first container accepts a suspicious document/image as input, and produces a list of RGB files (corresponding to the pages of the document).
The second container accepts the list of RGB files, and produces the safe PDF document.

In this issue, we'll focus on protecting users against an attacker who tries to take control of the first container.

Requirements

Between the point where a conversion process reads an attacker's payload and the point where the attacker escapes to the host, there are several hurdles that the attacker must clear. In order of importance, from the innermost hurdle to the outermost, these are the requirements that Dangerzone must meet to keep a user safe:

The attacker must not gain control of a process within the container.
The attacker must not access host data from within the container.
The attacker must not access the network from within the container.
The attacker must not become root within the container.
The attacker must not escape the container.
The attacker must not access host data outside the container.
The attacker must not become user/root outside the container.
The attacker must not escape the VM.

Current situation

The protective measures that Dangerzone has in place are:

(MacOS/Windows only) Containers run in VMs.
- This is necessary, as this is the only way that Docker Desktop can run Linux containers on non-Linux OSes.
The conversion process in the container is not / cannot become root.
- The Dangerzone container image has a regular user in it, and the conversion process runs with their UID.
- No process can elevate itself to root, since no new privileges can be acquired.
- See the relevant discussion: https://github.com/freedomofpress/dangerzone/issues/169
Containers have no capabilities (see capabilities(7)).
- We drop all capabilities when we create a new container, as we don't need them.
- See the relevant PR: https://github.com/freedomofpress/dangerzone/pull/183
Containers have no access to the network.
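
As a rough illustration, the measures above map onto standard `docker run` / `podman run` flags. The sketch below assembles such a command line in Python. The function name, image name, command, and UID are hypothetical placeholders, not Dangerzone's actual code.

```python
from typing import List


def build_run_command(runtime: str, image: str, command: List[str], uid: int = 1000) -> List[str]:
    """Assemble a container invocation with the hardening flags listed above.

    `runtime` is "docker" or "podman"; `image`, `command`, and `uid` are
    placeholders chosen for illustration only.
    """
    return [
        runtime, "run",
        "--network=none",                    # no access to the network
        "--cap-drop=all",                    # drop every capability
        "--security-opt=no-new-privileges",  # processes cannot gain new privileges
        "--user", str(uid),                  # run as a regular (non-root) user
        image,
        *command,
    ]


# Hypothetical image and command, used only to show the resulting argv.
cmd = build_run_command("podman", "example.org/converter", ["convert"])
print(" ".join(cmd))
```

Keeping the flags in one list makes it easier to review which protections are actually applied on each platform.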

The question here is: can we improve on this situation even more?

Subtasks

Previous issues

apyrgio commented 2 years ago

Suggestions

We'll take each requirement mentioned above and suggest a protective measure that can help meet it:

1. The attacker must not gain control of a process within the container

Attackers will do so by exploiting a vulnerability in the image. We need to continuously monitor our container image for CVEs.

(Tracked in https://github.com/freedomofpress/dangerzone/issues/222)

2. The attacker must not access host data from within the container

The container has limited visibility into files on the host, since we only mount two temporary directories in it (for input and output files). However, we can investigate whether it is possible to avoid mounting procfs and sysfs in it. See https://unit42.paloaltonetworks.com/breaking-docker-via-runc-explaining-cve-2019-5736/ for a Docker exploit that relied heavily on procfs.

(Tracked in https://github.com/freedomofpress/dangerzone/issues/223)
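
One way to check what a container can currently see is to inspect its mount table. The sketch below parses text in the `/proc/self/mountinfo` format and reports proc and sysfs mounts. The sample text and the function name are made up for illustration, and reading the real file obviously only works while procfs is still mounted.

```python
from typing import List, Tuple

# Filesystem types this issue suggests we try to avoid mounting.
PSEUDO_FS = {"proc", "sysfs"}


def find_pseudo_mounts(mountinfo: str) -> List[Tuple[str, str]]:
    """Return (mount point, filesystem type) pairs for proc/sysfs mounts."""
    results = []
    for line in mountinfo.splitlines():
        # Each line has a " - " separator; the filesystem type follows it.
        if " - " not in line:
            continue
        before, after = line.split(" - ", 1)
        mount_point = before.split()[4]  # fifth field is the mount point
        fs_type = after.split()[0]
        if fs_type in PSEUDO_FS:
            results.append((mount_point, fs_type))
    return results


# Made-up sample in mountinfo format, not taken from a real container.
SAMPLE = """\
22 28 0:21 / /sys rw,nosuid,nodev,noexec,relatime shared:7 - sysfs sysfs rw
23 28 0:22 / /proc rw,nosuid,nodev,noexec,relatime shared:13 - proc proc rw
28 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
"""

pseudo_mounts = find_pseudo_mounts(SAMPLE)
print(pseudo_mounts)
```

Inside a container, the same function could be fed the contents of `/proc/self/mountinfo` to see what is mounted before and after a configuration change.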

3. The attacker must not access the network from within the container

This should be fully covered.

4. The attacker must not become root within the container

This should already be covered by the fact that the user within the container is a regular user, and the fact that we use --security-opt=no-new-privileges. However, to further reduce the attack surface, we could follow the latest CIS Docker advisories (official link, mirror):

4.3 - Ensure that unnecessary packages are not installed in the container
4.8 - Ensure setuid and setgid permissions are removed
5.12 - Ensure that the container's root filesystem is mounted as read only

(Tracked in https://github.com/freedomofpress/dangerzone/issues/224)
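
For advisory 4.8, a first step is to find which files in the image carry setuid or setgid bits. The following is a minimal sketch that walks a directory tree and lists such files. The helper name is hypothetical, and the demo runs against a temporary directory rather than a real image. For advisory 5.12, both Docker and Podman accept a `--read-only` flag on `run`.

```python
import os
import stat
import tempfile
from typing import List


def find_setid_files(root: str) -> List[str]:
    """Return regular files under `root` that have the setuid or setgid bit."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(mode) and mode & (stat.S_ISUID | stat.S_ISGID):
                found.append(path)
    return sorted(found)


# Demo on a throwaway directory with one plain file and one setuid file.
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "plain")
    suid = os.path.join(tmp, "suid")
    for path in (plain, suid):
        with open(path, "w") as f:
            f.write("")
    os.chmod(plain, 0o755)
    os.chmod(suid, 0o4755)  # set the setuid bit
    demo_result = [os.path.basename(p) for p in find_setid_files(tmp)]

print(demo_result)
```

In an image build, the reported paths could then be reviewed and have the bits removed with `chmod`.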

5. The attacker must not escape the container

The most important attack surface in a container is the Linux kernel itself, and it is what most attackers will try to exploit. A common countermeasure is to drop all capabilities, which we have already done, and to apply a strict seccomp profile.

(Tracked in https://github.com/freedomofpress/dangerzone/issues/225)
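
To make the idea of a strict seccomp profile concrete, here is a sketch that generates a deny-by-default profile in the JSON format that Docker and Podman accept via `--security-opt seccomp=<path>`. The syscall list is purely illustrative and far too small for a real conversion workload, and the function name is hypothetical.

```python
import json
from typing import Any, Dict, List


def make_seccomp_profile(allowed_syscalls: List[str]) -> Dict[str, Any]:
    """Build a profile that fails every syscall except the allow-listed ones."""
    return {
        # Any syscall not matched below returns an error to the caller.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {"names": sorted(allowed_syscalls), "action": "SCMP_ACT_ALLOW"},
        ],
    }


# Illustrative allow-list only; a real one would come from tracing the
# conversion process and would be much longer.
EXAMPLE_SYSCALLS = ["read", "write", "openat", "close", "exit_group"]

profile = make_seccomp_profile(EXAMPLE_SYSCALLS)
profile_json = json.dumps(profile, indent=2)
print(profile_json)
```

Generating the profile from a reviewed list, instead of editing JSON by hand, keeps the allow-list easy to diff between releases.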

6. The attacker must not access host data outside the container

There is a lot of room for improvement in this area. Sorted by access control strictness:

SELinux (tracked in https://github.com/freedomofpress/dangerzone/issues/226)
AppArmor (tracked in https://github.com/freedomofpress/dangerzone/issues/227)
User namespaces (https://github.com/freedomofpress/dangerzone/issues/228)

7. The attacker must not become user/root outside the container

This depends on how up-to-date and hardened the Linux kernel on the container host is. On Linux platforms, there isn't much we can do. On Windows/MacOS platforms, the Linux kernel runs within a VM, so we may be able to take advantage of this.

8. The attacker must not escape the VM

Docker Desktop needs to mount the whole host in the guest VM, in order to use it for container mounts. See https://community.atlassian.com/t5/Trust-Security-articles/Hiding-malware-in-Docker-Desktop-s-virtual-machine/ba-p/1924743 on how this can be exploited.

An alternative to this would be to use MicroVMs. MicroVMs are small virtual machines that offer kernel isolation and a small resource footprint, at the expense of some performance.

MicroVMs have already been paired with containers on some projects:

Kata Containers: https://katacontainers.io/
Krun: https://github.com/containers/krunvm

Note that this technology is still nascent, and not multi-platform.

apyrgio commented 2 years ago

Further reading

Center for Internet Security (CIS) [3rd-party link]: Has a pretty nice write-up on how to harden containers, from image creation to the runtime. There are also 3rd-party tools to audit the current configuration:
- https://github.com/docker/docker-bench-security
- https://github.com/containers/podman-security-bench
NIST (https://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-190.pdf): Lesser known guidelines, which overlap with the above.
NSA (https://media.defense.gov/2022/Aug/29/2003066362/-1/-1/0/CTR_KUBERNETES_HARDENING_GUIDANCE_1.2_20220829.PDF): Focused on hardening Kubernetes clusters, but has some advice on Pods (containers) as well.
OWASP (https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html)

deeplow commented 1 year ago

Some good insights shared by @kernelmethod on the issue https://github.com/freedomofpress/dangerzone/issues/227:

From my understanding, AppArmor isn't present in Fedora-based systems. Instead, they have SELinux, which adds an extra component to manage, should we go with this strategy. However, managing custom SELinux policies is pretty advanced and probably unmanageable for a small project like this, it seems.

Indeed -- I'm not an SELinux expert by any means, but my limited experience with it has not been great. Managing SELinux policies can be quite challenging (and that's certainly the reputation that SELinux has).

The above is just to say that we are looking to understand how best to move forward holistically from a defense-in-depth perspective, on what will most increase security for everyone using Dangerzone. #225 is looking like a promising avenue, as all hosts would benefit from it.

Yes, that's a completely fair consideration. While seccomp is a great mechanism and should absolutely be implemented if possible, I'll mention that it's fairly coarse-grained in ways that AppArmor is very good at covering. For instance, as soon as you allow openat you open the door for a process to read or write any arbitrary file, whereas AppArmor strictly allow-lists the files that can be read/written/executed.

Seccomp profiles also apply at a container level, rather than a process level. Since DangerZone uses a fairly complex suite of software I would imagine that this would require some fairly liberal seccomp profiles, which must subsequently be applied to all processes in the same container. By contrast, AppArmor profiles apply at a process level, and you can define different permissions for each program that runs in DangerZone's containers.

That said, if you know more about AppArmor, about other challenges that we're not yet considering, and about how to get around some of the limitations raised in the above discussions, your input would be appreciated.

I think that by far the biggest short-term challenge will be the aforementioned issue containers/podman#15874. Unfortunately the only immediate workarounds I can think of would be to drop Podman altogether and run Docker rootlessly, or to skip Podman and just execute the container directly using runc / crun (to which Podman acts as a frontend).

One other thing that's worth considering -- I saw that in #225 (comment), you mention that it's difficult to detect when a seccomp policy violation occurs. With AppArmor (and I believe SELinux), doing this detection is a lot easier -- policy violations automatically get written to the kernel ring buffer. For example (with a restrictive test profile called foo):
kernelmethod@debdev:~$ aa-exec -p foo -- cat /etc/passwd
cat: /etc/passwd: Permission denied

kernelmethod@debdev:~$ sudo dmesg | grep 'apparmor="DENIED"'
[ 1757.710390] audit: type=1400 audit(1689495068.794:26): apparmor="DENIED" operation="open" profile="foo" name="/etc/passwd" pid=1392 comm="cat" requested_mask="r" denied_mask="r" fsuid=1000 ouid=0
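
Since these denials land in the kernel ring buffer as `key=value` pairs, they are easy to pick apart programmatically. Below is a small sketch that parses a line like the one above into a dictionary. The function name is hypothetical, and the parser only assumes the `key=value` / `key="value"` layout visible in this example.

```python
import re
from typing import Dict, Optional

# Matches key=value pairs where the value is either quoted or a bare token.
_FIELD_RE = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def parse_apparmor_denial(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of an AppArmor DENIED audit line, or None otherwise."""
    if 'apparmor="DENIED"' not in line:
        return None
    fields = {}
    for key, raw, quoted in _FIELD_RE.findall(line):
        # Use the unquoted content when the value was wrapped in quotes.
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


# The audit line shown in the dmesg output above.
LINE = (
    '[ 1757.710390] audit: type=1400 audit(1689495068.794:26): '
    'apparmor="DENIED" operation="open" profile="foo" name="/etc/passwd" '
    'pid=1392 comm="cat" requested_mask="r" denied_mask="r" fsuid=1000 ouid=0'
)

denial = parse_apparmor_denial(LINE)
print(denial["profile"], denial["operation"], denial["name"])
```

A wrapper around the container runtime could use something like this to surface policy violations to the user instead of failing silently.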

freedomofpress / dangerzone

Defense in Depth #221
