Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/
CVE-2024-21626: container breakout through process.cwd trickery and leaked fds #4080

Closed miwithro closed 4 months ago

miwithro commented 5 months ago

https://github.com/opencontainers/runc/security/advisories/GHSA-xr7r-f8xq-vfvv

Summary

In runc 1.1.11 and earlier, due to an internal file descriptor leak, an attacker could cause a newly-spawned container process (from runc exec) to have a working directory in the host filesystem namespace, allowing for a container escape by giving access to the host filesystem ("attack 2"). The same attack could be used by a malicious image to allow a container process to gain access to the host filesystem through runc run ("attack 1"). Variants of attacks 1 and 2 could also be used to overwrite semi-arbitrary host binaries, allowing for complete container escapes ("attack 3a" and "attack 3b").

Strictly speaking, while attack 3a is the most severe from a CVSS perspective, attacks 2 and 3b are arguably more dangerous in practice because they allow for a breakout from inside a container as opposed to requiring a user to execute a malicious image. Attacks 1 and 3a are scored higher because being able to socially engineer users is treated as a given for UI:R vectors, despite attacks 2 and 3b requiring far more minimal user interaction (just reasonable runc exec operations on a container the attacker has access to). In any case, all four attacks can lead to full control of the host system.

Patches

runc 1.1.12 has been released, and includes patches for this issue. Note that there are four separate fixes applied:

  1. Checking that the working directory is actually inside the container by checking whether os.Getwd returns ENOENT (Linux provides a way of detecting if cwd is outside the current namespace root). This explicitly blocks runc from executing a container process when inside a non-container path and thus eliminates attacks 1 and 2 even in the case of fd leaks.
  2. Close all internal runc file descriptors in the final stage of runc init, right before execve. This ensures that internal file descriptors cannot be used as an argument to execve and thus eliminates attacks 3a and 3b, even in the case of fd leaks. This requires hooking into some Go runtime internals to make sure we don't close critical Go internal file descriptors.
  3. Fixing the specific fd leaks that made this bug exploitable (mark /sys/fs/cgroup as O_CLOEXEC and backport a fix for some *os.File leaks).
  4. In order to protect against future runc init file descriptor leaks, mark all non-stdio files as O_CLOEXEC before executing runc init.
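The last fix (marking non-stdio files as close-on-exec) can be illustrated with a minimal Python sketch. This is not runc's code (runc is written in Go); the function name mark_non_stdio_cloexec is hypothetical, and the sketch assumes a Linux host where /proc/self/fd lists the open descriptors of the current process.

```python
import os


def mark_non_stdio_cloexec(fd_dir="/proc/self/fd"):
    """Mark every open fd above stderr as close-on-exec; return the fds changed."""
    changed = []
    for name in os.listdir(fd_dir):
        fd = int(name)
        if fd <= 2:
            continue  # keep stdin, stdout and stderr inheritable
        try:
            if os.get_inheritable(fd):
                # Clearing the inheritable flag sets close-on-exec on the fd.
                os.set_inheritable(fd, False)
                changed.append(fd)
        except OSError:
            # The fd that was used to list the directory is closed by now.
            continue
    return sorted(changed)
```

A descriptor handled this way is closed automatically by the kernel on execve, so it cannot be reached through /proc/self/fd/N by the program that gets executed.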

Attack Details

Attack 1: process.cwd "mis-configuration"

In runc 1.1.11 and earlier, several file descriptors were inadvertently leaked internally within runc into runc init, including a handle to the host's /sys/fs/cgroup (this leak was added in v1.0.0-rc93). If the container was configured to have process.cwd set to /proc/self/fd/7/ (the actual fd can change depending on file opening order in runc), the resulting pid1 process will have a working directory in the host mount namespace and thus the spawned process can access the entire host filesystem. This alone is not an exploit against runc, however a malicious image could make any innocuous-looking non-/ path a symlink to /proc/self/fd/7/ and thus trick a user into starting a container whose binary has access to the host filesystem.

Furthermore, prior to runc 1.1.12, runc also did not verify that the final working directory was inside the container's mount namespace after calling chdir(2) (as we have already joined the container namespace, it was incorrectly assumed there would be no way to chdir outside the container after pivot_root(2)).
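The working-directory check that the patch adds (os.Getwd returning ENOENT) has a close analogue in Python, sketched below. This is an illustration, not runc's implementation: the helper name cwd_is_reachable is hypothetical, and it assumes Linux with a libc whose getcwd reports ENOENT when the current directory is unlinked or lies outside the process root.

```python
import os


def cwd_is_reachable():
    """Return False when the cwd cannot be expressed as a path under our root."""
    try:
        os.getcwd()
    except FileNotFoundError:
        # ENOENT: the cwd was unlinked or is outside the current root.
        return False
    return True
```

A runtime would run a check like this after chdir and refuse to exec the container process when it returns False.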

The CVSS score for this attack is CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:N (8.2, high severity).

Note that this attack requires a privileged user to be tricked into running a malicious container image. It should be noted that when using higher-level runtimes (such as Docker or Kubernetes), this exploit can be considered critical as it can be done remotely by anyone with the rights to start a container image (and can be exploited from within Dockerfiles using ONBUILD in the case of Docker).
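As a rough way to spot the symlink trick from attack 1 in an unpacked image, the following Python sketch walks a root filesystem and reports symlinks whose target points into a procfs fd directory. The function name and the regular expression are assumptions made for illustration; the sketch only matches absolute targets and would miss relative links or deeper indirection, so a clean result does not prove an image is safe.

```python
import os
import re

# Matches symlink targets that point at a process's fd directory in procfs.
PROC_FD_TARGET = re.compile(r"^/proc/(self|thread-self|\d+)/fd(/|$)")


def find_proc_fd_symlinks(rootfs):
    """Return paths (relative to rootfs) of symlinks pointing into /proc/<pid>/fd."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(rootfs, followlinks=False):
        # Symlinks can show up in either list depending on what they resolve to.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) and PROC_FD_TARGET.match(os.readlink(path)):
                hits.append(os.path.relpath(path, rootfs))
    return sorted(hits)
```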

Attack 2: runc exec container breakout

(This is a modification of attack 1, constructed to allow for a process inside a container to break out.)

The same fd leak and lack of verification of the working directory in attack 1 also apply to runc exec. If a malicious process inside the container knows that some administrative process will call runc exec with the --cwd argument and a given path, in most cases they can replace that path with a symlink to /proc/self/fd/7/. Once the container process has executed the container binary, PR_SET_DUMPABLE protections no longer apply and the attacker can open /proc/$exec_pid/cwd to get access to the host filesystem.

runc exec defaults to a cwd of / (which cannot be replaced with a symlink), so this attack depends on the attacker getting a user (or some administrative process) to use --cwd and figuring out what path the target working directory is. Note that if the target working directory is a parent of the program binary being executed, the attacker might be unable to replace the path with a symlink (the execve will fail in most cases, unless the host filesystem layout specifically matches the container layout in specific ways and the attacker knows which binary the runc exec is executing).

The CVSS score for this attack is CVSS:3.1/AV:L/AC:H/PR:L/UI:R/S:C/C:H/I:H/A:N (7.2, high severity).

Attacks 3a and 3b: process.args host binary overwrite attack

(These are modifications of attacks 1 and 2, constructed to overwrite a host binary by using execve to bring a magic-link reference into the container.)

Attacks 1 and 2 can be adapted to overwrite a host binary by using a path like /proc/self/fd/7/../../../bin/bash as the process.args binary argument, causing a host binary to be executed by a container process. The /proc/$pid/exe handle can then be used to overwrite the host binary, as seen in https://github.com/advisories/GHSA-gxmr-w5mj-v8hh (note that the same #! trick can be used to avoid detection as an attacker). As the overwritten binary could be something like /bin/bash, as soon as a privileged user executes the target binary on the host, the attacker can pivot to gain full access to the host.
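The direct forms of attacks 1 and 3a show up in the OCI runtime configuration itself, in process.cwd and in the first element of process.args. The Python sketch below flags those two fields when they reference a procfs fd directory. The function name and the pattern are hypothetical, and the check does not follow symlinks inside the image, so it would not catch the symlink variant described under attack 1.

```python
import json
import re

# A path that goes through some process's fd directory in procfs.
PROC_FD = re.compile(r"/proc/(self|thread-self|\d+)/fd/")


def audit_oci_process(config):
    """Return the process fields of an OCI config that reference /proc/<pid>/fd/."""
    process = config.get("process", {})
    findings = []
    if PROC_FD.search(process.get("cwd", "")):
        findings.append("process.cwd")
    args = process.get("args") or []
    if args and PROC_FD.search(args[0]):
        findings.append("process.args[0]")
    return findings


if __name__ == "__main__":
    sample = json.loads(
        '{"process": {"cwd": "/", "args": ["/proc/self/fd/7/../../../bin/bash"]}}'
    )
    print(audit_oci_process(sample))
```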

For the purposes of CVSS scoring:

Attack 3a is attack 1 but adapted to overwrite a host binary, where a malicious image is set up to execute /proc/self/fd/7/../../../bin/bash and run a shell script that overwrites /proc/self/exe, overwriting the host copy of /bin/bash. The CVSS score for this attack is CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H (8.6, high severity).

Attack 3b is attack 2 but adapted to overwrite a host binary, where the malicious container process overwrites all of the possible runc exec target binaries inside the container (such as /bin/bash) such that a host target binary is executed and then the container process opens /proc/$pid/exe to get access to the host binary and overwrite it. The CVSS score for this attack is CVSS:3.1/AV:L/AC:L/PR:L/UI:R/S:C/C:H/I:H/A:H (8.2, high severity).

As mentioned in attack 1, while 3b is scored lower it is more dangerous in practice as it doesn't require a user to run a malicious image.
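For readers who want to check how the four vector strings map to their scores, here is a small Python sketch of the CVSS v3.1 base score formula. The metric weights and the round-up rule are taken from the CVSS v3.1 specification; the function names are my own, and the sketch ignores temporal and environmental metrics.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a vector such as 'CVSS:3.1/AV:L/...'."""
    m = dict(part.split(":") for part in vector.split("/")[1:])
    changed = m["S"] == "C"  # scope changed uses different PR weights and formula
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))
```

Feeding in the vectors quoted for attacks 1, 2, 3a and 3b gives 8.2, 7.2, 8.6 and 8.2. The only differences between them are AC, PR and A, which is why the exec-based attacks score lower despite needing less cooperation from the victim.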

Am I vulnerable?

Ubuntu 22.04 and Azure Linux use runc and are thus vulnerable to this CVE. Windows does not leverage runc and is not vulnerable.

AKS Information:

Update your Ubuntu 22.04 node image to at least 202401.17.1 which is running runc 1.1.9-2 that contains the backported fix.

Update your Azure Linux node image to at least 202401.17.2 which is running runc 1.1.9-4 that contains the backported fix.
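To check a node pool against these minimums in a script, the version comparison could look roughly like the Python sketch below. It assumes the node image version ends in a YYYYMM.DD.N style suffix (inferred from the versions quoted in this issue, not from a documented format) and compares the three numbers as a tuple. The function names are hypothetical.

```python
import re


def image_version(node_image):
    """Extract the trailing YYYYMM.DD.N version from a node image name."""
    match = re.search(r"(\d{6})\.(\d{1,2})\.(\d+)$", node_image)
    if match is None:
        raise ValueError(f"no version found in {node_image!r}")
    return tuple(int(part) for part in match.groups())


def has_fixed_image(node_image, minimum):
    """Return True when node_image is at least the minimum patched version."""
    return image_version(node_image) >= image_version(minimum)
```

For example, has_fixed_image("AKSUbuntu-2204gen2containerd-202401.17.1", "202401.17.1") returns True, while an image whose suffix is an earlier date than the minimum returns False.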

PixelRobots commented 5 months ago

Do we know when this node image will start to be released?

stevehipwell commented 5 months ago

@miwithro it doesn't look like Ubuntu 202401.17.1 is even showing up on the AKS Release Status yet? Do you have an expected rollout window for full region coverage?

humpalu commented 5 months ago

Waiting for it in West Europe Region 👯‍♀️

yylai commented 5 months ago

I am not sure if 202401.17.1 patches the issue.

According to https://snyk.io/blog/leaky-vessels-docker-runc-container-breakout-vulnerabilities/, the patch for the vulnerability is in runc v1.1.12, released on Jan 31st.

But Ubuntu 202401.17.1 has runc version 1.1.9-2, and that VHD was released on Mon Jan 22 (https://github.com/Azure/AgentBaker/blob/master/vhdbuilder/release-notes/AKSUbuntu/gen2/2204containerd/202401.17.1.txt):

moby-runc/now 1.1.9-ubuntu22.04u2 amd64 [installed,upgradable to: 1.1.11-ubuntu22.04u1]

Or am I missing something?

miwithro commented 5 months ago

@yylai I updated the text in the issue so it is clearer.

Update your Ubuntu 22.04 node image to at least 202401.17.1 which is running runc 1.1.9-2 that contains the backported fix.

miwithro commented 5 months ago

@PixelRobots @humpalu @stevehipwell we kicked off the deployment yesterday, it should be visible in the Release Tracker in the next day or so.

cpuguy83 commented 5 months ago

moby-runc version 1.1.9-2 (1.1.9-ubuntu22.04u2) and 1.1.11-2 (1.1.11-ubuntu22.04u2) both contain backported versions of the necessary patches.
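Because the fix is backported, a plain "is the version at least 1.1.12" comparison reports these packages as vulnerable. A backport-aware check could be sketched in Python as follows. The allowlist holds only the two Ubuntu builds named in the comment above, the helper names are hypothetical, and a real check would need the equivalent list for every distribution it covers.

```python
import re

# Builds named in the comment above as containing the backported patches.
BACKPORTED_BUILDS = {"1.1.9-ubuntu22.04u2", "1.1.11-ubuntu22.04u2"}
FIRST_FIXED_UPSTREAM = (1, 1, 12)


def upstream_version(package_version):
    """Parse the leading upstream X.Y.Z from a package version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", package_version)
    if match is None:
        raise ValueError(f"unrecognised version {package_version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(package_version):
    """True for upstream >= 1.1.12 or for a build known to carry the backport."""
    if package_version in BACKPORTED_BUILDS:
        return True
    return upstream_version(package_version) >= FIRST_FIXED_UPSTREAM
```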

floriankoch commented 5 months ago

What about Azure Linux?

cpuguy83 commented 5 months ago

For Azure Linux the version you want is 1.1.9-4.

floriankoch commented 5 months ago

Sorry, my question was not clear enough. What I want to know is when the fix will be in Azure Linux and when a new image will be rolled out.

jperrin commented 5 months ago

The fixed package is already available in Azure Linux. When the new image will be rolled out is controlled by the AKS team. I can't answer that portion of the question.

adhurwit commented 5 months ago

Can you please clarify if there is a multitenancy risk here for AKS?

zhangguanzhang commented 5 months ago

Steps to reproduce https://github.com/zhangguanzhang/CVE-2024-21626

miwithro commented 5 months ago

@adhurwit, so that everyone is on the same page: when customers ask about multitenancy, they are referring to the underlying physical host that AKS nodes run on top of, and to whether this attack can reach that layer. When this issue refers to the "host" in the context of AKS, it means the virtual host.

miwithro commented 5 months ago

@floriankoch the fix for Azure Linux is now rolling out.

https://github.com/Azure/AgentBaker/blob/master/vhdbuilder/release-notes/AKSCBLMarinerV2/gen2/202401.17.2.txt

UpcLeo commented 5 months ago

For Azure Linux the version you want is 1.1.9-4.

What about Ubuntu 18.04? I have seen its runc version is 1.1.7.

krishnan-r commented 5 months ago

Hi Team,

Thank you for your response.

Do we know when the Ubuntu Linux 202401.17.1 image would be available for the West Europe regions? I see that at the moment it is rolled out only to West Central US on https://releases.aks.azure.com/#tabus

ltoinel commented 5 months ago

20240109 after upgrade in France Central. https://releases.aks.azure.com/ does not seem to be up to date with the releases that are actually available?

joeldq commented 5 months ago

For Azure Linux the version you want is 1.1.9-4.

What about Ubuntu 18.04? I have seen its runc version is 1.1.7.

+1 news on this?

ygao-armada commented 5 months ago

How about Ubuntu 20.04.6 LTS with runc 1.1.7?

PixelRobots commented 5 months ago

How about Ubuntu 20.04.6 LTS with runc 1.1.7?

AKS does not support 20.04.

pauska commented 5 months ago

Would it be possible to expedite this OS release for all regions? West Europe still doesn't have this available, and it's starting to get a bit urgent..

sata-sa commented 5 months ago

For Azure Linux the version you want is 1.1.9-4.

What about Ubuntu 18.04? I have seen its runc version is 1.1.7.

+1

philwelz commented 5 months ago

Would it be possible to expedite this OS release for all regions? West Europe still doesn't have this available, and it's starting to get a bit urgent..

Created a cluster today with node image AKSUbuntu-2204gen2containerd-202401.17.1. So it is available in West Europe.

goldjg commented 5 months ago

This needs more than just a backport to moby/runc.

Our CNAPP platform is identifying Azure Kubernetes Service Nodes as vulnerable to CVE-2024-21626 even after they are upgraded to the latest image from Microsoft which has a backported fix.

Our CNAPP has the impacted packages as follows:

github.com/opencontainers/runc:v1.1.0
github.com/opencontainers/runc:v1.1.5
github.com/opencontainers/runc:v1.1.4
github.com/opencontainers/runc:v1.0.1
github.com/opencontainers/runc:v1.1.7
github.com/opencontainers/runc:v1.1.3
github.com/opencontainers/runc:v1.0.3
github.com/opencontainers/runc:v1.0.2
github.com/opencontainers/runc:v1.1.6
github.com/opencontainers/runc:v1.1.9
runc:1.1.0-0ubuntu1~20.04.1
runc:1.0.1-0ubuntu2~20.04.1
github.com/opencontainers/runc:v1.0.0-rc95
runc:1.1.7-0ubuntu1~20.04.1
github.com/opencontainers/runc:v1.1.2
runc:1.1.4-0ubuntu1~20.04.1
runc:1.1.7-0ubuntu1~22.04.1
runc:1.0.0~rc95-0ubuntu1~20.04.2
runc:1.0.0~rc10-0ubuntu1
runc:1.0.1-0ubuntu2~18.04.1
runc:1.1.0-0ubuntu1~20.04.2
runc:1.1.4-0ubuntu1~20.04.3

However, you have backported the fix to v1.1.9 in the AKS images.

Our CNAPP shows the following package information for one of our upgraded hosts:

package moby-runc 1.1.9-ubuntu22.04u2

However the package causing the CNAPP to say it's vulnerable is:

go github.com/opencontainers/runc /usr/bin/containerd v1.1.5
go github.com/opencontainers/runc /usr/local/bin/kubelet v1.1.6

So I think you need to update that too?

cpuguy83 commented 5 months ago

containerd and kubelet are not vulnerable. They import certain things from the runc Go packages but they are not vulnerable.

goldjg commented 5 months ago

containerd and kubelet are not vulnerable.

They import certain things from the runc Go packages but they are not vulnerable.

Have passed this on to the CNAPP vendor to check - thanks.

goldjg commented 5 months ago

containerd and kubelet are not vulnerable. They import certain things from the runc Go packages but they are not vulnerable.

Have had this response from Palo Alto (the CNAPP vendor):

I see the below detections for CVE-2024-21626:

Hostname: REDACTED
Type: go
Package: github.com/opencontainers/runc v1.1.6
Package path: /usr/local/bin/kubelet
Fix version: fixed in 1.1.12
Vulnerability Link: https://nvd.nist.gov/vuln/detail/CVE-2024-21626

I do not see the moby-runc package being flagged for this CVE and the host details package info tab of the attached .docx file reflects that it is properly detected as v1.1.9-ubuntu22.04u2. I also see it reflects type=package rather than type=go.

For known vulnerabilities with a CVE, we rely on the most authoritative source - for OS packages (packages that are maintained by the OS vendor, marked as type "package" in Compute), the CVE details are taken from the specific vendor feed. https://docs.prismacloud.io/en/enterprise-edition/content-collections/runtime-security/vulnerability-management/troubleshoot-vulnerability-detection

The github.com/opencontainers/runc are go type packages associated with the /usr/local/bin/kubelet path. They are correlated to the NVD configurations. https://nvd.nist.gov/vuln/detail/CVE-2024-21626

The current vulnerability scanning process detects the Go version that was used to compile the scanned binary, and reports it. These kinds of vulnerabilities are associated with the Go standard library (and not a vulnerability in the Go compiler itself) - so a binary that was compiled using an affected version is potentially vulnerable and reported.

Lastly I wanted to note that we have an internal ticket open that will enhance the above Go detection process. It is currently expected to be included in the release planned for near the end of this month.

I'm curious what you think of the response - though they are going to amend their detection process, they do still think this image is vulnerable...

cpuguy83 commented 5 months ago

The code paths required to exercise the vulnerabilities are not part of the library code used in these projects. The runc binary is vulnerable because it is the component responsible for setting up the container sandbox and executing the container image.

Projects like containerd and kubelet import certain pieces of code from the runc project for a few things, such as reading cgroup paths, but they never execute container workloads directly, which is where this issue occurs.

goldjg commented 5 months ago

Thanks for the context - Palo are saying: "Our vulnerability scanning process flags it as it could potentially be vulnerable based on the package version detected and what NVD reflects. Based on the shared thread, it appears we have confirmed that this package is not truly vulnerable to CVE-2024-21626. Our suggestion would be to add an exception in accordance with the Rule Exceptions portion of the below documentation: https://docs.prismacloud.io/en/enterprise-edition/content-collections/runtime-security/vulnerability-management/vulnerability-management-policies%23undefined I also found the below Palo Alto Networks resource that has some information regarding this CVE during my research: https://www.paloaltonetworks.com/blog/prisma-cloud/leaky-vessels-vulnerabilities-container-escape/"

So - I think we are agreed that with the backported fix, we are no longer vulnerable - but the Prisma Cloud reporting could be better cos it's gone from "definitely vuln" to "could, maybe, be vuln" but Prisma doesn't make that distinction in the tool.

microsoft-github-policy-service[bot] commented 4 months ago

Thanks for reaching out. I'm closing this issue as it was marked with "Fix released" and it hasn't had activity for 7 days.

iamvighnesh commented 1 week ago

containerd and kubelet are not vulnerable. They import certain things from the runc Go packages but they are not vulnerable.

@cpuguy83 I suppose same applies for kube-proxy?

cpuguy83 commented 1 week ago

@iamvighnesh I don't see any runc imports in kube-proxy. In all of k/k there are a few places that import runc, and for all the imports there are 2 libraries used:

  1. cgroups
  2. check if running in userns

Neither of these is affected by the CVE.