k8sallowedreposv2: Fix Security Bypass for Image Pulling

yakirk commented 1 month ago

This is the PR containing the updates from https://github.com/open-policy-agent/gatekeeper-library/pull/529 What this PR does / why we need it: This PR fixes security issues in the current "Allowed Repositories" (k8sallowedrepos) rule that can be bypassed in several ways. The previous rule only checked for a suffix match, so possible scenarios were:

The user defined allowing image downloads from a registry like "myregistry.io" and didn't add a '/' at the end. So, the rule could be bypassed by establishing a malicious registry under myregistry.io.malicious.com/malicious-image .
The user defined allowing image downloads from the repository of a specific username on docker.io, for example docker.io/mycompany. The rule could be bypassed by establishing a malicious repository on Docker Hub like docker.io/mycompanyevil/malicious-image. This is why when a user adds a constraint to pull images from specific repositories or registries, the rule appends a '/' to the end if it's not already present. The PR also provides the option to allow only specific image names (because they require special checking). For example, in the previous rule, since the rule only checked for a suffix match, if the user defined to allow downloading only "ubuntu", it was possible to download images like "ubuntu-evil" and basically any image name that starts with the same prefix. This PR enhances security by preventing potential bypasses of the existing rule.

In this policy we distinguish between repos/registries to images - If we don't distinguish between this, this opens the ground for bypasses and security issues, which I will explain: Imagine scenarios where a user adds a constraint to allow only images from the organization "openpolicyagent" and defines the policy without a slash ("openpolicyagent"). In this case, we want to add a "/" to ensure that the pulled images are from Docker Hub and only from the repos of the user "openpolicyagent," not a different user. But lets think about the current case - if we define a constraint to allow only the "ubuntu" image, we cannot add "/", because it will break the pulling. However, if we keep using the prefix approach, an attacker can create a username on Docker Hub like "ubuntumyusername123/evil-image" and download any image they wish. This is one of the reasons why it is important to separate the cases of registry/repository and image name. Furthermore, for the image name, we need to check if the next character doesn't exist or if it contains ":" or "@".

yakirk commented 1 month ago

Hey, do I need to fix this? I'm struggling to understand what the issue is from the ci output. The generate of web documents/artifacthub/templates worked well in my testing

maxsmythe commented 1 month ago

I'm guessing <digest> is being interpreted as an HTML tag. You may want to switch to notation that cannot be parsed as HTML?

Also, v2 should not replace v1, but be a new template in addition to v1. This is to avoid breaking users relying on the old version.

In practice this means copying the v1 directory into a directory with a v2 suffix, which should be sufficient.

Responding to your comments:

The previous rule only checked for a suffix match, so possible scenarios were:

It's important to note that the previous rule required a prefix match, suffix match would be logically incorrect. Also, suffix/prefix are used inconsistently in the Rego source code of this change. That would need to be fixed.

The user defined allowing image downloads from a registry like "myregistry.io" and didn't add a '/' at the end. So, the rule could be bypassed by establishing a malicious registry under myregistry.io.malicious.com/malicious-image .

Correct, however this could also be fixed by the user adding a slash to their prefix. We should arguably change the parameter name from repos to allowedPrefixes or similar to make it more clear how the value is being used to put the user on notice. Adding guidance in the doc string and improving examples may also be helpful.

The user defined allowing image downloads from the repository of a specific username on docker.io, for example docker.io/mycompany. The rule could be bypassed by establishing a malicious repository on Docker Hub like docker.io/mycompanyevil/malicious-image.

This seems like an equivalent concern to above, with an equivalent fix.

This is why when a user adds a constraint to pull images from specific repositories or registries, the rule appends a '/' to the end if it's not already present.

Understood that that is what this PR currently does, but the question is should this PR do that, or merely make the logic more clear? One concern about being too fancy is that we grow code complexity and imply an understanding of the docker spec that the code does not have (which was the root of the original issue you highlighted. Here is the code it takes to truly semantically parse an image spec).

Docker image paths are not super well-defined, so I don't want to assume we will be able to cover all edge cases. For instance, did you know that tag and digest can be specified simultaneously (though this is inconsistently supported)?

Suffix format is not directly on point in this example, except that we assume we know all valid separators. Saying "we're just checking a prefix you provide" makes it clear what our logic is doing and will never bitrot (it's tied to our logic, not the image name spec)

The PR also provides the option to allow only specific image names (because they require special checking). For example, in the previous rule, since the rule only checked for a suffix match, if the user defined to allow downloading only "ubuntu", it was possible to download images like "ubuntu-evil" and basically any image name that starts with the same prefix.

Extra functionality (exact matching) seems like a side issue that I'd defer until there is demand. I'm not sure if there are people who have a strong opinion about which repository to use but are okay with unqualified images?

Maintaining prefix checking is compatible with adding these other checks later, if there is need.

This PR enhances security by preventing potential bypasses of the existing rule.

We are not arguing whether something can be improved, but how it should be improved.

In this policy we distinguish between repos/registries to images - If we don't distinguish between this, this opens the ground for bypasses and security issues, which I will explain: Imagine scenarios where a user adds a constraint to allow only images from the organization "openpolicyagent" and defines the policy without a slash ("openpolicyagent"). In this case, we want to add a "/" to ensure that the pulled images are from Docker Hub and only from the repos of the user "openpolicyagent," not a different user. But lets think about the current case - if we define a constraint to allow only the "ubuntu" image, we cannot add "/", because it will break the pulling. However, if we keep using the prefix approach, an attacker can create a username on Docker Hub like "ubuntumyusername123/evil-image" and download any image they wish. This is one of the reasons why it is important to separate the cases of registry/repository and image name. Furthermore, for the image name, we need to check if the next character doesn't exist or if it contains ":" or "@".

This paragraph seems to duplicate points already raised?

yakirk commented 4 weeks ago

Hey, @maxsmythe, Thanks for the answer!

I will take care of the parse. You were right, I made a typo by mentioning 'suffix' instead of 'prefix' in the comments.

I am confused about the answer. What should I do with this PR? Should I open a new rule (in a new folder, etc.) for those interested in a more strict policy, improve the documentation of the current rule, and improve the parameter name of the current rule, or just close it?

If we go with this approach, there will still be scenario that can be bypassed. For example, if a user lists an image like 'ubuntu' (without mentioning the ':' or '@' character), which we have seen in some constraints yaml, it will be possible to create a malicious image like ubuntumyusername/malicious_image on DockerHub.

Looking forward to your update. Yakir

maxsmythe commented 3 weeks ago

Hi @yakirk,

Thank you for the PR!

Sorry for the unclear call-to-action. If we agree, I think there are a few things:

It looks like repos is correctly documented for the v1 object in that it says "The list of prefixes a container image is allowed to have"
We can cp -r allowedrepos allowedreposv2 (it looks like we didn't do this for selinux. Maybe we should have, but selinux constraint template may also predate the library)
It sounds like you have evidence of times when exact matching would be helpful.

I think we have some options for what to change and how.

If you like the exemptImages use of glob-like syntax

We could re-use that syntax here (maybe call the parameter allowedImages)

We could also re-use the logic that evaluates it:

https://github.com/open-policy-agent/gatekeeper-library/blob/71eacda9a93255680b4b0a4eadfb95c97b26af2e/library/pod-security-policy/seccomp/template.yaml#L265-L281

OR

We could rename the repos parameter allowedPrefixes and add an allowedExactMatches parameter (I hate the allowedExactMatches name, but cant think of a better one right now).

We could also extend V1 with either edit, but then the misleading parameter name is still present.

@ritazh @sozercan @JaydipGabani (one of whom would also need to approve the change) Any preferences?

yakirk commented 3 weeks ago

Hey, Thanks for the answer.

I don't want to overcomplicate things or the solution. It's also possible to just edit the documentation for the v1 rule and mention the risks if users don't declare the existing prefixes they want to enforce with this policy. For example, if you want to pull an image from a specific domain, you must add a "/" at the end. Or if you want to use a specific repository in Dockerhub, you must also add a "/" at the end (otherwise, it is possible to open a "malicious" username with the same prefix) to help users understand the rule and its risks.

As you mentioned, it is also possible to add a v2 rule (though it's not mandatory), but it would be good for people who want a stricter policy. Regarding the new rule, it doesn't matter if we use a glob-like syntax or "allowedPrefixes" and "allowedExactMatches" - in any case, we must separate between the registry/repository and the image name because they behave differently, and I'm not sure that two parameters like "allowedExactMatches" and "allowedPrefixes" can solve this or glob-like syntax.

maxsmythe commented 3 weeks ago

No objections to improving the comment string more, nor adding additional parameters to the current constraint.

Regarding the new rule, it doesn't matter if we use a glob-like syntax or "allowedPrefixes" and "allowedExactMatches" - in any case, we must separate between the registry/repository and the image name because they behave differently, and I'm not sure that two parameters like "allowedExactMatches" and "allowedPrefixes" can solve this or glob-like syntax.

I'm not sure exactly what you're saying here, but it sounds like you are either not sure how to compare the options or arguing maybe we need something more specific than glob or exact/prefix matching?

As to how exact/prefix vs. glob-style compare... The two options are logically identical, it's more a question of UX and which is easier to use/explain.

If you were suggesting we need more nuanced behavior...

exact-match behavior is needed for strings like "ubuntu" , as you point out.

Cases like "ubuntu:*" could be satisfied via a prefix match.

Here is another attack. Let's say someone creates a "myregistry.com" image that a user wants to allow (a bad idea, but possible).

Using this PR as-written, a user would add "myregistry.com" to the allowed images parameter. A malicious user could exploit the template logic by pretending a port is a tag (myregistry.com:80/untrusted/something-sinister would satisfy the image check).

By being explicit about what our logic is doing and not implicitly suggesting the logic is parsing things more semantically than is possible, users are more likely to spot this edge case and avoid it.

To use the glob example: myregistry.com would be understood to be an exact match. myregistry.com:* (what a user would specify if they wanted to allow custom tags) should immediately raise flags if the host-like shape of the image name hasn't already.

yakirk commented 3 weeks ago

Hey Max, Thanks for the clarification—I understand now. I'll enhance the documentation for the v1 rule and also create a new allowedreposv2 role that uses glob-style comparison for those who want to define stricter policies. Does that sound good? It might take me some time. Thanks!

JaydipGabani commented 3 weeks ago

@maxsmythe @yakirk Thanks for the detailed discussions. I like the idea of v2 policy with updated field names better than extenteding v1.

maxsmythe commented 3 weeks ago

Thanks for the discussion and the fix!

open-policy-agent / gatekeeper-library

k8sallowedreposv2: Fix Security Bypass for Image Pulling #538