containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0
23.04k stars 2.35k forks source link

Disable repository search #4549

Closed jasinner closed 4 years ago

jasinner commented 4 years ago

/kind bug

Description Container registry image squatting and inconsistent repository search order can lead to a user pulling an unintended image when using a short name. For example users going from Centos 7->8 will find that Quay is now ahead of Dockerhub, so while intending to pull an image using a shortname such as 'orga/myimage:mytag' will actually get the image from 'quay.io/orga/myimage:mytag before Dockerhub if an attacker has squatted the orga organisation on quay.io.

Steps to reproduce the issue:

  1. A user of Centos 7 using libpod creates an Organization ("OrgA") on Dockerhub and pushes/pulls images using the shorthand notation ("orga/myimage:mytag").

  2. Upgrade that box to Centos8 or try to pull the using the same short name on a different Centos8/RHEL8 box

Describe the results you received:

It will pull the image from quay.io/orga/myimage:mytag instead of docker.io/orga/myimage:mytag.

Describe the results you expected:

Short names don't resolve at all, and full repository names are required to ensure the image is always pulled from the correct repository.

mheon commented 4 years ago

I can see allowing this as an option, but the use of shortnames is pervasive in the container ecosystem. Disabling them entirely would be a massive breaking change, and a massive incompatibility with other tools. We can't do that by default.

A potential alternative might be to hit all search registries in parallel looking for images, and report to the user if there is more than one potential match, requiring them to explicitly decide which option to pull, but this also has serious potential for being a breaking change - a lot of major images are going to be mirrored across multiple registries (fedora:latest is available on the Docker Hub, via Quay, and via registry.fedoraproject.org, all of them seemingly legitimate, and the typical user doesn't want to be bothered to select which of those identical images to use).

There are also some simple things we can do to reduce the impact of this short of major breaking changes - making it more clear exactly what full name is being pulled, and ensuring the search registry order is identical in everything we ship.

@baude @rhatdan @mtrmac Thoughts?

rhatdan commented 4 years ago

Yes I agree with Matt. Not much we can do. If people do not want short names then you can increase the security by removing all registries from registries.conf.

We can reorder the registries from most trusted to least in each distros registries.conf

IE On RHEL start with the RHEL registry and end with Docker.io and quay.io.

mtrmac commented 4 years ago

IE On RHEL start with the RHEL registry and end with Docker.io and quay.io.

Aren’t some new product releases now published primarily on Quay.io?

mtrmac commented 4 years ago

My personal opinion is that if you know which registry the image should come from (which should, really, be always the case), just always ask for that registry explicitly. (Admittedly it’s very easy to unintentionally slip up.)

Having “Short names don’t resolve at all” would be consistent with that approach, but, as @mheon says, rather disruptive and not a plausible default at this time. (To be fair, OpenShift does effectively force explicit specification of non-docker.io registries already; and the docker:// syntax of c/image has always treated short names as docker.io-equivalents; so, it’s not entirely inconceivable. Still, there’s just too much history with the current search list on Fedora-based platforms.)

Reading the Libpod code, it does seem that setting the search list to empty should cause short names to be rejected (rather than defaulted to docker.io). Buildah mostly behaves the same, except that it tries localhost/ as well. I am unsure about the usefulness of that, but at least it does not seem very risky WRT squatting.

As far re-ordering the search list between RHEL7 and RHEL8 and changing the registries that one has to trust/worry about, that’s definitely a valid complaint, but (besides, strictly speaking, belonging to the RH support channels), without a clear consensus on registry preferences (one might very well prefer docker.io/mysql to redhat.io, but redhat.io/rhel to docker.io) I’m not optimistic about committing to a policy for managing the search order in the future that can be all of: clear, well-understood, and useful, all at the same time.

bgeesaman commented 4 years ago

As it stands today, there is very practical attack on the RH/Libpod ecosystem in terms of squatting on DockerHub in front of Quay orgs and more readily, squatting on identically named Quay orgs for hundreds/thousands of companies only living in DockerHub. Orgs that belong to many companies serving up their product images in only DockerHub are largely unaware of the risk RH ecosystem users have if they fail to use full paths in their docs and manifests.

Suggestions: ASAP:

On the next release:

On the second major release:

baude commented 4 years ago

Container runtimes like podman are designed to be used with shortnames for container images because they are interactive with users unlike runtimes that depend on YAML and the like for input. Our users expect short names to work. As @rhatdan has said, if this behavior is not desirable for users, they can remove all registries list in the registries.conf and use fully qualified image names.

For changes to RHEL products, the appropriate forum to request changes is in bugzilla and not upstream; though I suspect the answer will be generally the same.

Lastly, I dont see encouraging people to squat on repo/image names as being a tenable solution to "trust" a container image.

mtrmac commented 4 years ago

Container runtimes like podman are designed to be used with shortnames for container images because they are interactive with users unlike runtimes that depend on YAML and the like for input.

That would be a bit easier to accept if CRI-O didn’t have the same search code semantics (although OpenShift can’t actually usefully use that with credentials!), and didn’t read the same registries.conf configuration, and if OpenShift didn’t populate it.

Sure, one way to read this is a re-confirmation that the expectation is so pervasive – another way to read this is that the vulnerability is so pervasive.

rhatdan commented 4 years ago

That again is an OpenShift decision. I am not sure why OpenShift populates the file. Is this an active decision or just that by default containers-common comes with a populated file. I believe the by default Kubernetes uses long name images, and the short names are only used by crictl. crictl is mainly a debugging tool.

@mrunalp @umohnani8 @haircommander is this a correct assumption?

bgeesaman commented 4 years ago

if this behavior is not desirable for users, they can remove all registries list in the registries.conf and use fully qualified image names.

The behavior of having two “free” registries in the search order by default means there is a very simple attack vector that many/most folks in the RH ecosystem are exposed to until they purposefully configure it not to be the case. That, historically, doesn’t bode well.

Lastly, I dont see encouraging people to squat on repo/image names as being a tenable solution to "trust" a container image.

IMO, it’s not so much about image trust so much as it is protecting customer trust in their brand. Its a quick, cheap, and practical defense against org squatters, and they don’t have to mirror tags. Just owning the org is sufficient.

I realize disabling short paths is not desired for what users expect, but the design decision to have two free registries in a search order by default opens up the RH ecosystem to real attack unless they know to reconfigure their systems in a specific way. I submitted the issue to RH security and Jason kindly made this issue here, and I fear the talk of usability and user expectations is happening without a full understanding of the exposure your customers are operating under right now. Org/tag squatting is real, free, and a simple way to get lots of malicious images on RH ecosystem boxes in very short order.

bgeesaman commented 4 years ago

My concern is that the current stance on the default configuration of search order and short names allows for real attacks that are a) simple, b) cheap, and c) very damaging to you and the orgs/brands.

mtrmac commented 4 years ago

That again is an OpenShift decision. I am not sure why OpenShift populates the file. Is this an active decision or just that by default containers-common comes with a populated file.

It’s an active configuration, set up in https://github.com/openshift/machine-config-operator/blob/master/templates/worker/01-worker-container-runtime/_base/files/container-registries.yaml .

Luckily, that one “only” allows Red Hat to squat on Docker Hub references. 🤔

I believe the by default Kubernetes uses long name images, and the short names are only used by crictl. crictl is mainly a debugging tool.

Most of Kubernetes intentionally does not care about the structure of image names. If users specify a pod with a short name, that short name will be passed through all the way to CRI-O, conceptually with no modification. OTOH the Kubelet does do some parsing of that value, primarily to locate the necessary credentials for pulling that image, and that part does not (~can not) deal with ambiguous short names correctly.

The practical consequences are that:

vrothberg commented 4 years ago

If it's really such a big deal, I am very angry that we are having this discussion in public. There are proper channels both for this project and Red Hat to have a private conversation and discuss a solution or reaction. Please keep that in mind for future security issues, especially when being convinced it's a severe one and actively being exploited.

The repository https://github.com/bgeesaman/registry-search-order was created before the libpod maintainers have even been contacted on the security list, which at the least is very unfortunate. So I urge to put this entire issue into context and perspective, and to calmly evaluate the subject from now on.

For the record: the discussed problem does not exist when really caring about what's being pulled (e.g., pull-by-digest, fully-qualified names, authentication, signing, etc.). The problem only applies when pulling short names (e.g., busybox, alpine:3.6, etc.). Podman (and Buildah) will actually report which image was pulled, so this is not something the tools are hiding in any way. CRI-O can be squatted with docker.io but registry.access.redhat.com is contacted first which makes it less bad, at least.

Is squatting happening in the wild?

Yes, as shown in this issue.

How can we prevent that?

Well there are many approaches, so let's move the discussion to that.

  1. Edit /etc/containers/registries.conf to the desired search registries as explained in man containers-registries.conf.
  2. Always use fully-qualified image names, ideally pull by digest.
  3. When distributing images, sign them and push them to trusted registries.

Should search-registries be disabled?

I don't know. Many users are using this feature, so disabling it could break existing workloads. I believe the order and the registries themselves matter. Docker Hub and Quay can obviously be misused. They can also be misused for typos (e.g., repo/image:tag vs $typo/image:tag).

I never liked the pull behavior of search-registries but I liked the ability to search more than docker.io and that's something incredibly useful for users. There are more registries than Docker Hub, and users should be able to browse them and be able to configure this behavior.

bgeesaman commented 4 years ago

I disclosed to security@ on 10/27 in what I thought was a very detailed accounting of the risks, both theoretical and practical.

Now that the practical risks are actually understood, I’ll delete my comment.

mtrmac commented 4 years ago

If it's really such a big deal, I am very angry that we are having this discussion in public. There are proper channels both for this project and Red Hat to have a private conversation and discuss a solution or reaction. Please keep that in mind for future security issues, especially when being convinced it's a severe one and actively being exploited.

For another perspective, I’m personally not upset at all.

The code is public, and we’ve been discussing the same thing (although thinking of it in terms of “you don’t know which fedora image you’ll get) in public for quite some time. No vulnerability in Open Source software has ever been a secret to a determined adversary.

Overall, it’s always better to receive vulnerability reports than not to receive them, even if the method is not quite ideal. Vulnerabilities would never go away just because we discouraged the reporters.

mtrmac commented 4 years ago

Ultimately, if we decide that user input must be fully specified (and I can’t immediately see any alternative, nor any dissent to the principle of the thing), we can start today: output loud warnings on stderr every time the search code is used on a short name.

Likely enough users will be annoyed, sure, and they may well ask for another way to get the same thing done with not more typing. First of all, we can just refuse, and tell them that fully-qualified names are good for them – but that won’t go down well I suppose.

From time to time I’ve been toying with the idea of not having search, but having aliases (for repository names, not including tags/digests), configured in registries.conf. They must have a clearly different syntax (no context-dependent / configuration-dependent guessing), but that’s easy enough - e.g. require them to start with an upper-case letter, or maybe a quoted $ or whatever:

[[aliases]]
Fedora = registry.fedoraproject.org/fedora
RHEL8 = redhat.io/rhel8
PostgreSQL = # ??? Well, now what?

The downside to aliases is that we would actually have to explicitly make decisions to prioritize one registry or the other – or, if we don’t ship a default alias, no ISV should ever ship their own software that refers to aliases, or it will break if used with a different alias configuration.

Also, I’m worried that we don’t quite have the time to design and widely implement before the current situation becomes untenable.


Ultimately, though, all of this runs into a single point of contention: we can’t both suggest alias docker=podman and redirect short names (which are safe in docker pull) to the problematic search mechanism. We have to give up on one of these.

mheon commented 4 years ago

I think that, regardless of the eventual outcome of discussions about shortnames, things like aliases are a good idea to implement - though I don't know if we want to include any by default, for the reasons you gave, it seems like a good one-time-fix for sysadmins who only using a few, common images, and want to pull them from a guaranteed source or a guaranteed SHA. I could see Openshift potentially being interested in this.

I'm also still in favor of implementing an opt-in mode where decisions on which image to pull if a shortname is used and multiple images in search registries match the query, where the user will be required to manually select which image to use. This will break scripts and automation, encouraging the use of fullnames there, but preserve much more of the user-interactive experience.

mtrmac commented 4 years ago

Detecting collisions is definitely a good idea. Enabling it by default would be annoying for fedora at least, but opt-in mitigates that concern.

dbaker-rh commented 4 years ago

We can reorder the registries from most trusted to least in each distros registries.conf

IE On RHEL start with the RHEL registry and end with Docker.io and quay.io.

This is problematic. both docker.io AND quay.io must be treated as "untrusted repos". Even though quay.io is used to ship certain official Red Hat code, it also hosts untrusted internet content. Docker.io hosts "verified" registries as well as untrusted internet content. Without any co-ordination between the registries, having both untrusted repos present in the search path will always be problematic and greatly amplifies the problems of namespace landgrabs or "typosquatting".

Further, if the RHEL registry comes first we must ensure that before posting any content there, we also "landgrab" and leave empty the corresponding names on every other common public registry (thankfully, perhaps this is limited to just the other two mentioned).

Otherwise, the very act of RHEL posting a new container image can have the effect of breaking someone else's container which previously they expected to come from, say, docker.io.

The more I think about this, the only viable solution I can see at the moment is to deprecate short-names, delegating them to the historical docker hub usage, and insist/require fully qualified names on any container reference.

jasinner commented 4 years ago

Perhaps a good compromise is to remove quay.io from the default registry search order, and leave only Red Hat registries which don't allow the general public to upload images. That greatly reduces the chances of typosquatting on quay.io, and still matches the behaviour of Docker.

jasinner commented 4 years ago

Raised 2 RHEL hardening bugs to remove quay.io from the default registry search list.

github-actions[bot] commented 4 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 4 years ago

The quay.io has been disabled so closing this issue.

fatherlinux commented 4 years ago

So, I have to chime in on this ticket. We are enumerating tons of edge case scenarios. I don't disagree with any of these in "theory" but NONE of them have been proven in public.

The fact that there are download stats for the typo-squated land grabs does NOT prove that real people are getting the wrong image. This could be nothing more than scanners downloading and analyzing images. TONS of image scanning vendors do this as part of delivering value to their customers. This would actually be a WAY more likely explanation than people are downloading the wrong image without knowing it.

Stated another way, I have NEVER seen a single ticket opened where a customer has mistakenly gotten a Red Hat image instead of a Docker.io image, nor an instance where a customer has gotten a Docker.io image when they wanted a Red Hat one. This is further mitigated by the fact that we always recommend using FQDNs for images. There are so many more elegant ways of solving this problem than just removing quay.io and docker.io.

Finally, making this change is terrifying because it is inevitably going to break OpenShift and RHEL customers when it lands downstream. This is like a ticking time bomb.

mrguitar commented 4 years ago

This needs more review. I agree w/ the sentiment of the issue, but calling those two bugs "hardening" is, at best, a stretch.

fatherlinux commented 4 years ago

I should add though, in theory, I don't disagree with this problem and have ranted about short names for years. There are no solid rules for how they resolve like DNS. I have a lab that even demonstrates this problem 1.

That said, this breaks the sentiment of ABI/API compatibility in RHEL, as well as our policy for 6 years. I think targeting RHEL 9 for this change might be reasonable, but not RHEL 8. People need time to absorb inconveniences which this will cause. We also need to update docs to include instructions on how to add quay.io and docker.io back in if users want them.

dbaker-rh commented 4 years ago

Scott:

The fact that there are download stats for the typo-squated land grabs does NOT prove that real people are getting the wrong image.

This issue was created due to real people getting the wrong image.

bgeesaman commented 4 years ago

Can we prove that zero real users were affected? They were only squatted for ~90minutes. When the download stats started going up much quicker than others for certain images, I took them down.

From my perspective, the attack has been proven to be cost-effective, practical, and stealthy, and we’re fortunate that it’s not being actively exploited—mostly due to it not being widely known.

Should we discuss the edge cases and elegant solutions further?

mtrmac commented 4 years ago

Discussing whether users have been affected in the past is not productive; that’s a discussion that makes sense after an incident, when trying to assign blame/liability, in some contractual relationship I suppose.

As engineers on an Open Source project, it’s our responsibility to design software so that, going forward, users are not going to be put at risk. It’s been amply demonstrated that search behavior, and the way the feature is advertised or (possibly unknowingly) its use is propagated by copy&pasting from examples, clearly does put users at risk.

We don’t keep sharp knives laying on floors arguing that things are fine until someone actually comes forward and proves (with a high burden of proof, even) to have been cut; this is no different.

Sure, we might need to loudly tell users that short names will break and that they need to change their commands (NOT their configuration to revert to the insecure one).


This is also not the place to discuss RHEL concerns. This GitHub repository is not RHEL, and does not have a 6-year stability policy — while it does have at least some amount of responsibility not to put users at risk. I feel very strongly that the master branch of this repository, and any future release, must not put users at known and unnecessary risks.

RHEL concerns should be discussed in RHEL avenues — not just because this is a place for cross-vendor collaboration, but also because RH might have other options for keeping compatibility without putting users at risk, e.g. by putting a much tighter control on Quay.io repository naming to avoid squatting, that have nothing to do with this GitHub repository.

RHEL also has separate documentation/communication channels for users to advise users on this and how to make an appropriate stability/security/other mitigations decision.