OPA might also be a way to solve this. If something restricts only allowing a single root to exist within the cluster, there wouldn't be any changes to make to Contour.
Wow, good pointer! There is, in fact, an example in their documentation which covers exactly this policy.
Would it make sense then for Contour to document an equivalent policy for the HTTPProxy resource?
Ok, so after looking at this a bit, and a pointer from @cppforlife, I'm concerned that this usage of the OPA Admission Controller (as documented in my previous comment) would suffer from a race condition, in which two conflicting ingress resources that are submitted simultaneously could both be accepted.
The OPA maintainers seem to acknowledge this risk in a brief slack chat we had. I opened a PR to the OPA docs to call this out.
I don't know how likely this issue is to show up in practice.
I don't know how likely this issue is to show up in practice.
In a game vs nature, it will show up rarely. In a game against motivated attackers, it will show up pretty quickly.
Additional info from external email thread:
In OPA Slack, I (@rosenhouse) asked:
I'm looking at the docs where there's an example about preventing two Ingress resources from having the same hostname... ... It seems that this depends on the OPA agent maintaining a fully-consistent, global view of all Ingress resources across the cluster, and reading from that view synchronously when enforcing the policy. What happens if two resources with the same hostname are submitted to the apiserver simultaneously? Is the OPA policy agent a singleton that serializes the requests somehow?
Patrick East replied:
OPA will process requests in parallel. The data it uses to evaluate is, as @Ash mentioned, eventually consistent. So in the case you outlined I believe it could allow in the second ingress that is invalid. I'd be curious to know if gatekeeper has similar issues with policies that depend on external data.
So the race happens when 2 Ingress objects are posted and OPA evaluates them on different threads. Neither object is visible to the other's evaluation, so both are accepted. To me, this seems like a congenital problem for admission controllers that can only be solved by serializing the evaluation (and accepting the associated drawbacks).
seems like a congenital problem for admission controllers
for admission controllers that rely on global state. For basic controllers that are simply validating contents of the single resource being admitted, it shouldn't be a problem.
that can only be solved by serializing the evaluation
yes, and by that singleton maintaining its own cache of the global state, or by fetching all the global state on every request it is validating. Maintaining the cache sounds better for performance. But deletes don't go through admission controllers, so it needs to be able to sync and invalidate in the background. Ugh.
I think there's a tension between wanting to ensure that people can't clobber other people's config (which was one of the design pillars for HTTPProxy), and wanting the system to be entirely self-service.
When we designed HTTPProxy, the thing we were worried about was, effectively, security. Security of the TLS keypair for a domain name, which we assumed would need to be controlled to some extent (the TLSDelegation resource), and security of the domain name itself, to prevent accidental configuration clobbering.
One of the problems with the Ingress object is about ordering - it's not really defined which order an ingress controller will process resources, so you can get odd behaviour when there is overlapping configuration.
Part of HTTPProxy's design was about avoiding that, which is why we ended up with a difference between the root HTTPProxy and child HTTPProxys, and the --root-namespaces flag.
I think that the problem is solvable, but needs careful thought because of the distributed, eventually-consistent nature of the Kubernetes API. Implementing first-writer-wins semantics for FQDNs will require, effectively, a distributed lock on that domain name to avoid the problems listed above.
It's possible that could be done with an annotation (as in leader election), or by using the new Lease object if it's available (only since 1.16, I think). However, that violates one of the internal rules we have when building Contour's data model about not changing the Kubernetes cache. So, again, careful thought required.
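To make the Lease idea concrete, here is a rough sketch of how a per-FQDN claim could be represented; the namespace and holder identity below are hypothetical, and this only illustrates the mechanism, not anything Contour implements. Because object names are unique within a namespace, the apiserver itself rejects the second create (AlreadyExists), which provides the atomic first-writer-wins step; cleanup on delete and the cache-mutation concern above remain the hard parts.

```yaml
# Hypothetical sketch: one Lease per claimed FQDN. A second "kubectl create"
# (or client-go Create) for the same name fails with AlreadyExists, so the
# apiserver provides the atomic "first writer wins" step.
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: inventory.example.com               # the FQDN doubles as the lock name
  namespace: projectcontour                 # hypothetical namespace for claims
spec:
  holderIdentity: team-inventory/inventory  # namespace/name of the claiming HTTPProxy
```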
To echo what Nick wrote, the design of the --root-namespaces feature is intended to give an administrator control over which HTTPProxy document gloms onto which vhost.
The pattern whereby contour is restricted to looking for roots in a specific set of namespaces separates the properties of the vhost (which is really just hostname and tls details) from route and tcpproxy parameters: someone (I'll come back to this) places a stub httpproxy document in the root namespace and delegates all routes to a document in a different namespace. This gives contour a property which k8s cannot guarantee by itself: that a document at a given name/namespace has sole ownership of the routes for a vhost.
Rather than OPA or writing an admission controller, I think it would be more straightforward, given that in your environment you know the ownership of each vhost, to write a pair of httpproxy records (sketched below): one in the root namespace delegating to the user's namespace, the other in the user's namespace. Perhaps the second is optional; you might just tell users that they need to create an HTTPProxy document at a specific name/namespace and it will be connected to a specific vhost.
Thanks
Dave
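For concreteness, a minimal sketch of the pair of documents described above. The names are hypothetical: a root namespace projectcontour-roots (listed in --root-namespaces), a user namespace team-inventory, and a backend service inventory-svc.

```yaml
# Root HTTPProxy: owned by the operator, lives in a root namespace,
# claims the fqdn, and delegates all routes to the user's namespace.
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: inventory-root
  namespace: projectcontour-roots    # hypothetical entry in --root-namespaces
spec:
  virtualhost:
    fqdn: inventory.example.com
  includes:
    - name: inventory                # the child document below
      namespace: team-inventory
      conditions:
        - prefix: /
---
# Child HTTPProxy: owned by the namespace admin. It has no virtualhost,
# so it only receives traffic via the root's include.
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: inventory
  namespace: team-inventory
spec:
  routes:
    - services:
        - name: inventory-svc        # hypothetical backend service
          port: 80
```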
given that in your environment you know the ownership of each vhost
But I don't know.
The cluster operator doesn't know, and doesn't want to be bothered with the details of which vhosts are claimed by which namespace users. They just want to set up the shared ingress proxy once, and get out of the way.
They just want to set up the shared ingress proxy once, and get out of the way.
Well, if they want to set up ingress as a global resource and let their users fight it out in a cluster, maybe that'll work, maybe it won't. What is clear in the market is that k8s clusters are getting smaller, so maybe this approach will work. If the cluster operator is trying to offer a k8s cluster as a shared heterogeneous resource, then probably some extra tooling on top of contour will be required.
problem statement
As a cluster operator, I want to enable namespace admins to "claim" a root hostname for themselves in a fully self-service way, without my involvement. I also want to prevent them from creating conflicting claims. Instead, I'd prefer that the first HTTPProxy to "claim" a particular hostname "wins" and no others may be created for that root hostname.
Earlier slack convo about this
current behavior
If an HTTPProxy has a root fqdn of inventory.example.com and someone then writes a different HTTPProxy for the same fqdn, it breaks both. If --root-namespaces is used to restrict the creation of root HTTPProxy resources, then namespace admins can't set up their own routes anymore. That prevents the desired "self-service" workflow.
possible solution
Admission controller?