microsoft / mindaro

Bridge to Kubernetes - for Visual Studio and Visual Studio Code
MIT License

Prevent mass ingress duplication / change cloned ingress hostname #269

Open scp-mb opened 2 years ago

scp-mb commented 2 years ago

We are currently moving to a development process in which we use Bridge to Kubernetes to generate temporary environments from a pull request context. However, most of our platform lives under a single namespace due to some other requirements.

When the routing manager spins up, it duplicates every single ingress within the namespace, prepending the name of the temporary environment to the URL as a new sub-subdomain. This causes issues because we use Let's Encrypt to provide SSL certificates for those ingresses: we end up requesting HUGE numbers of certificates and run into the weekly rate limits almost immediately.

So two questions arise.

1) Is there a way to prevent mass ingress duplication and duplicate only the ingress for the application being reviewed, without moving everything into separate namespaces?

2) Is it possible to change the hostname of the new ingress route? That would allow us to use a single wildcard certificate for both the regular and review applications, e.g. reviewapp.subdomain.company.com would become reviewapp-subdomain.company.com instead, allowing a single *.company.com wildcard cert to be used rather than a certificate with *.company.com, *.app1.company.com, *.app2.company.com etc.
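To make the second question concrete, here is a minimal Python sketch of the naming change being asked for. It is not how the routing manager works today; `flatten_review_host` and `covered_by_wildcard` are hypothetical helper names, and the wildcard check assumes the usual rule that `*` covers exactly one DNS label.

```python
def flatten_review_host(host: str, review_prefix: str) -> str:
    """Join the review prefix to the first label with '-' instead of adding a new label."""
    first_label, _, rest = host.partition(".")
    return f"{review_prefix}-{first_label}.{rest}"


def covered_by_wildcard(host: str, wildcard: str) -> bool:
    """Return True if `host` is covered by a single-label wildcard such as '*.company.com'."""
    if not wildcard.startswith("*."):
        return host == wildcard
    suffix = wildcard[1:]  # e.g. ".company.com"
    if not host.endswith(suffix):
        return False
    left = host[: -len(suffix)]
    # The wildcard may only stand in for one non-empty label.
    return left != "" and "." not in left


# Hostname shape generated today (prefix added as a new sub-subdomain).
current = "reviewapp.subdomain.company.com"
# Hostname shape proposed in question 2.
proposed = flatten_review_host("subdomain.company.com", "reviewapp")

print(proposed)
print(covered_by_wildcard(current, "*.company.com"))
print(covered_by_wildcard(proposed, "*.company.com"))
```

The flattened name stays one label deep under company.com, which is why a single wildcard certificate would be enough.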

pragyamehta commented 2 years ago

Hi @scp-mb, thanks for opening this issue. Currently, we do not have a way for the user to specify which ingresses to clone or not clone when using Bridge to Kubernetes. This sounds like a great feature to have, especially when HTTPS is enabled on the cluster. The second ask seems quite specific, so we would need to review it before committing to it. But if we support the first ask (specifying which ingresses to clone or not clone in KubernetesLocalProcessConfig.yaml), would that solve the problem you are facing and remove the need for the second ask?

scp-mb commented 2 years ago

Hey, thanks for getting back to me @pragyamehta

The first ask would go a long way towards helping with the problem, but it would not totally resolve it, as there would still need to be a certificate request for the cloned ingress. Still, requesting one is much better than requesting 20+ per review application that gets spun up. Without workarounds, such as using a certificate with a wildcard SAN for each application we want to review, we could still potentially hit the weekly limit of 50 certificates during normal development.
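As a rough worked example of the numbers above, the Python sketch below counts how many review applications fit under the weekly limit. It assumes one certificate request per cloned ingress, treats "20+" as exactly 20, and ignores certificates requested by normal deployments; the names are illustrative only.

```python
WEEKLY_CERT_LIMIT = 50       # weekly limit quoted above
INGRESSES_IN_NAMESPACE = 20  # "20+" above, taken as exactly 20 for illustration


def review_apps_before_limit(certs_per_review_app: int,
                             limit: int = WEEKLY_CERT_LIMIT) -> int:
    """Number of review apps that can be spun up in a week before the limit is hit."""
    return limit // certs_per_review_app


clone_everything = review_apps_before_limit(INGRESSES_IN_NAMESPACE)
clone_only_reviewed = review_apps_before_limit(1)

print(clone_everything)     # every ingress in the namespace is cloned
print(clone_only_reviewed)  # only the reviewed application's ingress is cloned
```

Under these assumptions, cloning everything exhausts the limit after two review applications, while cloning a single ingress allows fifty.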

The second ask however would prevent any certificate requests completely, as we could just use a single wildcard subdomain certificate for all of the review applications and the normally deployed applications too.

One thing to note, however, is that we aren't using KubernetesLocalProcessConfig.yaml, as we aren't running from Visual Studio (or another IDE). We have an Azure DevOps pipeline that can be triggered from a pull request; it builds a Docker image and deploys a Helm chart for the review application, which sets the routing.visualstudio.io/route-on-header pod annotation, allowing the routing manager to do its magic.

If I'm wrong and there is a way for KubernetesLocalProcessConfig.yaml to be used that way please let me know, as I've not found any documentation on how to do it.

scp-mb commented 2 years ago

Another potential use case for not duplicating all ingresses that we've run into: we use two ingresses per application and want to run our review applications under one of the domains but not the other.

For the moment we've issued a new certificate with all the required wildcard subdomains to get around the initial issue for the first domain. The issue remains, however, that the nginx ingress controller continually tries to validate the SSL certificate for the second domain in a loop; it will never find one, and this seems to make it go crazy with CPU usage.

scp-mb commented 2 years ago

@pragyamehta Another issue stemming from duplicating all ingresses...

Our Vue.js front end is tenanted with one subdomain per tenant, e.g. tenant1.dev.company.net. We have a wildcard *.dev.company.net SSL certificate for that, which works fine. Bridge to Kubernetes seems to be smart enough not to duplicate that when making a review app for the front end.

Since the ingress for that is not automatically added, we add another ingress as part of the deployment when deploying a review application, e.g. review-app-frontend.tenant1.dev.company.net with an SSL certificate for *.tenant1.dev.company.net. This also works fine.

However, if we then deploy another review application into the same namespace (e.g. a backend API), something appears to go haywire: it attempts to create an ingress for the front end (but from the back end app) with a host of review-app-backend.*.tenant1.dev.company.net. As you can imagine, a wildcard in the middle of an ingress host does not work at all.

A sanitized copy of the log entry from the routing manager is below:

2022-02-07T15:26:35.7382847Z | RoutingManager | ERROR | CreateNamespacedIngressAsync threw HttpOperationException: StatusCode='UnprocessableEntity', ReasonPhrase='Unprocessable Entity', Content='{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Ingress.extensions \"frontend-review-app\" is invalid: spec.tls[0].hosts[0]: Invalid value: \"review-app-backend.*.tenant1.dev.company.net\": a wildcard DNS-1123 subdomain must start with '*.', followed by a valid DNS subdomain, which must consist of lower case alphanumeric characters, '-' or '.' and end with an alphanumeric character (e.g. '*.example.com', regex used for validation is '\\*\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')","reason":"Invalid","details":{"name":"frontend-review-app","group":"extensions","kind":"Ingress","causes":[{"reason":"FieldValueInvalid","message":"Invalid value: \"review-app-backend.*.tenant1.dev.company.net\": a wildcard DNS-1123 subdomain must start with '*.', followed by a valid DNS subdomain, which must consist of lower case alphanumeric characters, '-' or '.' and end with an alphanumeric character (e.g. '*.example.com', regex used for validation is '\\*\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')","field":"spec.tls[0].hosts[0]"}]},"code":422}\n'
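The rejection in the log can be reproduced outside the cluster. The Python sketch below uses the validation regex quoted in the error message (with the JSON escaping removed) and a hypothetical `naive_prefix` helper that mimics what the routing manager presumably did, namely prepending the review name to an existing wildcard TLS host. The helper is an assumption for illustration, not the routing manager's actual code.

```python
import re

# Regex quoted in the API server error above, with JSON escaping removed.
WILDCARD_DNS1123 = re.compile(
    r"\*\.[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def is_valid_wildcard_host(host: str) -> bool:
    """Return True if the whole host matches the wildcard DNS-1123 pattern."""
    return WILDCARD_DNS1123.fullmatch(host) is not None


def naive_prefix(review_name: str, host: str) -> str:
    """Hypothetical: prepend the review name as a new label, as the cloned host suggests."""
    return f"{review_name}.{host}"


original = "*.tenant1.dev.company.net"
cloned = naive_prefix("review-app-backend", original)

print(cloned)
print(is_valid_wildcard_host(original))
print(is_valid_wildcard_host(cloned))
```

The original wildcard host passes, while the cloned one fails because the `*` is no longer the leading label, which matches the 422 response.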

amsoedal commented 2 years ago

Hi @scp-mb, thanks for giving us these extra details. I'll bring this up with the team today and will update with the results of our discussion.