teemow opened 2 years ago
@alex-dabija this came up in a call with @mnitchev. I guess it is a bit too late to change the GCP plans here. But still I wanted to write this down to discuss this for the future. It is not urgent if there is no current pain.
These operators have two responsibilities at the moment:

`external-dns`, but because it's just one record we decided to just do it in each operator for now. We agreed in the KaaS Sync that the DNS needs to be redone, but we decided to keep things simple for now and to have the separate operators. It's one of the leftovers from the CAPI Hive Sprint.
Yes, I agree at some point we need to clean up our DNS configuration.
This includes the delegation from the customer's hosted zone to the workload cluster's hosted zone. This still needs to be implemented. It can be a single controller supporting multiple DNS providers or a controller per DNS provider.
Alright I forgot about the hosted zone. It is not the customer hosted zone but the hosted zone that we use in the management cluster and delegate it from there. Afaik we are mainly doing this to allow access to the zone within the workload cluster account. So this has security implications.
Good question how this will work if we decide to create a flat DNS hierarchy. At least with the new approach we would lose the guarantee that each management cluster and each workload cluster is scoped, i.e. that a hacked management or workload cluster can't change the DNS of another management or workload cluster.
My understanding of DNS flat structure is slightly different than the proposal in the RFC: https://github.com/giantswarm/rfc/pull/32/files#r849160334 .
Personally, I'm in favor of having a hosted zone per customer and cloud provider in order to avoid the complications of doing cross cloud DNS. This would still allow us to promote a workload cluster to a management cluster for the same customer on the same cloud provider. It's also aligned with our current approach and we can still use UUIDs within the customer's hosted zone.
I just realized that the current implementation is not aligned with the flat DNS structure that I was talking about (my bad) and we could probably simplify the DNS configuration if we have a hosted zone per customer and cloud provider, because:
`external-dns` to configure the kube-api DNS record.

I do think we need a hosted zone per workload cluster, or how would you scope the permissions on the workload cluster side? A hacked workload cluster should not be able to change any DNS of another workload cluster or the management cluster.
> I do think we need a hosted zone per workload cluster or how would you scope the permissions on the workload cluster side? A hacked workload cluster should not be able to change any DNS of another workload cluster or the management cluster.
Yes, you're right. I was only thinking from the point of view of simplifying things.
@giantswarm/team-cabbage we need to think about this as whole also from the ingress wildcard perspective. @bavarianbidi started a thread in slack: https://gigantic.slack.com/archives/C01F7T2MNRL/p1676300937018959
We in Cabbage will take this into refinement and will come up with a basic idea of how to align the DNS setup.
Gist from the Slack thread: deprecate the custom operators and replace them with `external-dns`.

- `external-dns` is not able to set up zones, and its role only has access to its cluster's subdomain.
- In order to create a wildcard record, an `external-dns` CR needs to be created.
- One possibility would be to include an `external-dns` CR in the ingress controllers' Helm charts.
- @Gacko will do some more additional discovery.
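Such a CR could look roughly like the sketch below. This assumes external-dns runs with its CRD source enabled (`--source=crd`); the record name, TTL and target are placeholders, not the actual Giant Swarm configuration.

```yaml
# Sketch: wildcard record expressed as an external-dns DNSEndpoint CR.
# Hostnames and TTL are placeholders.
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: ingress-wildcard
  namespace: kube-system
spec:
  endpoints:
    - dnsName: "*.mycluster.test.gigantic.io"
      recordType: CNAME
      recordTTL: 300
      targets:
        - ingress.mycluster.test.gigantic.io
```

Shipping such a manifest with the ingress controller's Helm chart would tie the wildcard record's lifecycle to the ingress controller installation.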
Is the API server `LoadBalancer` created by an operator? Because I feel like `external-dns` doesn't need the API server `LoadBalancer`, but other components of a cluster might, so the cluster won't come up without the DNS record, and then `external-dns` might also not come up.

So in the end all we can do is move the wildcard ingress DNS record to either an `external-dns` CR or maybe the ingress `LoadBalancer` service annotation (both need to be tested). This would result in the wildcard ingress DNS record getting created upon ingress controller installation instead of beforehand / at cluster rollout.
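For the service annotation variant, a minimal sketch using external-dns's standard `external-dns.alpha.kubernetes.io/hostname` annotation (service name, selector and domain are placeholders):

```yaml
# Sketch: wildcard record driven by an annotation on the ingress controller's
# Service. external-dns creates the record once the LoadBalancer IP is assigned.
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-controller-app
  namespace: kube-system
  annotations:
    external-dns.alpha.kubernetes.io/hostname: "*.mycluster.test.gigantic.io"
spec:
  type: LoadBalancer
  selector:
    app: nginx-ingress-controller
  ports:
    - name: https
      port: 443
      targetPort: 443
```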
`opsctl` (scope: cluster lifecycle)

For a smooth integration in `opsctl` we need the following DNS records per workload cluster (in that case, management cluster = workload cluster):

- `bastion1.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` (e.g. `bastion1.d5fg6.test.gigantic.io`) pointing to a routable bastion host IP from our VPN hosts, needed to make `opsctl ssh` work
- `api.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` (e.g. `api.d5fg6.test.gigantic.io`) pointing to a routable k8s API IP from our VPN hosts, needed to make `opsctl login --method=clientcert` work; `opsctl login --method=sso` / `opsctl login` requires the advanced AuthN/AuthZ setup from below

Note: CAPI doesn't depend on the Giant Swarm specific DNS record.
For the Kubernetes API OIDC integration, `dex` is running within the management cluster. For that the k8s API server is configured with `https://dex.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` (e.g. `dex.d5fg6.test.gigantic.io`).

As a wildcard `CNAME` record got created which points to `ingress.<myClusterName>.<someKindOfBaseDomain>.gigantic.io`, and `ingress.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` is an `A` record which points to the `LoadBalancer` IP of `svc/nginx-ingress-controller-app` in `namespace/kube-system`, requests to `dex.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` will appear on the `nginx-ingress-controller`.
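The record chain described above, expressed as an external-dns `DNSEndpoint` purely for illustration (the cluster name `d5fg6` and the IP are placeholders):

```yaml
# Sketch of the existing chain:
#   ingress.d5fg6.test.gigantic.io  A     -> LoadBalancer IP of svc/nginx-ingress-controller-app
#   *.d5fg6.test.gigantic.io        CNAME -> ingress.d5fg6.test.gigantic.io
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: cluster-ingress-records
  namespace: kube-system
spec:
  endpoints:
    - dnsName: ingress.d5fg6.test.gigantic.io
      recordType: A
      targets:
        - 203.0.113.10   # placeholder for the LoadBalancer IP
    - dnsName: "*.d5fg6.test.gigantic.io"
      recordType: CNAME
      targets:
        - ingress.d5fg6.test.gigantic.io
```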
For `dex` and `login`, `ingress` objects already exist.
To make the web UIs of `prometheus`, `grafana` and `alertmanager` accessible, `prometheus.<myClusterName>.<someKindOfBaseDomain>.gigantic.io`, `grafana.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` and `alertmanager.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` must be resolvable. This is currently done by having the wildcard `CNAME` in place; requests are handled by `nginx-ingress-controller` as `ingress` objects for these three targets exist.
To make the `happa` web UI and `happa` API accessible, `happa.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` and `happaapi.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` must be resolvable. This is currently done by having the wildcard `CNAME` in place; requests are handled by `nginx-ingress-controller` as `ingress` objects for these targets exist.

The same applies for `athena.<myClusterName>.<someKindOfBaseDomain>.gigantic.io` as well (not 100% sure what it's used for).
All `dns-operators` are applied on the management cluster and reconcile the `cluster` CRs and the `infracluster` specific CRs (some of them are doing more advanced stuff, but in general the `cluster` and `infracluster` CRs are common).
`dns-operator-azure`

- creates the DNS zone (e.g. `mycluster.azuretest.gigantic.io` in resource group `mycluster`)
- takes care of the zone delegation (e.g. for `mycluster.azuretest.gigantic.io`)
- creates records within the cluster zone (e.g. in `mycluster.azuretest.gigantic.io`)
`dns-operator-route53` (used in `CAPVCD`)

- creates the DNS zone in `route53` (e.g. `mycluster.test.gigantic.io`)
- uses `route53` for zone delegation
- creates the `api` `A` record (e.g. `api.mycluster.test.gigantic.io`)
- creates the `bastion1` `A` record (e.g. `bastion1.mycluster.test.gigantic.io`)
- creates the `ingress` `A` record and the wildcard `CNAME` if a `svc/nginx-ingress-controller` with type `LoadBalancer` exists in `namespace/kube-system`

`dns-operator-aws`

- to be documented, but I guess the feature set is more or less the same as in `dns-operator-route53`

`dns-operator-gcp`

- to be documented, but I guess the feature set is more or less the same as in `dns-operator-route53`
The idea is having `external-dns` and `dns-operator-*` in combination.

`dns-operator-*`

- `dns-operator` takes care of the DNS zone creation per cluster
- `dns-operator` also takes care of the zone delegation to make the entire DNS chain work
- `dns-operator` creates `A` records for `api` and `bastion1`

`external-dns`
As all accessible endpoints are already defined with a valid FQDN in the corresponding `ingress` object, and for some of them we use some kind of ACME service to get certificates, we could use `external-dns` to create `A` records by reconciling on an `ingress` basis. This is currently possible as `external-dns` gets the external IP from the `ingress.status` field and creates an `A` record for the defined `ingress.spec.rules.[].host`.
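For illustration, a minimal `Ingress` of the kind external-dns can reconcile this way (name, namespace and host are placeholders): the hostname comes from `spec.rules[].host`, the record target from the controller-populated `status.loadBalancer`.

```yaml
# Placeholder Ingress: external-dns would create an A record for
# dex.d5fg6.test.gigantic.io pointing at the IP in status.loadBalancer.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dex
  namespace: giantswarm   # placeholder namespace
spec:
  ingressClassName: nginx
  rules:
    - host: dex.d5fg6.test.gigantic.io
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dex
                port:
                  number: 443
```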
As there are currently some discussions about defining desired DNS records via an additional CRD for `external-dns`, it could be possible in the future to have a small controller which just creates these new (not yet existing) CRs by reconciling CAPI CRs, while `external-dns` takes care of the DNS record creation. The only missing part might be the DNS zone creation and DNS zone delegation.
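A sketch of how the external-dns side of this split could be scoped, assuming standard external-dns flags (provider, image tag and domain are placeholders; zone creation and delegation would stay with `dns-operator-*`):

```yaml
# Hypothetical deployment: external-dns only manages records inside the
# cluster's own zone (--domain-filter) and tracks ownership via TXT records,
# while dns-operator-* keeps handling zone creation and delegation.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
        - name: external-dns
          image: registry.k8s.io/external-dns/external-dns:v0.14.0
          args:
            - --source=ingress
            - --source=service
            - --provider=aws                               # placeholder provider
            - --domain-filter=mycluster.test.gigantic.io   # only this cluster's zone
            - --registry=txt
            - --txt-owner-id=mycluster
```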
After a short discussion with @Gacko we came to the following conclusion:

- `dns-operator*` for the cluster specific zone creation
- `dns-operator*` for the zone delegation
- `dns-operator*` for the `A` record creation of `api` and `bastion1`
- `external-dns` in `CAP*` management clusters to create `A` records for all ingresses/services on top of the management cluster

TODOs:

- configure `external-dns` to reconcile `ingress` objects in all namespaces for `CAP*` based clusters
- add the `giantswarm.io/external-dns` annotation to MC relevant `ingress` objects (`happaapi`, `happa`, `athena`, `dex`, `login`, `prometheus`, `grafana`, `alertmanager`)
- remove the `ingress` `A` record + wildcard `CNAME` handling from the CAPI based `dns-operators`
- prepare `external-dns` to make a default deployment with an opt-in wildcard record work (mainly because our exposing-workload example implies a wildcard `CNAME` which points to the cluster specific ingress)

cc: @alex-dabija / @teemow
Regarding your last point:
> prepare external-dns to make a default deployment with an opt-in wildcard record work (mainly because our exposing-workload example implies a wildcard CNAME which points to the cluster specific ingress)
I don't think this is part of `external-dns` but rather the specific ingress controller. We're setting an annotation on the `Service` object of the installed ingress controller containing the desired FQDN, which currently is `ingress.CLUSTER_DOMAIN`. `external-dns` reconciles this and creates the according DNS entry.
In the future we would need to set `*.CLUSTER_DOMAIN` instead to make the simple "out-of-the-box" / "catch-all" solution work.
You possibly mean the same, but I wasn't sure. :)
> In the future we would need to set *.CLUSTER_DOMAIN instead to make the simple "out-of-the-box" / "catch-all" solution work.
Independent of the implementation, something has to be done to make the simple "out-of-the-box" / "catch-all" solution work. So yes, I mean the same, I just wanted to make sure we don't lose track of the "out-of-the-box-for-hello-world-demo-stuff" feature.
In the current `pre-alpha` version of a CAPZ based management cluster we've got `external-dns` up and running.

All required DNS entries for components running "on top of Kubernetes" (no bastion host and API server record) got created by adding the required `external-dns` annotation on all relevant `ingress` objects.

- `external-dns` change: https://github.com/giantswarm/external-dns-app/pull/240
- add `external-dns` annotation to relevant `ingress` objects: `config` repo: https://github.com/giantswarm/config/pull/1605

@alex-dabija / @teemow how should we proceed with other CAPI provider implementations? cc @cornelius-keller / @gawertm
I am not sure moving away from the wildcard record is the best: customers like adidas on deu01 have close to 4000 records, where each of them would now become a new DNS entry, and I'm not sure `external-dns` can handle this many entries.
They probably could still create this wildcard record via an ExternalDNS `Endpoint` CR, but yeah, you're right. I'd at least remove the creation of the `ingress.<domain>` record from the `dns-operator-*`s.
@Gacko the `ingress.<domain>` record is created by `external-dns`.
We believe it is important to keep the wildcard record; we will discuss this in the KaaS sync.
@whites11 Yes, but it is also created by `dns-operator-route53`. That's why I filed a PR to remove it there.
Sorry for being late for the party. I noticed @Gacko's PR for dns-operator-route53 and I want to understand what we want to do.
CAPV and CAPVCD status

`dns-operator-route53` creates `ingress.<cluster-name>.<base-domain>` for all clusters (the MC itself, and WCs if nginx-ingress exists in the WC). We configure it once via the config repository in the mc-bootstrap repo.
My understanding about what we want (based on the Azure repositories)

`external-dns` should be part of `default-apps-<provider>`, so we deploy it to every cluster. For each cluster, the user has to provide credentials for `external-dns`. `external-dns` will be responsible for only `ingress.<cluster-name>.<base-domain>`. The remaining ones (api, wildcard, etc.) will be created by `dns-operator-<whatever>`.

Is my understanding right?
Yes, your understanding is right. Additionally, `external-dns` can be used to reconcile more services and/or different FQDNs than just `ingress.<cluster-name>.<base-domain>`, e.g. `ingress-internal.<cluster-name>.<base-domain>` or also hostnames in `Ingress` resources.

At best even the wildcard would be dropped from the operator and customers would instead set single hostnames via their ingresses, but that's what @paurosello pointed out.
FYI @JosephSalisbury
Is there a clear decision about this? As far as I recall, we have discussed it in KaaS sync but couldn't arrive at any conclusion.
> Is there a clear decision about this? As far as I recall, we have discussed it in KaaS sync but couldn't arrive at any conclusion.
No, there's no decision on this yet. In KaaS Sync we agreed to have a followup meeting to figure out where we are and what are the next steps. I just scheduled the meeting for next Monday at 13:30 CET.
Notes from our meeting:
Sounds good to me, removing Rocket for now to keep the backlog clean. Once it's picked up by Turtles again, happy to take some alignment tasks to Rocket as well.
At the moment we've created separate (copied) DNS operators for each and every provider. We reconcile the CAPI cluster information and then add the corresponding DNS entries in Route 53, Cloudflare, etc.

So I was wondering how we can make this translation thinner. Afaik with `external-dns` you can define DNS entries with annotations for services and ingresses. We can define services and ingresses for all components that we need to create DNS names for. Can't we?

Would it make sense to use `external-dns` as an abstraction of the DNS provider? This way we would also get the flexibility to have more options for DNS configuration.

See: https://github.com/kubernetes-sigs/external-dns#the-latest-release