CAPI controllers use the <cluster>-kubeconfig to connect to the workload cluster. This connection goes through the external load balancer, and this is not ideal in the case of self-hosted clusters given that we can use "in cluster" connectivity to avoid unnecessary network hops.
Is the assumption correct (across providers) that the client credentials stored in <cluster>-kubeconfig are also accepted by the apiserver when connecting via kubernetes.default.svc?
(Just asking because I'm not sure what kind of authn solutions folks have built with Kubernetes, especially for managed control planes.)
+1. Some clusters can't even talk directly to the external load balancer; Azure LBs of type internal can't be reached from the same node.
+1. Some clusters can't even talk directly to the external load balancer; Azure LBs of type internal can't be reached from the same node.
I think that should be fine. We would use kubernetes.default.svc as a server URL. This should work everywhere because otherwise our controllers wouldn't work (it's used by the in-cluster config we use for our mgmt cluster client).
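For illustration, a minimal sketch of what "use kubernetes.default.svc as a server URL" could look like: keep the client credentials from the <cluster>-kubeconfig secret but swap the host for the in-cluster Service address. The helper name and the "value" secret key are assumptions, this is not the actual CAPI implementation, and TLS SAN handling is glossed over; it relies on the authn assumption asked about above.

```go
package remote

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// inClusterClientFor is a hypothetical helper: it reuses the credentials from
// the <cluster>-kubeconfig secret but points the client at kubernetes.default.svc
// instead of the external load balancer.
func inClusterClientFor(ctx context.Context, mgmtClient client.Client, namespace, clusterName string) (kubernetes.Interface, error) {
	// Read the kubeconfig secret generated by the control plane provider.
	secret := &corev1.Secret{}
	key := client.ObjectKey{Namespace: namespace, Name: clusterName + "-kubeconfig"}
	if err := mgmtClient.Get(ctx, key, secret); err != nil {
		return nil, fmt.Errorf("reading kubeconfig secret: %w", err)
	}

	// Build a rest.Config with the stored client credentials...
	restCfg, err := clientcmd.RESTConfigFromKubeConfig(secret.Data["value"])
	if err != nil {
		return nil, err
	}

	// ...but override the server URL so traffic stays inside the cluster,
	// assuming the apiserver accepts the same credentials on this address.
	restCfg.Host = "https://kubernetes.default.svc"

	return kubernetes.NewForConfig(restCfg)
}
```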
I was specifically asking about the client credentials that we would take from the <cluster>-kubeconfig secret.
@sbueringer, yeah, I was speaking more generally to the raised issue. Identifying that the cluster is self-hosted would be important for a self-managing Azure cluster, for example, as it has that internal LB restriction.
We should start collecting possible ways to identify a cluster as being a self-managed one (management+workload+self).
/priority important-soon
Feels like there is a chicken-and-egg problem here... but if it's just an optimization, then the controller could check for its own pod in the "remote" cluster. Then it would know it's within the same cluster. If it has to be established before you can talk to the "remote" cluster, maybe it's better left as a driver-specific implementation?
Some context: workload cluster clients are created via the ClusterCacheTracker. The idea is to extend the ClusterCacheTracker to automatically detect if the workload cluster is the current cluster the controller is running on and then generate the Client accordingly. Just to start with some ideas on how to identify that:
+1 this would be a good improvement
I'll take a look at your POC @sbueringer, thanks for starting it
Another alternative idea would be potentially to check if the remote cluster has a Lease that matches the leader election information we have for the controller?
Another alternative idea would be potentially to check if the remote cluster has a Lease that matches the leader election information we have for the controller?
As far as I can tell:
* we have to pass the LeaderElectionID into the ClusterCacheTracker
* we have to get the LeaderElectionNamespace, either by defaulting it the same way as CR does or by calculating it in main.go, and hand it over to the Manager & ClusterCacheTracker

With that information we are able to retrieve the lease.
Lease looks ~ like this:
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: controller-leader-election-capi
  namespace: capi-system
spec:
  acquireTime: "2022-07-08T05:29:52.759035Z"
  holderIdentity: sbuerin_a4cbb424-9d8c-469b-adc0-e017ad05047f
  leaseDurationSeconds: 15
  leaseTransitions: 0
  renewTime: "2022-07-08T05:30:32.871202Z"
The holderIdentity is calculated in NewResourceLock, roughly via "hostname_" + uuid.NewUUID().
It is not accessible for us after it has been calculated, nor can it be set via NewManager today. I think this requires a change in CR, and the most realistic one is to add a new field to ctrl.Options called LeaderElectionIdentity that we can calculate in main.go and then pass into the Manager and the ClusterCacheTracker.
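To make the Lease idea concrete, here is a rough sketch assuming the controller can somehow learn its own leader election identity (e.g. via the proposed LeaderElectionIdentity option); the function name and wiring are illustrative, not existing CAPI code:

```go
package remote

import (
	"context"

	coordinationv1 "k8s.io/api/coordination/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// isSelfHostedViaLease returns true if the leader election Lease in the
// "remote" cluster is currently held by this controller instance.
func isSelfHostedViaLease(ctx context.Context, remoteClient client.Client, leaseNamespace, leaseName, ourIdentity string) (bool, error) {
	lease := &coordinationv1.Lease{}
	key := client.ObjectKey{Namespace: leaseNamespace, Name: leaseName}
	if err := remoteClient.Get(ctx, key, lease); err != nil {
		if apierrors.IsNotFound(err) {
			// No such Lease: either the remote cluster is not the cluster we
			// run in, or leader election is disabled and this check can't help.
			return false, nil
		}
		return false, err
	}
	// Self-hosted if the Lease is held by our own leader election identity.
	return lease.Spec.HolderIdentity != nil && *lease.Spec.HolderIdentity == ourIdentity, nil
}
```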
I think this should work, but of course it will only work if leader election is enabled. We already have cases where we disable it for debugging purposes, like Tilt (otherwise the controller crashes if we wait too long at a breakpoint during debugging), so it could make it harder to live-debug this case.
Checking if the current pod exists in the "workload cluster" seems simpler to me, mostly because it doesn't introduce a dependency on leader election.
@sbueringer should we proceed with the solution proposed above in v1.2? We could start by having it behind a feature gate.
Given that it mitigates the issue(s) mentioned above it sounds fine to me to include it in >=v1.2.1 behind a feature gate.
I'm +1 to checking if the current pod exists in the "workload cluster" because it is simple and easily back-portable. Then we can eventually consider a more complex/elegant solution in a follow-up iteration.
TBH, I don't think we need a feature flag for this behavior because, if I got the WIP right, when one of POD_NAMESPACE, POD_NAME, POD_UID is empty (or we drop those variables), it reverts to the old behaviour.
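For reference, a sketch of what such a pod-existence check could look like, with POD_NAMESPACE/POD_NAME/POD_UID injected via the downward API; the helper name and exact fallback semantics are assumptions based on the discussion above, not the actual WIP code:

```go
package remote

import (
	"context"
	"os"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// runningOnWorkloadCluster returns true if the pod this controller runs in
// (identified via POD_NAMESPACE/POD_NAME/POD_UID) exists in the cluster
// reachable through workloadClient, i.e. the cluster is self-hosted.
func runningOnWorkloadCluster(ctx context.Context, workloadClient client.Client) (bool, error) {
	podNamespace := os.Getenv("POD_NAMESPACE")
	podName := os.Getenv("POD_NAME")
	podUID := os.Getenv("POD_UID")
	if podNamespace == "" || podName == "" || podUID == "" {
		// Env vars not set: skip the check and fall back to the old behaviour.
		return false, nil
	}

	pod := &corev1.Pod{}
	key := client.ObjectKey{Namespace: podNamespace, Name: podName}
	if err := workloadClient.Get(ctx, key, pod); err != nil {
		if apierrors.IsNotFound(err) {
			return false, nil
		}
		return false, err
	}
	// Compare the UID to guard against a same-named pod in a different cluster.
	return pod.UID == types.UID(podUID), nil
}
```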
/assign @sbueringer
CAPI controllers use the <cluster>-kubeconfig to connect to the workload cluster to inspect nodes, etcd, etc. This connection goes through the external load balancer, and this is not ideal in the case of self-hosted clusters given that we can use "in cluster" connectivity to avoid unnecessary network hops.
I don't think it's really implicit that it uses the external load balancer; it rather uses whatever the known secret might contain, which is dictated by any given control plane implementation and is then consumed by the controllers via a "weak" contract based on a name assumption. Makes sense?
Instead of coding in terms of self-hosted vs. not, as in this PR, we could angle this in a way that provides the appropriate contractual primitives to enable consumers to expand their use cases. I can think of several personas that would be consumers of different kubeconfigs: service-network-kubeconfig (internal components), localhost-kubeconfig (kube-apiserver sidecars), admin-kubeconfig (external cluster consumers).
It is up to the CP implementation to generate those, and that's the contract for consumers, e.g. workload cluster controllers (machine controller) can use .status.serviceKubeconfig.service-network-kubeconfig/admin-kubeconfig.
Thoughts?
@enxebre your suggestion seems like a good improvement, but I don't think I understand how it would solve the bug described in the original issue. The proposed fix PR only changes the KubeadmControlPlane controller, not the cluster controller, so that KCP preflight checks don't fail when the generated kubeconfig has an internal LB apiserver address and the KCP controller itself is running on one of the control plane pods. Are you saying that KCP would need to change the kubeconfig it generates for this particular case?
It is up to the CP implementation to generate those, and that's the contract for consumers, e.g. workload cluster controllers (machine controller)
I guess I'm confused by this statement, since in the case of the problem we're trying to solve, the CP implementation both generates and uses the kubeconfig to connect to the workload cluster.
so that KCP preflight checks don't fail when the generated kubeconfig has an internal LB apiserver address and the KCP controller itself is running on one of the control plane pods.
Ah I see, I missed that. Makes sense. Thanks @CecileRobertMichon
Are you saying that KCP would need to change the kubeconfig it generates for this particular case?
Yeah, kind of. I'm suggesting that long term we might want to think about a model/contract that lets multiple kubeconfigs scoped to different purposes be created and then consumed as they best fit. But you're right, for this particular case it doesn't seem to help much.
FWIW the current PR LGTM, I have no objections.
Yeah, kind of. I'm suggesting that long term we might want to think about a model/contract that lets multiple kubeconfigs scoped to different purposes be created and then consumed as they best fit. But you're right, for this particular case it doesn't seem to help much.
@enxebre Makes sense. I think we can continue this part of the discussion as part of https://github.com/kubernetes-sigs/cluster-api/issues/3661?
User Story
As a developer, I would like to have a way to detect if a Cluster is self-hosted so I can use this info to improve connectivity from CAPI controllers
Detailed Description
CAPI controllers use the <cluster>-kubeconfig to connect to the workload cluster to inspect nodes, etcd, etc. This connection goes through the external load balancer, and this is not ideal in the case of self-hosted clusters given that we can use "in cluster" connectivity to avoid unnecessary network hops.
However, from a CAPI code PoV there is no well-defined way to detect "this cluster is a management cluster" or "this cluster is a self-hosted cluster".
This issue is about discussing options to provide this capability.
Anything else you would like to add:
This could be relevant for a couple of issues
/kind feature