kubevirt / cloud-provider-kubevirt

Kubernetes cloud-provider for KubeVirt
Apache License 2.0
73 stars 34 forks source link

kustomize: Add isolated overlay #141

Closed qinqon closed 2 years ago

qinqon commented 2 years ago

The default kustomize overlay at config/overlay/default needs to populate a RoleBinding under kube-system and a ClusterRole and ClusterRoleBinding this mean that some resources have to be populated out of the tenant namespace on infra cluster.

This change add another kustomize overlay config/overlay/isolated that prevent that by disabling external apiserver authentication (/metrics and /configz endpoint will not be valid) and disabling zone and region and tenant nodes.

Also the config/overlay/kubevirtci use as a base config/overlay/isolated to run CI with it.

qinqon commented 2 years ago

@mfranczy Looks like this is the issue, do you know if this can open another can of worms

rhrazdil commented 2 years ago

/lgtm

mfranczy commented 2 years ago

/hold

mfranczy commented 2 years ago

@davidvossel @qinqon @rhrazdil after reading this discussion https://github.com/kubernetes-sigs/cluster-api-provider-kubevirt/pull/168#discussion_r936015164 I think I get your problem better.

The ccmOptions.Authentication.SkipInClusterLookup = true change will not solve your problem. When you run KubeVirt CCM you should pass a kubeconfig (--kubeconfig arg) that points to tenant cluster. Meaning all ccmOptions should influence app behaviour in context of tenat cluster resources, not the infra, and the lookup for config map should take place on the tenant cluster too.

The CCM itself doesn't have any requirement or need to read config maps from the infrastructure cluster. If you see this happening then we must have a bug or cloud-provider pkg itself.

From infrastructure cluster we have to have a possibility to create/get/list/delete/watch services, endpoints, get/list vms and update vms labels (it should be a permission to take those actions in a cluster dedicated namespace).

At Kubermatic once we run KubeVirt CCM we pass a kubeconfig to tenant cluster explicitly and kubeconfig to infra cluster is passed over the kubeconfig field of cloud config (since KubeVirt in that context is a cloud provider). Next, the infra kubeconfig is used to create a controller-runtime client to act on resources mentioned above.

mfranczy commented 2 years ago

Cloud controller manager runs API server so it requires access to extension-apiserver-authentication configmap to verify client requests. I assumed that setting --kubeconfig that points to tenant cluster is enough. I was wrong. After debugging session with @qinqon we learned that we have to set as well --authorization-kubeconfig and --authentication-kubeconfig.

If these two parameters are not set, rest.InClusterConfig will be used to build the client for the API server. So, if you run KubeVirt CCM on infra cluster without these parameters pointing to the tenant cluster, it will try to get the extension authentication config map on the infra cluster.

One of the solution to solve this problem is to set those arguments for KubeVirt CCM deployment

--kubeconfig=<path-to-tenant-kubeconfig>
--authorization-kubeconfig=<path-to-tenant-kubeconfig>
--authentication-kubeconfig=<path-to-tenant-kubeconfig>
qinqon commented 2 years ago

Cloud controller manager runs API server so it requires access to extension-apiserver-authentication configmap to verify client requests. I assumed that setting --kubeconfig that points to tenant cluster is enough. I was wrong. After debugging session with @qinqon we learned that we have to set as well --authorization-kubeconfig and --authentication-kubeconfig.

If these two parameters are not set, rest.InClusterConfig will be used to build the client for the API server. So, if you run KubeVirt CCM on infra cluster without these parameters pointing to the tenant cluster, it will try to get the extension authentication config map on the infra cluster.

One of the solution to solve this problem is to set those arguments for KubeVirt CCM deployment

--kubeconfig=<path-to-tenant-kubeconfig>
--authorization-kubeconfig=<path-to-tenant-kubeconfig>
--authentication-kubeconfig=<path-to-tenant-kubeconfig>

Have rework the PR with what we found so we don't have to "disable" features.

mfranczy commented 2 years ago

Before we merge that PR I would like to go through controller manager API server implementation carefully to double check that it's a right approach. I should have answers soon.

qinqon commented 2 years ago

Before we merge that PR I would like to go through controller manager API server implementation carefully to double check that it's a right approach. I should have answers soon.

Let's keep it hold then.

mfranczy commented 2 years ago

@qinqon I have my answers.

The CCM API expose three endpoints

By passing --authentication-skip-lookup option (the Authentication.SkipInClusterLookup) you eliminate access to extension-apiserver-authentication config map on the KubeVirt cluster. However, that might disable validation of http(s) requests meaning all users will be anonymous for the API server in the CCM.

If you pass --authentication-kubeconfig to the API server of tenant cluster then requests would be validated against the certificate that exists in extension-apiserver-autentication config map on tenant cluster. This is wrong in case you have some applications running on KubeVirt infra cluster that try to reach /configz or /metrics endpoints. Although I understand that's also not your case.

Because community can have different ways of deploying the CCM, I would suggest that we don't remove RBAC to extension-apiserver-authorization from default config. IMO the best would be to add another config that cover your case. What do you think?

qinqon commented 2 years ago

@qinqon I have my answers.

The CCM API expose three endpoints

  • /healthz (doesn't need authentication)
  • /metrics
  • /configz (dumps whole configuration of CCM)

By passing --authentication-skip-lookup option (the Authentication.SkipInClusterLookup) you eliminate access to extension-apiserver-authentication config map on the KubeVirt cluster. However, that might disable validation of http(s) requests meaning all users will be anonymous for the API server in the CCM.

If you pass --authentication-kubeconfig to the API server of tenant cluster then requests would be validated against the certificate that exists in extension-apiserver-autentication config map on tenant cluster. This is wrong in case you have some applications running on KubeVirt infra cluster that try to reach /configz or /metrics endpoints. Although I understand that's also not your case.

Because community can have different ways of deploying the CCM, I would suggest that we don't remove RBAC to extension-apiserver-authorization from default config. IMO the best would be to add another config that cover your case. What do you think?

Thanks for the investigation! So we can still has /healthz with --authentication-skip-lookup=true ? Is this this is the case we can just delegate on how community start the manager to decide how to make it work, at the end the ARGS ar passed to apisever, I think.

mfranczy commented 2 years ago

So we can still has /healthz with --authentication-skip-lookup=true

yes, because healthz is on the allowed list as described (you can extend the list):

// This allows the kubelet to always get health and readiness without causing an authorization check. // This field can be cleared by callers if they don't want this behavior. AlwaysAllowPaths: []string{"/healthz", "/readyz", "/livez"},

Is this this is the case we can just delegate on how community start the manager to decide how to make it work,

After reading the code I think community does that well. It's just unusual scenario where extension API runs on a cluster but it's not allowed to read client-ca-file certificate to validate requests because of the isolation. Maybe you can pass somehow the certificate to the tenant namespace (I didn't check that). Although that would mean copying/generating the certifcate on each tenant cluster creation. Another solution would be to use different authentication method than client certificate authentication.

at the end the ARGS ar passed to apisever

yes, the CompletedConfig is passed to the API server options of CCM.

qinqon commented 2 years ago

So we can still has /healthz with --authentication-skip-lookup=true

yes, because healthz is on the allowed list as described (you can extend the list):

// This allows the kubelet to always get health and readiness without causing an authorization check. // This field can be cleared by callers if they don't want this behavior. AlwaysAllowPaths: []string{"/healthz", "/readyz", "/livez"},

Is this this is the case we can just delegate on how community start the manager to decide how to make it work,

After reading the code I think community does that well. It's just unusual scenario where extension API runs on a cluster but it's not allowed to read client-ca-file certificate to validate requests because of the isolation. Maybe you can pass somehow the certificate to the tenant namespace (I didn't check that). Although that would mean copying/generating the certifcate on each tenant cluster creation. Another solution would be to use different authentication method than client certificate authentication.

at the end the ARGS ar passed to apisever

yes, the CompletedConfig is passed to the API server options of CCM.

I am going to change this PR to add another kustomize overlay that will remove the kube-system role binding and add the --authentication-skip-lookup=true flag to deployment so the community can decide how to deploy, the golang code will not be changed.

qinqon commented 2 years ago

/cc @nirarg

kubevirt-bot commented 2 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mfranczy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubevirt/cloud-provider-kubevirt/blob/main/OWNERS)~~ [mfranczy] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
mfranczy commented 2 years ago

/hold cancel