Closed qinqon closed 2 years ago
@mfranczy Looks like this is the issue. Do you know if this could open another can of worms?
/lgtm
/hold
@davidvossel @qinqon @rhrazdil after reading this discussion https://github.com/kubernetes-sigs/cluster-api-provider-kubevirt/pull/168#discussion_r936015164 I think I understand your problem better.

The `ccmOptions.Authentication.SkipInClusterLookup = true` change will not solve your problem. When you run the KubeVirt CCM you should pass a kubeconfig (the `--kubeconfig` arg) that points to the tenant cluster. That means all `ccmOptions` should influence the app's behaviour in the context of tenant cluster resources, not the infra cluster, and the lookup for the config map should take place on the tenant cluster too.
The CCM itself doesn't have any requirement or need to read config maps from the infrastructure cluster. If you see this happening then we must have a bug, either in our code or in the cloud-provider pkg itself.

On the infrastructure cluster we need the ability to create/get/list/delete/watch services and endpoints, get/list VMs, and update VM labels (these permissions should be scoped to the cluster's dedicated namespace).

At Kubermatic, when we run the KubeVirt CCM, we pass the kubeconfig to the tenant cluster explicitly, and the kubeconfig to the infra cluster is passed via the `kubeconfig` field of the cloud config (since KubeVirt in that context is a cloud provider). Next, the infra kubeconfig is used to create a controller-runtime client to act on the resources mentioned above.
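As a rough sketch of the setup described above (field names other than `kubeconfig` are illustrative, not taken from the actual cloud-provider-kubevirt config schema), the two kubeconfigs travel on different paths: the tenant one as a CLI flag, the infra one inside the cloud config:

```yaml
# Hypothetical cloud-config fragment: the tenant kubeconfig is passed via
# --kubeconfig on the CCM command line, while the infra-cluster kubeconfig
# is carried in the cloud config's kubeconfig field, as described above.
kubeconfig: /etc/kubernetes/infra.kubeconfig
```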
The cloud controller manager runs an API server, so it requires access to the `extension-apiserver-authentication` config map to verify client requests. I assumed that setting `--kubeconfig` to point to the tenant cluster was enough. I was wrong. After a debugging session with @qinqon we learned that we also have to set `--authorization-kubeconfig` and `--authentication-kubeconfig`.

If these two parameters are not set, `rest.InClusterConfig` will be used to build the client for the API server. So, if you run the KubeVirt CCM on the infra cluster without these parameters pointing to the tenant cluster, it will try to get the extension authentication config map on the infra cluster.

One solution to this problem is to set these arguments on the KubeVirt CCM deployment:

```
--kubeconfig=<path-to-tenant-kubeconfig>
--authorization-kubeconfig=<path-to-tenant-kubeconfig>
--authentication-kubeconfig=<path-to-tenant-kubeconfig>
```
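The fallback behaviour can be summarized with a simplified sketch (the helper and its return strings are illustrative only, not the actual cloud-provider option-handling code): when the authentication kubeconfig flag is empty, the client is built from `rest.InClusterConfig`, i.e. against the cluster the CCM pod actually runs on.

```go
package main

import "fmt"

// pickAuthConfig is a simplified illustration of the fallback described
// above: with no --authentication-kubeconfig, the options code falls back
// to rest.InClusterConfig, which means the infra cluster when the CCM pod
// runs there. Names and return values are hypothetical.
func pickAuthConfig(authenticationKubeconfig string) string {
	if authenticationKubeconfig == "" {
		// No flag set: rest.InClusterConfig is used, so lookups hit
		// the cluster hosting the CCM pod (the infra cluster).
		return "in-cluster (infra cluster)"
	}
	// Flag set: the client targets whatever cluster the kubeconfig
	// points at (the tenant cluster, in the setup discussed here).
	return "kubeconfig: " + authenticationKubeconfig
}

func main() {
	fmt.Println(pickAuthConfig(""))
	fmt.Println(pickAuthConfig("/etc/kubernetes/tenant.kubeconfig"))
}
```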
I have reworked the PR with what we found, so we don't have to "disable" features.
Before we merge that PR I would like to go through controller manager API server implementation carefully to double check that it's a right approach. I should have answers soon.
Let's keep it on hold then.
@qinqon I have my answers.

The CCM API exposes three endpoints:

- /healthz (doesn't need authentication)
- /metrics
- /configz (dumps the whole configuration of the CCM)

By passing the `--authentication-skip-lookup` option (the `Authentication.SkipInClusterLookup` field), you eliminate access to the `extension-apiserver-authentication` config map on the KubeVirt cluster. However, that might disable validation of HTTP(S) requests, meaning all users will be anonymous to the API server in the CCM.

If you pass an `--authentication-kubeconfig` pointing the API server at the tenant cluster, then requests will be validated against the certificate that exists in the `extension-apiserver-authentication` config map on the tenant cluster. This is wrong in case you have applications running on the KubeVirt infra cluster that try to reach the /configz or /metrics endpoints, although I understand that's also not your case.

Because the community can have different ways of deploying the CCM, I would suggest that we don't remove the RBAC for `extension-apiserver-authentication` from the default config. IMO the best option would be to add another config that covers your case. What do you think?
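The three configurations discussed in this thread can be condensed into a small illustrative sketch (not real apiserver code; the function and its return strings are hypothetical) of where the client-CA material for request authentication comes from:

```go
package main

import "fmt"

// caSource is an illustrative summary of the discussion above: which
// cluster's extension-apiserver-authentication config map (if any) backs
// request authentication, depending on the CCM's flags.
func caSource(skipInClusterLookup bool, authenticationKubeconfig string) string {
	switch {
	case skipInClusterLookup:
		// --authentication-skip-lookup=true: no config-map lookup at
		// all, so client requests cannot be verified and callers are
		// treated as anonymous.
		return "none (requests anonymous)"
	case authenticationKubeconfig != "":
		// --authentication-kubeconfig set: the config map is read
		// from the cluster that kubeconfig points to (the tenant).
		return "tenant extension-apiserver-authentication"
	default:
		// Neither set: rest.InClusterConfig is used, so the config
		// map is looked up on the infra cluster hosting the pod.
		return "infra extension-apiserver-authentication"
	}
}

func main() {
	fmt.Println(caSource(true, ""))
	fmt.Println(caSource(false, "/etc/kubernetes/tenant.kubeconfig"))
	fmt.Println(caSource(false, ""))
}
```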
Thanks for the investigation! So we can still have /healthz with --authentication-skip-lookup=true? If that's the case, we can just delegate to the community how to start the manager and make it work; in the end the args are passed to the apiserver, I think.
> So we can still have /healthz with --authentication-skip-lookup=true

Yes, because /healthz is on the allowed list as described (you can extend the list):

```go
// This allows the kubelet to always get health and readiness without causing an authorization check.
// This field can be cleared by callers if they don't want this behavior.
AlwaysAllowPaths: []string{"/healthz", "/readyz", "/livez"},
```
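A minimal self-contained sketch of what that allow list does (simplified: the real apiserver options also support wildcard suffixes like `"/metrics/*"`, which this sketch omits):

```go
package main

import "fmt"

// alwaysAllowPaths mirrors the default allow list quoted above; requests
// to these paths bypass the delegated authorization check entirely.
var alwaysAllowPaths = []string{"/healthz", "/readyz", "/livez"}

// isAlwaysAllowed reports whether a request path skips authorization.
// Simplified exact-match version of the apiserver behaviour.
func isAlwaysAllowed(path string) bool {
	for _, p := range alwaysAllowPaths {
		if p == path {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isAlwaysAllowed("/healthz")) // allowed without auth
	fmt.Println(isAlwaysAllowed("/metrics")) // still requires auth
}
```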
> If that's the case, we can just delegate to the community how to start the manager and make it work

After reading the code I think the community does that well. It's just an unusual scenario where the extension API runs on a cluster but isn't allowed to read the client-ca-file certificate to validate requests, because of the isolation. Maybe you could somehow pass the certificate into the tenant namespace (I didn't check that), although that would mean copying/generating the certificate on each tenant cluster creation. Another solution would be to use a different authentication method than client certificate authentication.

> in the end the args are passed to the apiserver

Yes, the `CompletedConfig` is passed to the API server options of the CCM.
I am going to change this PR to add another kustomize overlay that removes the kube-system role binding and adds the --authentication-skip-lookup=true flag to the deployment, so the community can decide how to deploy; the Go code will not be changed.
/cc @nirarg
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mfranczy
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
/hold cancel
The default kustomize overlay at `config/overlay/default` needs to populate a RoleBinding under kube-system, plus a ClusterRole and ClusterRoleBinding; this means some resources have to be populated outside the tenant namespace on the infra cluster.

This change adds another kustomize overlay, `config/overlay/isolated`, that prevents this by disabling external apiserver authentication (the /metrics and /configz endpoints will not be valid) and disabling zone/region and tenant nodes.

Also, `config/overlay/kubevirtci` now uses `config/overlay/isolated` as a base to run CI with it.
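A hedged sketch of what such an isolated overlay could look like (this is illustrative only; the real overlay contents live in the PR, and the base path, target names, and patch shape here are assumptions):

```yaml
# Hypothetical config/overlay/isolated/kustomization.yaml sketch:
# reuse the base manifests, but add the skip-lookup flag so the CCM
# never needs RBAC outside the tenant namespace on the infra cluster.
resources:
  - ../../base
patches:
  - target:
      kind: Deployment
    patch: |-
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --authentication-skip-lookup=true
```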