vmware-tanzu / pinniped

Pinniped is the easy, secure way to log in to your Kubernetes clusters.
https://pinniped.dev
Apache License 2.0

TokenCredentialRequestAPI related errors when ImpersonationProxy is being used instead #1920

Closed KihyeokK closed 4 months ago

KihyeokK commented 4 months ago

What happened?

I am running the pinniped-concierge v0.28.0 image, installed with the helm chart https://github.com/vmware-tanzu/pinniped/releases/download/v0.20.0/install-pinniped-concierge.yaml on GKE 1.28. If I understand correctly, the ImpersonationProxy should be used instead of the TokenCredentialRequestAPI, because I can't run any custom pod on the control plane node that runs kube-controller-manager, and the helm chart sets spec.impersonationProxy.mode to auto by default, as mentioned in the docs here. However, I am getting error logs from the concierge that seem to be related to the use of the TokenCredentialRequestAPI, such as the following:

{"level":"error","timestamp":"2024-04-19T19:40:28.363250Z","caller":"k8s.io/apiserver@v0.28.4/pkg/server/dynamiccertificates/tlsconfig.go:275$dynamiccertificates.(*DynamicServingCertificateController).processNextWorkItem","message":"key failed with : not loading an empty serving certificate from \"concierge-serving-cert\""}

{"level":"error","timestamp":"2024-04-19T21:43:28.896540Z","caller":"go.pinniped.dev/internal/controllerlib/controller.go:222$controllerlib.(*controller).handleKey","message":"kube-cert-agent-controller: { } failed with: could not find a healthy kube-controller-manager pod (0 candidates)"}

There was also an error log about: "tls: failed to verify certificate: x509: certificate signed by unknown authority"

Are these error logs supposed to be there when Impersonation Proxy is being used instead of TokenCredentialRequestAPI? Is there a way to disable these logs when Impersonation Proxy is being used?

Thank you!

What did you expect to happen?

Errors mentioned above are not shown when Impersonation Proxy is being used instead of TokenCredentialRequestAPI.

What is the simplest way to reproduce this behavior?

Running helm chart https://github.com/vmware-tanzu/pinniped/releases/download/v0.20.0/install-pinniped-concierge.yaml on GKE 1.28.

In what environment did you see this bug?

What else is there to know about this bug?

cfryanr commented 4 months ago

Hi @KihyeokK, thanks for creating an issue.

When the impersonation proxy is enabled, clients still use the TokenCredentialRequest API during authentication. The TokenCredentialRequest returns an mTLS client certificate. When you are not using the impersonation proxy, that client cert is signed by the Kubernetes API server. When you are using the impersonation proxy, then that client cert is signed by the impersonation proxy itself. Either way, the client may then submit that mTLS client cert as proof of identity when making calls to Kubernetes APIs (either directly or through the impersonation proxy).

Aside from some potentially confusing log messages, are you having any trouble authenticating or making API calls?

KihyeokK commented 4 months ago

Hi @cfryanr, thank you for the fast response! Aside from the logs, there seems to be no issue at all with interacting with the Kubernetes API server. Also, just a note: this error log {"level":"error","timestamp":"2024-04-19T21:43:28.896540Z","caller":"go.pinniped.dev/internal/controllerlib/controller.go:222$controllerlib.(*controller).handleKey","message":"kube-cert-agent-controller: { } failed with: could not find a healthy kube-controller-manager pod (0 candidates)"} was also seen in Pinniped v0.20.0, before upgrading to the v0.28.0 image and helm chart.

cfryanr commented 4 months ago

That error is part of how auto mode decides whether to enable the impersonation proxy. It first tries the other strategy (which involves finding the kube-controller-manager pod and then starting a new kube cert agent pod), and only when it sees that the other strategy does not work does it start the impersonation proxy.

Sorry that the error log messages can be confusing. They are valuable for debugging when something goes wrong, but unfortunately they can also be confusing when everything is working exactly as expected.

Shall we close this issue, or did you have any other concerns here?

KihyeokK commented 4 months ago

@cfryanr Thank you for the clarification! I would like to ask just two more questions:

  1. Could I assume that certificate-related error logs like the following are also normal for auto mode and are just part of the steps of enabling the impersonation proxy?

    {"level":"error","timestamp":"2024-04-19T19:40:28.363250Z","caller":"k8s.io/apiserver@v0.28.4/pkg/server/dynamiccertificates/tlsconfig.go:275$dynamiccertificates.(*DynamicServingCertificateController).processNextWorkItem","message":"key failed with : not loading an empty serving certificate from \"concierge-serving-cert\""}

    "tls: failed to verify certificate: x509: certificate signed by unknown authority"

  2. Would it make more sense to change the log level of the above-mentioned `could not find a healthy kube-controller-manager pod (0 candidates)` log from `"error"` to `"info"` in a future release?

cfryanr commented 4 months ago

For question 1: Yes, this could also be normal if it only happens briefly after installation and then those errors stop happening. That certificate Secret initially does not exist, and very quickly after installation Pinniped should auto-create and auto-populate that certificate Secret.

For question 2: That's a little complicated because of the way that our controller library works (controllers need to return errors when they want to schedule a retry) but perhaps we could find a way to improve it. If we can't find a way to downgrade the error, then we could at least change the text of the error message to make it say that this is normal behavior on cloud provider clusters. I will take a look.

cfryanr commented 4 months ago

Closing this for now because the original purpose of this issue is resolved, but please keep asking questions and making suggestions. Thanks for the discussion!

KihyeokK commented 4 months ago

@cfryanr Thank you for the help!