Closed: thefirstofthe300 closed this issue 1 month ago.
I managed to POC this using the API server's built-in well-known OIDC configuration endpoint (https://github.com/kubernetes/kubernetes/pull/98553); however, it currently requires bringing my own infra and knowing the CA cert's fingerprint. Here's my configuration for exposing the /.well-known/openid-configuration endpoint to unauthenticated clients:
kubectl create clusterrolebinding oidc-reviewer --clusterrole=system:service-account-issuer-discovery --group=system:unauthenticated
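For reference, a minimal sketch of the kube-apiserver settings this relies on, expressed as a KubeadmControlPlane excerpt; the issuer/ELB hostname below is a placeholder and the exact values will differ per cluster:
```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: my-cluster-control-plane   # placeholder name
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          # The issuer must be an HTTPS URL that AWS IAM can reach (here, the API ELB on 443).
          service-account-issuer: "https://my-cluster-apiserver.example.com"
          # Point token consumers at the API server's built-in JWKS endpoint.
          service-account-jwks-uri: "https://my-cluster-apiserver.example.com/openid/v1/jwks"
```
With the cluster role binding above in place, unauthenticated clients (including IAM) can fetch /.well-known/openid-configuration and the JWKS directly from the API server.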
Unfortunately, I don't really have time to take a stab at contributing this feature right now, or I'd try to add it to CAPA myself.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
I just had to spec this out and will code it into CAPA shortly. I'll add my approach here, and I'd like some more eyes on it before I spend too much time coding. I'm basing most of my understanding of how to get this working on an example written by @smalltown, located here: https://github.com/smalltown/aws-irsa-example, and the accompanying blog article: https://medium.com/codex/irsa-implementation-in-kops-managed-kubernetes-cluster-18cef84960b6
Approach
1) Users will have to create an IaaS cluster with bucket management turned on. This is an existing CAPA feature that manages an S3 bucket for Ignition userdata, and we will reuse it to create/delete a bucket per cluster.
2) Make associateOidcProvider: true work for IaaS clusters. It will use cert-manager to generate certs, add keys.json and .well-known/openid-configuration to the bucket with public read access, then create the OIDC provider in IAM, generate a trust policy, and store it in a ConfigMap which you can use later in your roles, identical to how managed clusters work.
3) Install https://github.com/aws/amazon-eks-pod-identity-webhook by taking its manifests and moving them into ./config, modifying it to use a cert from cert-manager, and having CAPA deploy it to the cluster.
4) As per the expectations set by EKS's implementation, CAPA will not build the IRSA roles for you. Users are expected to use the trust policy located in the ConfigMap and do this themselves: create the role with the permissions their service needs, add the trust policy from the ConfigMap to make a working role, then take that role's ARN and set it as the eks.amazonaws.com/role-arn annotation when they create a service account and assign it to a pod (see the sketch after this list).
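To make step 4 concrete, here is a rough sketch of the user-side pieces. The account ID, OIDC provider hostname, role name, and namespace are placeholders, and the exact trust policy CAPA publishes in the ConfigMap may differ:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::111122223333:oidc-provider/s3.example.com/my-cluster"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "s3.example.com/my-cluster:sub": "system:serviceaccount:default:my-app"
        }
      }
    }
  ]
}
```
The resulting role's ARN then goes on the service account via the annotation the pod identity webhook looks for:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: default
  annotations:
    # ARN of the IAM role created above (placeholder account ID and role name).
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-app-irsa
```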
@thefirstofthe300 please read the above and see if you agree with the approach. While I like your idea of using ServiceAccountIssuerDiscovery, I feel having CAPA spin up any extra infra is a bit too much to ask of users just to get IRSA support, when the S3 buckets are already being used/managed by CAPA and have free public URLs. The only pieces CAPA would need to manage are some certs, the IAM OIDC provider, the webhook, and the contents of the files in S3.
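For reference, the .well-known/openid-configuration document served out of the bucket (step 2 above) would look roughly like the one in the self-hosted example linked earlier; the issuer URL here is a placeholder for the bucket's public URL:
```json
{
  "issuer": "https://s3.example.com/my-cluster",
  "jwks_uri": "https://s3.example.com/my-cluster/keys.json",
  "authorization_endpoint": "urn:kubernetes:programmatic_authorization",
  "response_types_supported": ["id_token"],
  "subject_types_supported": ["public"],
  "id_token_signing_alg_values_supported": ["RS256"],
  "claims_supported": ["sub", "iss"]
}
```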
Either way will work. I think the main question comes down to how certs should work.
In essence, the way I went about creating things is identical to the way CAPA creates its infrastructure; however, instead of generating everything using port 6443 as the API server ELB serving port, it uses port 443. I'd think providing the ability to set the serving port to 443 instead of 6443 would be relatively simple in the CAPA code and would remove the need for the BYO infra.
The two things I've had to add to my configs are appending the root CA cert to the API serving cert chain using Ignition, and adding the cluster role binding that allows unauthenticated users to view the OpenID configuration.
Doing this removes the need to rely on S3 to serve the configuration.
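For completeness, a declarative sketch of that cluster role binding (equivalent to the kubectl command earlier in the thread), which could be baked into a cluster template:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: oidc-reviewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:service-account-issuer-discovery
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:unauthenticated
```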
The main piece that I have a question on is whether people want to serve the root cert with their serving chain. Ultimately, I don't think it opens any security holes, as the certificates themselves are essentially public keys and no change is made to the TLS private key.
I'd be interested in hearing some input from someone more in tune with the security side.
How much of this can we solve with documentation vs. code and cluster/infrastructure component template changes?
/triage needs-information
@dlipovetsky fair question, but all the pieces together make this a complex feature: it mixes AWS resources (S3, OIDC provider) and Kubernetes resources (certs, a deployment), and the one component that knows about all of those is CAPA, and it can only be built at cluster creation time. When I get this written we will have feature parity between IaaS and EKS for IRSA, and I feel it's only fair that CAPA does the work to bring the two cluster types to parity.
@dlipovetsky I think I agree with Luther. I don't believe a documentation change can handle this. The IRSA logic requires a fair amount of work across the pods: assigning things, creating connections, adding the webhook, generating certificates, etc.
Honestly, this is quite a large piece of work. @luthermonson and @thefirstofthe300, do you think we could break this down into several smaller issues, such as creating the configuration, supporting port 443, and adding the webhook installation process, and use this issue as an epic to track each piece?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/triage accepted
This issue has not been updated in over 1 year, and should be re-triaged.
You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
This issue is currently awaiting triage.
If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/kind feature
IRSA appears to have been completed for EKS clusters in #2054. My company currently uses non-EKS clusters in combination with IRSA. I'd like the ability to install IRSA using the provider. The main piece that is tricky to coordinate is the certificate authority used to sign and validate the JWTs. If these pieces could be automated as part of the Ignition config (we also use Flatcar), our installation/maintenance burden would be greatly decreased.
Environment:
- Kubernetes version (use kubectl version): v1.24
- OS (from /etc/os-release): Flatcar Container Linux by Kinvolk 3139.2.3 (Oklo)