cert-manager / aws-privateca-issuer

Addon for cert-manager that issues certificates using AWS ACM PCA.
Apache License 2.0

[Bug]: Error: failed to sts.GetCallerIdentity when using IRSA #277

Closed jicowan closed 1 year ago

jicowan commented 1 year ago

### Describe the expected outcome

I am trying to request a certificate from AWS Private CA. I've configured a subordinate CA to issue client (SSL) certificates. I expect to see a Certificate object in Kubernetes after applying the following:

kind: Certificate
apiVersion: cert-manager.io/v1
metadata:
  name: rsa-cert-2048
  namespace: acm-pca-lab-demo
spec:
  commonName: www.rsa-2048.example.com
  dnsNames:
    - www.rsa-2048.example.com
    - rsa-2048.example.com
  duration: 2160h0m0s
  issuerRef:
    group: awspca.cert-manager.io
    kind: AWSPCAClusterIssuer
    name: demo-test-root-ca
  renewBefore: 360h0m0s
  secretName: rsa-example-cert-2048
  usages:
    - server auth
    - client auth
  privateKey:
    algorithm: "RSA"
    size: 2048

### Describe the actual outcome

When I try requesting a certificate, I see the following error in the log:

failed to sts.GetCallerIdentity,"genericissuer":"/demo-test-root-ca","error":"operation error STS: GetCallerIdentity, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, https response error StatusCode: 403...

I confirmed that my ServiceAccount is referencing the appropriate role ARN and I can see that the IRSA environment variables are being injected into the container for the private CA issuer. I am using public.ecr.aws/k1n1h4h4/cert-manager-aws-privateca-issuer:v1.2.5.

Role trust policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::<accountID>:oidc-provider/oidc.eks.<region>.amazonaws.com/id/9D5D6851D8B6072929E0E4D984DD9D97"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.<region>.amazonaws.com/id/9D5D6851D8B6072929E0E4D984DD9D97:sub": "system:serviceaccount:cert-manager-aws-privateca-issuer:cert-manager",
                    "oidc.eks.<region>.amazonaws.com/id/9D5D6851D8B6072929E0E4D984DD9D97:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}

### Steps to reproduce

1. Install cert-manager
2. Install the private CA issuer
3. Configure IRSA and update the service account to reference the role ARN
4. Create an AWSPCAClusterIssuer
5. Create a Certificate

### Relevant log output

2023-07-21T21:49:58.654138239Z {"level":"error","ts":"2023-07-21T21:49:58Z","logger":"controllers.GenericIssuer","msg":"failed to sts.GetCallerIdentity","genericissuer":"/demo-test-root-ca","error":"operation error STS: GetCallerIdentity, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, https response error StatusCode: 403, RequestID: 32db6568-15f5-4045-8b68-e7ce4cfe631d, api error AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity","stacktrace":"github.com/cert-manager/aws-privateca-issuer/pkg/controllers.(*GenericIssuerReconciler).Reconcile\n\t/workspace/pkg/controllers/genericissuer_controller.go:92\ngithub.com/cert-manager/aws-privateca-issuer/pkg/controllers.(*AWSPCAClusterIssuerReconciler).Reconcile\n\t/workspace/pkg/controllers/awspcaclusterissuer_controller.go:57\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235"}
2023-07-21T21:49:58.654179919Z {"level":"error","ts":"2023-07-21T21:49:58Z","msg":"Reconciler error","controller":"awspcaclusterissuer","controllerGroup":"awspca.cert-manager.io","controllerKind":"AWSPCAClusterIssuer","AWSPCAClusterIssuer":{"name":"demo-test-root-ca"},"namespace":"","name":"demo-test-root-ca","reconcileID":"0b844c24-3ce4-4d61-8280-e6be097f70d2","error":"operation error STS: GetCallerIdentity, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, https response error StatusCode: 403, RequestID: 32db6568-15f5-4045-8b68-e7ce4cfe631d, api error AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235"}
2023-07-21T21:50:09.040586944Z {"level":"error","ts":"2023-07-21T21:50:09Z","logger":"controllers.GenericIssuer","msg":"failed to sts.GetCallerIdentity","genericissuer":"/demo-test-root-ca","error":"operation error STS: GetCallerIdentity, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, https response error StatusCode: 403, RequestID: 5a05672a-c7bd-4b60-b9d8-828fe934d990, api error AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity","stacktrace":"github.com/cert-manager/aws-privateca-issuer/pkg/controllers.(*GenericIssuerReconciler).Reconcile\n\t/workspace/pkg/controllers/genericissuer_controller.go:92\ngithub.com/cert-manager/aws-privateca-issuer/pkg/controllers.(*AWSPCAClusterIssuerReconciler).Reconcile\n\t/workspace/pkg/controllers/awspcaclusterissuer_controller.go:57\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235"}


### Version

1.2.5

### Have you tried the following?

- [X] Check the [Troubleshooting section](../#troubleshooting)
- [X] Search open [issues](https://github.com/cert-manager/aws-privateca-issuer/issues)

### Category

Authentication Issue

### Severity

Severity 4

jicowan commented 1 year ago

Similar to issue #40

aveega commented 1 year ago

Hi jicowan, thanks for reaching out. Let me try to reproduce the issue.

divyansh-gupta commented 1 year ago

There are automated tests with IRSA: https://github.com/cert-manager/aws-privateca-issuer/actions/runs/5596273966/job/15158311416#step:18:1 which passed 3 days ago.

If this turns out to be reproducible, it's worth taking a look there to see why those tests aren't catching it.

jicowan commented 1 year ago

Were you using a subordinate CA? I don't know why that would matter, but maybe...

aveega commented 1 year ago

Hi, on an initial look at your role trust policy, this condition:

"StringEquals": {
                    "oidc.eks.<region>.amazonaws.com/id/9D5D6851D8B6072929E0E4D984DD9D97:sub": "system:serviceaccount:cert-manager-aws-privateca-issuer:cert-manager",
                    "oidc.eks.<region>.amazonaws.com/id/9D5D6851D8B6072929E0E4D984DD9D97:aud": "sts.amazonaws.com"
                }

doesn't seem to match this one:

"StringEquals": {  
           "${OIDC_URL}:sub": "system:serviceaccount:aws-privateca-issuer:aws-privateca-issuer-sa"  
         }  

Could you change it to the latter and see if that helps?

jicowan commented 1 year ago

I used eksctl to create the role and service account. Do you think it's formatting the trust policy incorrectly?

divyansh-gupta commented 1 year ago

Mind sharing the eksctl commands? I notice they aren't in the reproduction steps.

jicowan commented 1 year ago

eksctl create iamserviceaccount \
--cluster=fargate-karpenter \
--namespace=cert-manager \
--name=cert-manager-aws-privateca-issuer \
--attach-policy-arn=arn:aws:iam::<account>:policy/AWSPCAIssuerIAMPolicy \
--override-existing-serviceaccounts \
--region <region> \
--approve

jicowan commented 1 year ago

It's failing here https://github.com/cert-manager/aws-privateca-issuer/blob/65bce2a5a64dbc6be0c32b84a3596fc25385c0c0/pkg/controllers/genericissuer_controller.go#L71-L96 but I'm not sure why. The IRSA env variables are being injected. The AWS SDK should use those to authenticate.
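One way to see which identity the SDK presents to STS is to decode the projected service account token and look at its `sub` and `aud` claims. The sketch below assumes the standard IRSA environment variable `AWS_WEB_IDENTITY_TOKEN_FILE` is set in the container; the function names are hypothetical, and the token is decoded without signature verification, for inspection only.

```python
import base64
import json
import os


def decode_jwt_claims(token):
    """Return the unverified claims from a JWT's payload segment."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def read_irsa_claims(env=os.environ):
    """Decode the projected token named by AWS_WEB_IDENTITY_TOKEN_FILE."""
    with open(env["AWS_WEB_IDENTITY_TOKEN_FILE"], encoding="utf-8") as fh:
        return decode_jwt_claims(fh.read())
```

Run inside the issuer pod, `read_irsa_claims()["sub"]` should have the form `system:serviceaccount:<namespace>:<name>`, which is the string the trust policy's `:sub` condition has to equal.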

jicowan commented 1 year ago

I had the wrong service account name in the trust policy, ugh.
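For anyone hitting the same 403, the mismatch can be checked mechanically. The sketch below is a rough illustration: it builds the subject a Kubernetes service account token carries (`system:serviceaccount:<namespace>:<name>`) from the namespace and name in the eksctl command above, and compares it with the `:sub` condition from the trust policy posted in this issue. The function names are hypothetical, and only the `Condition` part of the policy is reproduced.

```python
def expected_subject(namespace, name):
    """Subject claim of a Kubernetes service account token."""
    return f"system:serviceaccount:{namespace}:{name}"


def trust_policy_subjects(policy):
    """Collect every value bound to an '<issuer>:sub' condition key."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also allows a single statement object
        statements = [statements]
    subjects = []
    for statement in statements:
        for operator_block in statement.get("Condition", {}).values():
            for key, value in operator_block.items():
                if key.endswith(":sub"):
                    subjects.extend([value] if isinstance(value, str) else value)
    return subjects


# Values copied from this thread; <region> is left as the placeholder it was posted with.
issuer = "oidc.eks.<region>.amazonaws.com/id/9D5D6851D8B6072929E0E4D984DD9D97"
policy = {
    "Statement": [{
        "Condition": {
            "StringEquals": {
                f"{issuer}:sub": "system:serviceaccount:cert-manager-aws-privateca-issuer:cert-manager",
                f"{issuer}:aud": "sts.amazonaws.com",
            }
        }
    }]
}

# Namespace and name as passed to `eksctl create iamserviceaccount` above.
expected = expected_subject("cert-manager", "cert-manager-aws-privateca-issuer")
print(expected in trust_policy_subjects(policy))  # prints False
```

With the values posted here, the namespace and service account name appear to be swapped in the policy, so STS rejects the `AssumeRoleWithWebIdentity` call.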

aveega commented 1 year ago

Glad you were able to figure that out.