aws-controllers-k8s / community

AWS Controllers for Kubernetes (ACK) is a project enabling you to manage AWS services from Kubernetes
https://aws-controllers-k8s.github.io/community/
Apache License 2.0

Provide usable details on permission errors #1861

Open bra-fsn opened 1 year ago

bra-fsn commented 1 year ago

Is your feature request related to a problem? I'm trying to create an S3 bucket with the S3 controller, but it fails. The controller logs this error:

2023-07-28T09:07:04.538Z    ERROR   Reconciler error    {"controller": "bucket", "controllerGroup": "s3.services.k8s.aws", "controllerKind": "Bucket", "Bucket": {"name":"test-bucket-name","namespace":"cluster-resources"}, "namespace": "cluster-resources", "name": "test-bucket-name", "reconcileID": "788046c2-8a99-4a3d-9619-4ce9d925d128", "error": "AccessDenied: Access Denied\n\tstatus code: 403, request id: F5VN6S14D1C1DMC4, host id: +bhom7Xc0sEoAexP7Xlx5XzLGxKzZusG1Xucs3OMyLXUvXWcJudGa6GK3QlHQq4PlJs6+nXN44I="}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:235

Some permission is surely missing (the role doesn't have full access, though it definitely can create and query a bucket), but the error gives no hint about which operation failed.

Describe the solution you'd like It would be great if there was more context about what has failed exactly, so the relevant policy could be updated to allow that operation.

Describe alternatives you've considered I've tried to find the failing operation from the backtrace (which doesn't seem usable, at least for me) and in CloudTrail, both without success.

gecube commented 1 year ago

Hello!

What permissions did you assign to the S3 controller? I'm also asking because I wonder how to narrow down its permissions to only the necessary ones.

bra-fsn commented 1 year ago

What permissions did you assign to the S3 controller? I'm also asking because I wonder how to narrow down its permissions to only the necessary ones.

I use a combination of tag and name-based permissions, implemented as a boundary policy. The controller itself has arn:aws:iam::aws:policy/AdministratorAccess, but the boundary policy limits that to s3:* access for arn:aws:s3:::${var.short_prefix}-* resources (I'm trying to create a new bucket which matches that) along with others and extends the limits with:

actions   = ["*"]
resources = ["*"]
condition {
  test     = "ForAnyValue:StringEqualsIgnoreCase"
  variable = "aws:ResourceTag/Owner"
  values   = var.owner_tags
}

but that should be irrelevant here.

So I guess the controller either wants to perform a non-S3 operation (unlikely if it works with the recommended policy, although that policy includes s3-object-lambda:*, which I don't have), or it touches something outside the limited resource name.

bra-fsn commented 1 year ago

BTW, turning on debug doesn't really help either:

2023-07-28T12:01:48.376Z    DEBUG   ackrt   > r.Sync    {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "generation": 2}
2023-07-28T12:01:48.376Z    DEBUG   ackrt   >> r.resetConditions    {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "generation": 2}
2023-07-28T12:01:48.376Z    DEBUG   ackrt   << r.resetConditions    {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "generation": 2}
2023-07-28T12:01:48.376Z    DEBUG   ackrt   >> rm.ResolveReferences {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "is_adopted": false, "generation": 2}
2023-07-28T12:01:48.376Z    DEBUG   ackrt   << rm.ResolveReferences {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "is_adopted": false, "generation": 2}
2023-07-28T12:01:48.376Z    DEBUG   ackrt   >> rm.EnsureTags    {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "is_adopted": false, "generation": 2}
2023-07-28T12:01:48.376Z    DEBUG   ackrt   << rm.EnsureTags    {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "is_adopted": false, "generation": 2}
2023-07-28T12:01:48.376Z    DEBUG   ackrt   >> rm.ReadOne   {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "is_adopted": false, "generation": 2}
2023-07-28T12:01:48.376Z    DEBUG   ackrt   >>> rm.sdkFind  {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "is_adopted": false, "generation": 2}
2023-07-28T12:01:48.653Z    DEBUG   ackrt   <<< rm.sdkFind  {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "is_adopted": false, "generation": 2, "error": "AccessDenied: Access Denied\n\tstatus code: 403, request id: TWDERKK1Z4VMQ2GT, host id: HmtlvkwX3I+iWlAWL6ock+324vpq+vxubvDecubj8djeXzo3smJ7vTVXknTXJHbz4Zk0C7QIysE="}
2023-07-28T12:01:48.653Z    DEBUG   ackrt   << rm.ReadOne   {"account": "ACCOUNT", "role": "", "region": "us-east-1", "kind": "Bucket", "namespace": "cluster-resources", "name": "test-bucket-name", "is_adopted": false, "generation": 2, "error": "AccessDenied: Access Denied\n\tstatus code: 403, request id: TWDERKK1Z4VMQ2GT, host id: HmtlvkwX3I+iWlAWL6ock+324vpq+vxubvDecubj8djeXzo3smJ7vTVXknTXJHbz4Zk0C7QIysE="}

apart from showing that the failure seems to happen in the discovery phase (rm.sdkFind).

Looking at the code in manager.go and sdk.go, I can now see what the problem is: the missing s3:ListAllMyBuckets permission. The boundary policy didn't grant it, because that action doesn't take a resource parameter and so can't match a statement scoped to bucket names.
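The effect can be illustrated with a deliberately simplified Go model of statement matching. It is a sketch, not IAM's real evaluation logic: it ignores conditions, explicit denies, the '?' wildcard and policy variables, and it assumes that an action without resource-level support is checked against the resource "*". The Statement type, the allows function and the acme- prefix are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Statement is a hypothetical, minimal stand-in for an Allow statement.
type Statement struct {
	Actions   []string
	Resources []string
}

// wildcardMatch reports whether s matches pattern, where '*' matches any run
// of characters, including an empty one.
func wildcardMatch(pattern, s string) bool {
	p, i := 0, 0
	star, mark := -1, 0
	for i < len(s) {
		switch {
		case p < len(pattern) && pattern[p] == '*':
			star, mark = p, i // remember the wildcard and where it started
			p++
		case p < len(pattern) && pattern[p] == s[i]:
			p++
			i++
		case star >= 0:
			mark++ // let the last wildcard absorb one more character
			p, i = star+1, mark
		default:
			return false
		}
	}
	for p < len(pattern) && pattern[p] == '*' {
		p++
	}
	return p == len(pattern)
}

// allows reports whether st matches the action and the resource of a request.
// Action names are compared case-insensitively.
func allows(st Statement, action, resource string) bool {
	actionOK, resourceOK := false, false
	for _, a := range st.Actions {
		if wildcardMatch(strings.ToLower(a), strings.ToLower(action)) {
			actionOK = true
		}
	}
	for _, r := range st.Resources {
		if wildcardMatch(r, resource) {
			resourceOK = true
		}
	}
	return actionOK && resourceOK
}

func main() {
	scoped := Statement{Actions: []string{"s3:*"}, Resources: []string{"arn:aws:s3:::acme-*"}}
	wildcard := Statement{Actions: []string{"s3:ListAllMyBuckets"}, Resources: []string{"*"}}

	fmt.Println(allows(scoped, "s3:CreateBucket", "arn:aws:s3:::acme-test")) // true
	fmt.Println(allows(scoped, "s3:ListAllMyBuckets", "*"))                  // false
	fmt.Println(allows(wildcard, "s3:ListAllMyBuckets", "*"))                // true
}
```

Under these assumptions, s3:* on a name-scoped resource covers bucket-level calls but never the account-level listing call, which is why a separate statement with resources = ["*"] is needed.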

So my problem is solved, but the issue remains: instead of having to switch the controller into debug mode and read the actual code, it would be nicer if the normal-level logs contained the exact operation that failed.
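As an illustration of the request, here is a minimal Go sketch of wrapping an SDK error with the name of the API operation before it reaches the reconciler log. APICallError and wrapAPICall are hypothetical names and do not exist in the ACK runtime; the example also assumes the failing call was ListBuckets, the S3 API that requires s3:ListAllMyBuckets.

```go
package main

import (
	"errors"
	"fmt"
)

// APICallError (hypothetical type) records which AWS API operation failed.
type APICallError struct {
	Service   string
	Operation string
	Err       error
}

// Error prefixes the original message with the service and operation.
func (e *APICallError) Error() string {
	return fmt.Sprintf("%s.%s: %v", e.Service, e.Operation, e.Err)
}

// Unwrap keeps the original error reachable through errors.Is and errors.As.
func (e *APICallError) Unwrap() error { return e.Err }

// wrapAPICall (hypothetical helper) annotates a non-nil error with the
// operation name and passes nil through unchanged.
func wrapAPICall(service, operation string, err error) error {
	if err == nil {
		return nil
	}
	return &APICallError{Service: service, Operation: operation, Err: err}
}

func main() {
	// Stand-in for the error returned by the AWS SDK.
	sdkErr := errors.New("AccessDenied: Access Denied")

	err := wrapAPICall("s3", "ListBuckets", sdkErr)
	fmt.Println(err)
	fmt.Println(errors.Is(err, sdkErr))
}
```

Because the wrapper implements Unwrap, existing checks on the underlying error keep working, while the logged message names the call whose permission is missing.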

@gecube, you should be fine with something like this for the narrowed down policy (or you could limit s3:* even further, the required API calls should be listed if you do a grep in the source for RecordAPICall):

statement {
    sid = "Wildcard"

    actions   = [
      "s3:ListAllMyBuckets",  # s3 ACK
    ]
    resources = ["*"]
}
statement {
    sid = "S3"
    actions = [
      "s3:*",
    ]
    resources = [
      "arn:aws:s3:::list_of_allowed_s3_buckets",
    ]
}

RedbackThomson commented 1 year ago

Checking back in here. You were able to resolve this by adding the extra permission?

bra-fsn commented 1 year ago

Yes, it works. Although the issue is about the error message, which isn't really helpful.

ack-bot commented 7 months ago

Issues go stale after 180d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 60d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale

gecube commented 1 month ago

/remove-lifecycle stale