meltwater / drone-cache

A Drone plugin for caching current workspace files between builds to reduce your build times
https://underthehood.meltwater.com/blog/2019/04/10/making-drone-builds-10-times-faster/
Apache License 2.0

Enable Assume Role from Drone Pipeline Step #248

Open hec-hi opened 1 year ago

hec-hi commented 1 year ago

In my DroneCI Pipeline, I am able to use drone-cache when providing IAM User credentials directly as environment variables. Example:

steps:
  - name: restore-cache
    image: meltwater/drone-cache
    environment:
      AWS_ACCESS_KEY_ID:
        from_secret: <DRONE_SECRET_AWS_ACCESS_KEY_ID>
      AWS_SECRET_ACCESS_KEY:
        from_secret: <DRONE_SECRET_AWS_SECRET_ACCESS_KEY>
    settings:
      <SETTINGS>

Due to compliance reasons, I would like to assume a role instead:

steps:
  - name: restore-cache
    image: meltwater/drone-cache
    environment:
      AWS_ASSUME_ROLE_ARN:
        from_secret: <DRONE_SECRET_ROLE_ARN>
    settings:
      <SETTINGS>

I see that something in that direction was already implemented in https://github.com/meltwater/drone-cache/issues/142 but setting the environment variable AWS_ASSUME_ROLE_ARN did not work out of the box for me. I get the error EmptyStaticCreds: static credentials are empty. Maybe this is already implemented and I am just missing something 😄

Thanks in advance!

bdebyl commented 1 year ago

From my understanding, you still need an access key and secret key for a specific user, human or "robot", so that there are some credentials to assume a role with. If you are self-hosting Drone, this should work as long as Drone runs on an EC2 instance and that instance has a Role assigned to it: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html

Otherwise, if outside of AWS, without any credentials (Access + Secret Key) there is no way I know of to assume a role.

hec-hi commented 1 year ago

Thanks for the answer 😄 Yes, exactly, I'm hosting Drone in AWS and the instance has an instance profile/IAM Role attached to it. I am using Drone for multi-account deployment, and each "target" account has a CI-Role that is assumed, for example with our own fork of https://github.com/nodefortytwo/drone-aws-role-auth

The cross-account assuming of the target role works in other steps, so the trust relationship between the Drone instance profile role and the target role is in place.

Did I get it right? The existence of an environment variable called AWS_ASSUME_ROLE_ARN (or PLUGIN_ASSUME_ROLE_ARN, see https://github.com/meltwater/drone-cache/blob/master/README.md) should trigger the drone-cache plugin to handle the assuming of a role already? I.e. it should already be able to fetch AWS credentials (access key, secret access key and token)?

hec-hi commented 1 year ago

Also, an interesting behavior I observed: any time I provide AWS_ASSUME_ROLE_ARN I end up having the EmptyStaticCreds: static credentials are empty error. Even if my step looks like

steps:
  - name: restore-cache
    image: meltwater/drone-cache
    environment:
      AWS_ACCESS_KEY_ID:
        from_secret: <DRONE_SECRET_AWS_ACCESS_KEY_ID>
      AWS_SECRET_ACCESS_KEY:
        from_secret: <DRONE_SECRET_AWS_SECRET_ACCESS_KEY>
      AWS_ASSUME_ROLE_ARN:
        from_secret: <DRONE_SECRET_ROLE_ARN>
    settings:
      <SETTINGS>

If I remove AWS_ASSUME_ROLE_ARN the plugin works again.

bdebyl commented 1 year ago

That's interesting and it should work, but perhaps we never added the feature of it being used within a self-hosted AWS environment after all. I've never seen the EmptyStaticCreds error and searches, as I'm sure you're aware, don't really clarify what this means aside from the relevant source code declaration and usage(s): https://github.com/aws/aws-sdk-go/blob/main/aws/credentials/static_provider.go#L12

I'm wondering if this could also be a good reason to upgrade to aws-sdk-go-v2.

Thanks for finding this issue, I will look into it.

Edit: Fix would require changes to the following code block, as it seems to explicitly expect Access and Secret keys as opposed to realizing it's an EC2 instance with IAM roles assigned: https://github.com/meltwater/drone-cache/blob/master/storage/backend/s3/s3.go#L54-L69

bdebyl commented 1 year ago

@hec-hi out of curiosity, if you remove all the AWS_ environment variables and follow the EC2 Role assignment in AWS, does it work then, or does it still kick back EmptyStaticCreds?

Wondering if there is something we can add to ensure it uses Instance Profiles if in EC2:

steps:
  - name: restore-cache
    image: meltwater/drone-cache
    environment:
      PLUGIN_BACKEND: s3
    settings:
      <SETTINGS>

hec-hi commented 1 year ago

@bdebyl thanks a lot for following up 😉

I ran a few combinations:

1) Setting AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_ASSUME_ROLE_ARN resulted in EmptyStaticCreds
2) Setting AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY resulted in success
3) Setting only AWS_ASSUME_ROLE_ARN resulted in EmptyStaticCreds
4) Setting none of the above (in fact setting no environment: at all) resulted in success

So it seems that in example 4) the plugin is able to use whatever IAM Role is attached to the Drone Host EC2 Instance. In the case that this Role has a Policy attached to it that allows communication with the S3 Bucket, it seems to work.

That's already really nice, but it means that any other unrelated pipeline running on this instance will, in theory, be able to read from and write to that cache Bucket.

I guess the desired behavior would be:

1) When there is an IAM Role attached to the instance and only AWS_ASSUME_ROLE_ARN is set, then drone-cache should use the current EC2 Role to try to assume this role and fetch AWS Access Key, Secret Access Key and Token.
2) If AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_ASSUME_ROLE_ARN are set, then drone-cache should use the credentials provided to try and assume this role and fetch (a new) AWS Access Key, (a new) Secret Access Key and Token.

Does that make sense? 🙏

bdebyl commented 1 year ago

Yes that's what I was thinking. This way the IAM Role assigned to the EC2 instance could be limited, but the Role ARN to assume could include that S3 bucket.

Ultimately this doesn't add any more security to the EC2 Role being able to read/write to that cache bucket as it could just assume the role to do so, but I guess this might as well be a nice-to-have.

This should require changing the validation in the assume-role code so that it no longer expects static creds (Access + Secret keys).
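
A minimal sketch of the credential selection discussed in this thread, written in Python for brevity. It is not drone-cache code and makes no AWS calls; the names plan_credentials and CredentialPlan are hypothetical, and "default-chain" stands in for whatever the SDK resolves on its own (on EC2, the instance profile role). The only point is that a missing key pair should select the default chain instead of producing empty static credentials, and that the role ARN is applied on top of whichever base was chosen.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CredentialPlan:
    """Hypothetical description of how credentials would be obtained."""
    base: str                       # "static" or "default-chain"
    assume_role_arn: Optional[str]  # role to assume on top of the base, if any


def plan_credentials(env: Mapping[str, str]) -> CredentialPlan:
    """Pick base credentials first, then layer the optional role on top."""
    access_key = env.get("AWS_ACCESS_KEY_ID", "")
    secret_key = env.get("AWS_SECRET_ACCESS_KEY", "")
    role_arn = env.get("AWS_ASSUME_ROLE_ARN") or None

    # Use static keys only when both halves are present. Otherwise fall back
    # to the default chain instead of building empty static credentials.
    base = "static" if access_key and secret_key else "default-chain"
    return CredentialPlan(base=base, assume_role_arn=role_arn)


if __name__ == "__main__":
    # Hypothetical role ARN, used only for illustration.
    role = "arn:aws:iam::123456789012:role/ci-cache"
    print(plan_credentials({"AWS_ASSUME_ROLE_ARN": role}))
```

With this ordering, the four combinations tested earlier in the thread each map to a defined plan: keys only, role only, keys plus role, and nothing set.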

hec-hi commented 1 year ago

Yes, I agree it doesn't add more security since a pipeline developer can still write/read to that bucket if the ARN of the role to assume is known. Nevertheless, as you said, it is nice to have the option of "hiding" the ARN from the developer while limiting the default EC2 Role to the least privilege.