AzBuilder / terrakube

Open source IaC Automation and Collaboration Software.
https://docs.terrakube.io
Apache License 2.0

Ability to use S3 without access/secret key #848

Open SaamerS opened 1 month ago

SaamerS commented 1 month ago

Feature description 💡

When the containers (API/Registry/Executor) run on AWS infrastructure, there is no need to provide an access key and secret key.

The containers can use the IAM role attached to the EC2/ECS/EKS workload. Therefore, the S3 client does not have to be initialized with an access/secret key.

A toggle for this would be nice.
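
A minimal sketch of what such a toggle could look like, written in plain Java so it does not depend on any AWS SDK version. The names `S3AuthModeSelector`, `Mode`, and `select` are hypothetical and not part of Terrakube. The only idea shown is that static keys are used when both are supplied, and the role attached to the workload is used otherwise.

```java
// Hypothetical sketch: decide how the S3 client should authenticate.
// None of these names exist in Terrakube; they only illustrate the toggle.
final class S3AuthModeSelector {

    enum Mode { STATIC_KEYS, AMBIENT_ROLE }

    // Returns STATIC_KEYS only when both keys are present and non-blank;
    // otherwise falls back to the role attached to the EC2/ECS/EKS workload.
    static Mode select(String accessKey, String secretKey) {
        boolean hasKeys = isSet(accessKey) && isSet(secretKey);
        return hasKeys ? Mode.STATIC_KEYS : Mode.AMBIENT_ROLE;
    }

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(select("AKIAEXAMPLE", "example-secret")); // STATIC_KEYS
        System.out.println(select(null, null));                      // AMBIENT_ROLE
    }
}
```

With a selector like this, existing deployments that set both keys would keep their current behavior, and deployments that omit them would rely on the role.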

Anything else?

No response

alfespa17 commented 1 month ago

We added that feature recently; Terrakube now supports dynamic provider credentials.

More information can be found here:

https://docs.terrakube.io/user-guide/workspaces/dynamic-provider-credentials/aws-dynamic-provider-credentials

We will release a new stable version by the end of the month.

alfespa17 commented 1 month ago

Wait, I think you are referring to the storage backend, right? Not the credentials for the workspaces?

SaamerS commented 1 month ago

That is for the actual execution of Terraform. I am speaking more of the runtime of the Terrakube infrastructure itself.

For example:

https://github.com/AzBuilder/terrakube/blob/main/api/src/main/resources/application.properties#L103

hfeixas commented 1 month ago

storage backend right?

Yes, exactly: for setting up the storage backend. Being unable to use the container's role to store state could violate some consumers' security policies, forcing them to use EFS instead.

alfespa17 commented 1 month ago

I think that will require changing the S3 client in all the components here:

https://github.com/AzBuilder/terrakube/blob/ad2e54622a4bbebfda79ec79cf4fc8c5ebb71176/api/src/main/java/org/terrakube/api/plugin/storage/configuration/StorageTypeAutoConfiguration.java#L64

https://github.com/AzBuilder/terrakube/blob/ad2e54622a4bbebfda79ec79cf4fc8c5ebb71176/registry/src/main/java/org/terrakube/registry/plugin/storage/configuration/StorageAutoConfiguration.java#L70

https://github.com/AzBuilder/terrakube/blob/ad2e54622a4bbebfda79ec79cf4fc8c5ebb71176/executor/src/main/java/org/terrakube/executor/plugin/tfoutput/configuration/TerraformOutputAutoConfiguration.java#L66

https://github.com/AzBuilder/terrakube/blob/ad2e54622a4bbebfda79ec79cf4fc8c5ebb71176/executor/src/main/java/org/terrakube/executor/plugin/tfstate/configuration/TerraformStateAutoConfiguration.java#L75

And also update the backend state configuration here.

https://github.com/AzBuilder/terrakube/blob/ad2e54622a4bbebfda79ec79cf4fc8c5ebb71176/executor/src/main/java/org/terrakube/executor/plugin/tfstate/aws/AwsTerraformStateImpl.java#L71

I don't have experience with AWS, but if someone would like to help with this feature, all pull requests are welcome.

igorbrites commented 1 month ago

Looking forward to it. Right now I have a Terraform module to deploy Terrakube on EKS, and I need to create a user and access/secret keys for it to use, but it would be great to use the IAM roles for service accounts. I'm not familiar with Java, but I can help with the helm chart when this gets implemented.

alfespa17 commented 1 month ago

I think this can be done using something like the following:

public String handleRequest(S3Event s3Event, Context context) {

        String clientRegion = "eu-west-2";
        String targetRoleArn = "target-role-arn";
        String assumedRoleName = "target-assumed-role";

        AWSSecurityTokenService stsClient = AWSSecurityTokenServiceClientBuilder.standard().withRegion(clientRegion)
                .build();
        //build sts client, pass role to be assumed. 

        AssumeRoleRequest roleRequest = new AssumeRoleRequest().withRoleArn(targetRoleArn)
                .withRoleSessionName(assumedRoleName);

        AssumeRoleResult assumeRoleResult = stsClient.assumeRole(roleRequest);

        Credentials sessionCredentials = assumeRoleResult.getCredentials();

        BasicSessionCredentials basicSessionCredentials = new BasicSessionCredentials(
                sessionCredentials.getAccessKeyId(), sessionCredentials.getSecretAccessKey(),
                sessionCredentials.getSessionToken());

        AWSStaticCredentialsProvider credentialsProvider = new AWSStaticCredentialsProvider(basicSessionCredentials);

        try {

            AmazonS3 s3client = AmazonS3ClientBuilder.standard().withCredentials(credentialsProvider)
                    .withRegion(clientRegion).build(); //create the S3 client with the assumed-role credentials

            System.out.println(Arrays.toString(s3client.listBuckets().toArray())); //print buckets

        } catch (AmazonServiceException ase) {

            ase.printStackTrace();
        } catch (AmazonClientException ace) {

            ace.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }

        return "SUCCESS";
    }

Reference: https://atulquest93.medium.com/aws-lambda-with-cross-account-access-using-assume-role-in-java-d6ad2b32b40b

igorbrites commented 1 month ago

Looking at some Java apps that use IAM roles for service accounts, it looks like Terrakube could just use software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider, which already checks whether the right environment variables are set and picks up the desired credentials.

From the docs, this is the order it checks for credentials:

  1. Java System Properties - aws.accessKeyId and aws.secretAccessKey;
  2. Environment Variables - AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY;
  3. Web Identity Token credentials from system properties or environment variables (this is used by the IAM for Service Accounts);
  4. Credential profiles file at the default location (~/.aws/credentials) shared by all AWS SDKs and the AWS CLI;
  5. Credentials delivered through the Amazon EC2 container service if AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable is set and security manager has permission to access the variable;
  6. Instance profile credentials delivered through the Amazon EC2 metadata service.

I know that, when the devs test the app locally, they set their access/secret keys locally without changing the Java code, and the same app works on Kubernetes.

This way, whoever wants to keep using access/secret keys could set them via AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (or set aws.accessKeyId and aws.secretAccessKey as Java system properties), and people using IAM roles for service accounts (which injects the right env vars into the pods, namely AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE) can add the role ARN to the service account's annotations and profit.
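
To make the lookup order above concrete, here is a simplified model of it in plain Java. It is not the SDK's implementation: `CredentialChainSketch` and `firstMatchingSource` are made-up names, the two boolean flags stand in for checks the real providers perform themselves, and details such as web identity settings supplied through system properties are left out.

```java
import java.util.Map;

// Simplified model of the lookup order listed above; not the SDK's own code.
final class CredentialChainSketch {

    // Returns a label for the first source that would supply credentials.
    static String firstMatchingSource(Map<String, String> systemProperties,
                                      Map<String, String> environment,
                                      boolean profileFileExists,
                                      boolean instanceMetadataAvailable) {
        if (systemProperties.containsKey("aws.accessKeyId")
                && systemProperties.containsKey("aws.secretAccessKey")) {
            return "system-properties";
        }
        if (environment.containsKey("AWS_ACCESS_KEY_ID")
                && environment.containsKey("AWS_SECRET_ACCESS_KEY")) {
            return "environment-variables";
        }
        // These two variables are the ones IAM roles for service accounts injects.
        if (environment.containsKey("AWS_ROLE_ARN")
                && environment.containsKey("AWS_WEB_IDENTITY_TOKEN_FILE")) {
            return "web-identity-token";
        }
        if (profileFileExists) {
            return "profile-file";
        }
        if (environment.containsKey("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")) {
            return "container-credentials";
        }
        if (instanceMetadataAvailable) {
            return "instance-profile";
        }
        return "none";
    }

    public static void main(String[] args) {
        Map<String, String> irsaPod = Map.of(
                "AWS_ROLE_ARN", "arn:aws:iam::123456789012:role/terrakube_role",
                "AWS_WEB_IDENTITY_TOKEN_FILE", "/var/run/secrets/token");
        // prints "web-identity-token"
        System.out.println(firstMatchingSource(Map.of(), irsaPod, false, true));
    }
}
```

One consequence of this order is that static keys left in the environment take precedence over the service account role, so they would need to be unset for the role to be used.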

Here's a snippet from the code I found:

DefaultCredentialsProvider credentialsProvider = DefaultCredentialsProvider.create();
Region region = Region.US_EAST_1;
S3Presigner presigner = S3Presigner.builder()
   .region(region)
   .credentialsProvider(credentialsProvider)
   .build();

alfespa17 commented 1 month ago

I made one small change in the registry to use the logic above. If someone would like to test the feature, please use the following version. I have no idea whether it will work because I don't have a way to test it.

https://github.com/AzBuilder/terrakube/releases/tag/2.21.1-alpha.1

It will work only for the registry container. You will have to add these environment variables, and in the logs you should see a message like "Using aws role authentication":

AwsEnableRoleAuth=true
AwsRoleArn=XXXXX
AwsRoleSessionName=XXXXX

The code can be found in this branch

https://github.com/AzBuilder/terrakube/tree/awsrole

https://github.com/AzBuilder/terrakube/blob/4775308b594d171fceeb6f8e23862d80ab05a7a1/registry/src/main/java/org/terrakube/registry/plugin/storage/configuration/StorageAutoConfiguration.java#L77

lukasgomez commented 3 weeks ago

Hi! I've been following the issue and I tried the 2.21.1-alpha.1 image in the registry to check whether the solution works. So far, I get the following error log:

Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'moduleWebServiceImpl': Unsatisfied dependency expressed through field 'moduleService': Error creating bean with name 'moduleServiceImpl' defined in file [/workspace/BOOT-INF/classes/org/terrakube/registry/service/module/ModuleServiceImpl.class]: Unsatisfied dependency expressed through constructor parameter 1: Error creating bean with name 'terraformOutput' defined in class path resource [org/terrakube/registry/plugin/storage/configuration/StorageAutoConfiguration.class]: Failed to instantiate [org.terrakube.registry.plugin.storage.StorageService]: Factory method 'terraformOutput' threw exception with message: User: arn:aws:sts::<my-aws-account>:assumed-role/terrakube_role/aws-sdk-java-1718104872783 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::<my-aws-account>:role/terrakube_role (Service: AWSSecurityTokenService; Status Code: 403; Error Code: AccessDenied; Request ID: 681c4140-2561-45c1-8733-73fb4da02e89; Proxy: null)

I can't create a trust relationship in the assumable role because the role arn:aws:sts::<my-aws-account>:assumed-role/terrakube_role/aws-sdk-java-1718104872783 always has a different number suffix. Am I missing something, or is it necessary to modify the logic of the code?
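
For readers puzzling over the ARN in that error: the changing number appears to be part of the session name (the last path segment), not of the role. The sketch below only illustrates that structure by deriving the IAM role ARN from an assumed-role session ARN. `AssumedRoleArn` and `toRoleArn` are hypothetical helper names, and the conversion assumes the role has no IAM path, since session ARNs do not carry the path. Whether referencing the role ARN in the trust policy resolves this particular error is not confirmed in this thread.

```java
// Hypothetical helper: maps an STS assumed-role session ARN to the IAM role ARN.
// Assumes the role has no IAM path, because session ARNs omit the path.
final class AssumedRoleArn {

    static String toRoleArn(String sessionArn) {
        // ARN layout: arn:partition:service:region:account-id:resource
        String[] parts = sessionArn.split(":", 6);
        if (parts.length != 6 || !"arn".equals(parts[0]) || !"sts".equals(parts[2])) {
            throw new IllegalArgumentException("Not an STS ARN: " + sessionArn);
        }
        // Resource layout: assumed-role/<role-name>/<session-name>
        String[] resource = parts[5].split("/");
        if (resource.length != 3 || !"assumed-role".equals(resource[0])) {
            throw new IllegalArgumentException("Not an assumed-role ARN: " + sessionArn);
        }
        return "arn:" + parts[1] + ":iam::" + parts[4] + ":role/" + resource[1];
    }

    public static void main(String[] args) {
        String session =
                "arn:aws:sts::123456789012:assumed-role/terrakube_role/aws-sdk-java-1718104872783";
        // prints "arn:aws:iam::123456789012:role/terrakube_role"
        System.out.println(toRoleArn(session));
    }
}
```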

alfespa17 commented 3 weeks ago

Hello @lukasgomez did you add the three env variables to the container?

AwsEnableRoleAuth=true AwsRoleArn=XXXXX AwsRoleSessionName=XXXXX

Maybe you can check this part of the code, not sure if I missed something

https://github.com/AzBuilder/terrakube/blob/4775308b594d171fceeb6f8e23862d80ab05a7a1/registry/src/main/java/org/terrakube/registry/plugin/storage/configuration/StorageAutoConfiguration.java#L77

lukasgomez commented 3 weeks ago

Yes, I have added the env vars to the container. Based on the message that @igorbrites posted, and after asking some people, the app shouldn't assume any role from code; it should just discover the credentials from the environment. I would be happy to try it if you implement the change.

alfespa17 commented 3 weeks ago

@lukasgomez that code uses SDK version 2, and Terrakube is still using SDK version 1. That is why I added the environment variables "AwsEnableRoleAuth", "AwsRoleArn" and "AwsRoleSessionName" to the container instead of using the approach mentioned in the comment above. I will check whether there is some option to remove that suffix.

The code is not complex, so I am not sure how to fix the "suffix" that you mention in the role name:

if (awsStorageServiceProperties.isEnableRoleAuthentication()) {
    log.warn("Using aws role authentication");
    AWSSecurityTokenService stsClient = AWSSecurityTokenServiceClientBuilder
            .standard()
            .withRegion(awsStorageServiceProperties.getRegion())
            .build();

    AssumeRoleRequest roleRequest = new AssumeRoleRequest()
            .withRoleArn(awsStorageServiceProperties.getRoleArn())
            .withRoleSessionName(awsStorageServiceProperties.getRoleSessionName());

    AssumeRoleResult assumeRoleResult = stsClient.assumeRole(roleRequest);

    com.amazonaws.services.securitytoken.model.Credentials sessionCredentials = assumeRoleResult.getCredentials();

    BasicSessionCredentials basicSessionCredentials = new BasicSessionCredentials(
            sessionCredentials.getAccessKeyId(), sessionCredentials.getSecretAccessKey(),
            sessionCredentials.getSessionToken());

    awsStaticCredentialsProvider = new AWSStaticCredentialsProvider(basicSessionCredentials);

} else {
    log.warn("Using aws access key and secret key for authentication");
    AWSCredentials credentials = new BasicAWSCredentials(
            awsStorageServiceProperties.getAccessKey(),
            awsStorageServiceProperties.getSecretKey()
    );
    awsStaticCredentialsProvider = new AWSStaticCredentialsProvider(credentials);
}

lukasgomez commented 3 weeks ago

I'm not sure if this will work, but maybe this workaround could solve the problem for now, until you can start using SDK 2:

String roleSessionName = "terrakube_session";

AssumeRoleRequest roleRequest = new AssumeRoleRequest()
        .withRoleArn(awsStorageServiceProperties.getRoleArn())
        .withRoleSessionName(roleSessionName);

alfespa17 commented 3 weeks ago

Hello @lukasgomez

The roleSessionName comes from the environment variable AwsRoleSessionName; the value is read in this part of the code when the application starts:

https://github.com/AzBuilder/terrakube/blob/4775308b594d171fceeb6f8e23862d80ab05a7a1/registry/src/main/resources/application.properties#L42

So setting the environment variable "AwsRoleSessionName=terrakube_session" should have the same effect here:
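
As an illustration of reading that variable with a fixed fallback, here is a small stand-alone sketch. `RoleSessionNameSketch` and `resolve` are hypothetical names, and the default `terrakube_session` is simply the value suggested earlier in this thread, not a Terrakube default. The validation reflects the STS constraint on `RoleSessionName` as I understand it (2 to 64 characters drawn from letters, digits, and `_+=,.@-`).

```java
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: resolve the STS session name from the environment.
final class RoleSessionNameSketch {

    // Assumed STS constraint on RoleSessionName: 2-64 chars of [\w+=,.@-].
    private static final Pattern VALID = Pattern.compile("[\\w+=,.@-]{2,64}");

    // Reads AwsRoleSessionName, falling back to a fixed name when it is unset.
    static String resolve(Map<String, String> environment) {
        String name = environment.getOrDefault("AwsRoleSessionName", "terrakube_session");
        if (!VALID.matcher(name).matches()) {
            throw new IllegalArgumentException("Invalid role session name: " + name);
        }
        return name;
    }

    public static void main(String[] args) {
        // Uses the real process environment; prints the fallback when the variable is unset.
        System.out.println(resolve(System.getenv()));
    }
}
```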

https://github.com/AzBuilder/terrakube/blob/4775308b594d171fceeb6f8e23862d80ab05a7a1/registry/src/main/java/org/terrakube/registry/plugin/storage/configuration/StorageAutoConfiguration.java#L86

igorbrites commented 3 weeks ago

@alfespa17 how complex is the SDK upgrade? I know that (usually) AWS provides backward compatibility on their SDK, though I'm not a Java developer, so it's a genuine question.

SolomonHD commented 1 week ago

My organization would like this feature as well. Most likely you'll need to add at least one more environment variable, for the endpoint which gets queried for the role's permissions. I believe this is different per AWS region.
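
If the endpoint meant here is the regional STS endpoint, its hostname follows a predictable pattern, so it could probably be derived from the region rather than configured separately. A rough sketch under that assumption follows. `StsEndpointSketch` and `regionalEndpoint` are made-up names, only the standard and China partitions are handled, and the region check is a loose sanity check rather than an official list.

```java
// Hypothetical helper: builds the regional STS endpoint URL from a region name.
// Handles only the standard ("aws") and China ("aws-cn") partitions.
final class StsEndpointSketch {

    static String regionalEndpoint(String region) {
        // Loose shape check, e.g. eu-west-2 or us-gov-west-1; not an official region list.
        if (region == null || !region.matches("[a-z]{2}(-[a-z]+)+-\\d+")) {
            throw new IllegalArgumentException("Unexpected region format: " + region);
        }
        String domain = region.startsWith("cn-") ? "amazonaws.com.cn" : "amazonaws.com";
        return "https://sts." + region + "." + domain;
    }

    public static void main(String[] args) {
        // prints "https://sts.eu-west-2.amazonaws.com"
        System.out.println(regionalEndpoint("eu-west-2"));
    }
}
```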

alfespa17 commented 1 week ago

Not sure how complex it will be to migrate from version 1 to version 2. We would have to change the api, executor and registry, so it will require some time, some dependency updates, and testing too.

Also, if someone would like to help with this feature, all help is welcome. I can do some coding for the change, but I am not really familiar with AWS; I guess someone with more experience could add the "missing code" to make this work.

alfespa17 commented 1 week ago

If you would like to help with this feature you could check this branch https://github.com/AzBuilder/terrakube/tree/awsrole.

To create a new Terrakube image, the steps can be found here:

https://github.com/AzBuilder/terrakube/blob/main/scripts/build/terrakubeBuild.sh

At a high level, you only need Maven and a JDK to build a new image for testing, for example:

# Install Java Dependencies
mvn clean install

SolomonHD commented 1 week ago

@alfespa17 Can you rebase that awsrole branch with main? My setup needs the recent http proxy changes.

alfespa17 commented 1 week ago

Hello @SolomonHD I have updated the branch with the changes from version 2.21.3