aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

feat(DataSync): Deploy DataSync agent on Amazon EC2 L2 construct #28701

Open annguyen36 opened 5 months ago

annguyen36 commented 5 months ago

Describe the feature

It would be nice to have L2 constructs for DataSync, especially for creating a DataSync agent on an EC2 instance.

Use Case

Currently there is only an L1 construct for creating a DataSync agent, and it requires the activation key. When the agent runs on an EC2 instance that is part of the same CDK deployment, we need to make an HTTP request to the instance to get the activation key, then use that key to create the DataSync agent.
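
The HTTP request described above can be sketched as follows. This is a minimal sketch, not a confirmed implementation: it assumes the agent serves its activation key over HTTP on port 80 and accepts the query parameters used in the curl-based activation flow from the DataSync documentation (`gatewayType=SYNC`, `activationRegion`, `no_redirect`, plus `privateLinkEndpoint` and `endpointType=PRIVATE_LINK` when a VPC endpoint is used). The function names `build_activation_url` and `fetch_activation_key` are hypothetical, and the parameters should be checked against the current documentation.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def build_activation_url(agent_ip: str, region: str,
                         private_link_endpoint: Optional[str] = None) -> str:
    """Build the URL the agent is assumed to serve its activation key on."""
    params = {"gatewayType": "SYNC", "activationRegion": region}
    if private_link_endpoint:
        # Assumed extra parameters for agents that use a VPC endpoint.
        params["privateLinkEndpoint"] = private_link_endpoint
        params["endpointType"] = "PRIVATE_LINK"
    # no_redirect is a bare flag, so it is appended after the encoded pairs.
    return f"http://{agent_ip}/?{urlencode(params)}&no_redirect"


def fetch_activation_key(agent_ip: str, region: str,
                         private_link_endpoint: Optional[str] = None,
                         timeout: float = 10.0) -> str:
    """Request the activation key from the agent and return the response body."""
    url = build_activation_url(agent_ip, region, private_link_endpoint)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

The caller needs network access to the agent, so a Lambda function using this would have to run in a subnet and security group that can reach the instance.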

Proposed Solution

The current workaround uses a custom resource to make this possible in a single CDK deployment. Specifically, a Lambda function makes the HTTP call to get the activation key and sends it back to CloudFormation, which then continues creating the DataSync agent resource. It would be nice to have this abstracted in an L2 construct.
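
The custom resource flow described above can be sketched as a Lambda handler. This is a minimal sketch under stated assumptions: the function is registered directly as the custom resource's service token, so it has to PUT its result to the pre-signed `ResponseURL` itself; `get_key` is a hypothetical callable standing in for the HTTP request to the agent; `agentIpAddress` and `ActivationKey` are illustrative property and attribute names.

```python
import json
from urllib.request import Request, urlopen


def build_response(event, status, data=None, reason=""):
    """Build the response document CloudFormation expects from a custom resource."""
    physical_id = event.get("PhysicalResourceId") or f"activation-{event['RequestId']}"
    return {
        "Status": status,              # "SUCCESS" or "FAILED"
        "Reason": reason,              # required by CloudFormation when FAILED
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "NoEcho": True,                # mask the key in CloudFormation output
        "Data": data or {},            # attributes readable through Fn::GetAtt
    }


def send_response(event, body):
    """PUT the response document to the pre-signed URL from the event."""
    payload = json.dumps(body).encode("utf-8")
    # An explicit empty Content-Type stops urllib from adding a default one.
    request = Request(event["ResponseURL"], data=payload, method="PUT",
                      headers={"Content-Type": ""})
    with urlopen(request) as response:
        response.read()


def handle(event, get_key, send=send_response):
    """Fetch the key on Create, then always report a result to CloudFormation."""
    try:
        data = {}
        if event["RequestType"] == "Create":
            agent_ip = event["ResourceProperties"]["agentIpAddress"]
            data = {"ActivationKey": get_key(agent_ip)}
        body = build_response(event, "SUCCESS", data)
    except Exception as error:
        # Report the failure instead of leaving the stack waiting for a timeout.
        body = build_response(event, "FAILED", reason=str(error))
    send(event, body)
    return body
```

In this sketch only Create fetches a key, on the assumption that an activated agent no longer hands one out; an Update that still reads the attribute would need the key stored somewhere, which is left out here.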

CDK version used

2.118

Environment details (OS name and version, etc.)

macOS

annguyen36 commented 5 months ago

Waiting for triage before working on a PR.

michaelciccarelli commented 5 months ago

I understand that this can be addressed by writing a Lambda function, but I think the point is for CDK to support this natively. That would make it much easier and quicker to release multiple agents across several AWS accounts. For example, we want to deploy agents across dev, test, and production to keep data separated, and not having this functionality built into CDK makes the deployments much more difficult. The missing piece is just being able to natively retrieve the agent activation key.

pahud commented 5 months ago

@annguyen36 Before you start the PR draft, can you share a bit about how you would implement the solution? I am not sure it is a good idea to add this support to the aws-ec2 module. Can you share a high-level abstraction and some code samples?

github-actions[bot] commented 5 months ago

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

annguyen36 commented 5 months ago

Hi @pahud, I believe this should fall under the datasync module instead of ec2. Basically, to create the DataSync CfnAgent we need the activation key. When the agent is deployed on an EC2 instance, there are a couple of ways to get the activation key (via the console, by SSHing into the instance, or using the CLI). We can encapsulate this process in a custom resource that makes the HTTP request. So my idea is to have an L2 construct, EC2DataSyncAgent, which contains:

  • the EC2 instance using the DataSync AMI
  • a Lambda function and a custom resource used to make the HTTP request
  • the CfnAgent, created with the activation key

Constructs idea:

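    from aws_cdk import CustomResource, Duration
    from aws_cdk import aws_datasync as datasync
    from aws_cdk import aws_ec2 as ec2
    from aws_cdk import aws_lambda as lambda_

    # Inside a Stack or Construct __init__; vpc and sg are defined elsewhere.

    # EC2 instance running the DataSync agent AMI
    instance = ec2.Instance(self, "Instance",
        vpc=vpc,
        instance_type=ec2.InstanceType("t2.micro"),
        machine_image=ec2.MachineImage.from_ssm_parameter('/aws/service/datasync/ami'),
        security_group=sg,
        key_name="us-west-2",
    )

    # Lambda function that makes the HTTP request for the activation key
    on_event = lambda_.Function(self, "Function",
        runtime=lambda_.Runtime.NODEJS_18_X,
        handler="index.handler",
        code=lambda_.Code.from_asset('lambda'),
        timeout=Duration.seconds(300),
    )

    # Custom resource that invokes the function and exposes the key
    get_activation_key = CustomResource(
        self, "GetActivationKey",
        # service_token=my_provider.service_token,
        service_token=on_event.function_arn,
        properties={
            # this will depend on whether the agent uses a VPC endpoint
            "agentIpAddress": instance.instance_public_ip,
        },
    )

    # Create DataSync agent
    datasync.CfnAgent(self, "DataSyncAgent",
        activation_key=get_activation_key.get_att("ActivationKey").to_string(),
        agent_name="DataSyncAgent",
    )

pahud commented 5 months ago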

I feel we should probably create an Agent L2 construct that auto-generates the activation key under the hood if it is undefined. For example:

new dataSync.Agent(this, 'Agent', {
  activationKey: dataSync.ActivationKey.fromEc2Instance(instance)
});

But this needs some discussion with the core team maintainers.

amouly commented 3 months ago

Hi @pahud, I believe this should fall under the datasync module instead of ec2. Basically, to create the DataSync CfnAgent we need the activation key. When the agent is deployed on an EC2 instance, there are a couple of ways to get the activation key (via the console, by SSHing into the instance, or using the CLI). We can encapsulate this process in a custom resource that makes the HTTP request. So my idea is to have an L2 construct, EC2DataSyncAgent, which contains:

  • the EC2 instance using the DataSync AMI
  • a Lambda function and a custom resource used to make the HTTP request
  • the CfnAgent, created with the activation key

I would like to implement this solution on my end, because I need to set up DataSync for GCS->S3.

Do you have a repository or something with this solution?