aws-cloudformation / cloudformation-coverage-roadmap

The AWS CloudFormation Public Coverage Roadmap
https://aws.amazon.com/cloudformation/
Creative Commons Attribution Share Alike 4.0 International

Support resources as proxies, and adoption/abdication #99

Open benkehoe opened 5 years ago

benkehoe commented 5 years ago

One of the custom resource patterns we use at iRobot is the concept of a proxy resource. This is a custom resource that allows you to represent an existing resource in a CloudFormation template, like a DynamoDB table, including all of the return values that a native AWS::DynamoDB::Table has, while taking the table name (or ARN) as its only parameter.

These proxy resources enable two things. First, they reduce the number of template parameters, since one ARN can be sufficient to produce the multiple resource attributes that may be needed. More importantly, they let the template express the intended infrastructure graph more completely.
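As a rough illustration of the proxy idea, here is a minimal Python sketch of what the body of such a custom resource's response could look like. The function name `proxy_table_response` and the injected `describe_table` callable are assumptions made for this example (in a real handler it would wrap boto3's DynamoDB `describe_table` call); this is not iRobot's actual implementation, and the fields a full custom resource response also needs (StackId, RequestId, LogicalResourceId) are left out.

```python
from typing import Callable, Dict


def proxy_table_response(table_name: str,
                         describe_table: Callable[[str], Dict]) -> Dict:
    """Build the core of a custom resource response for a proxy table.

    Hypothetical helper: `describe_table` is expected to return a
    DescribeTable-style dict, e.g. from boto3's dynamodb client.
    """
    table = describe_table(table_name)["Table"]
    # Mirror the attributes a native AWS::DynamoDB::Table exposes via Fn::GetAtt.
    data = {"Arn": table["TableArn"]}
    # The stream ARN is only present when streams are enabled on the table.
    if "LatestStreamArn" in table:
        data["StreamArn"] = table["LatestStreamArn"]
    return {
        "Status": "SUCCESS",
        # Like the native resource, Ref resolves to the table name.
        "PhysicalResourceId": table["TableName"],
        "Data": data,
    }


def fake_describe_table(name: str) -> Dict:
    """Stand-in for a real DescribeTable call so the sketch runs offline."""
    arn = f"arn:aws:dynamodb:us-east-1:123456789012:table/{name}"
    return {"Table": {"TableName": name,
                      "TableArn": arn,
                      "LatestStreamArn": f"{arn}/stream/2019-01-01T00:00:00.000"}}


response = proxy_table_response("orders", fake_describe_table)
print(response["Data"]["Arn"])
```

The proxy never writes to the table; it only reads its description, which is what lets one name or ARN stand in for several parameters.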

With proxies, a further useful feature becomes possible: adoption and abdication. Adoption is where the resource represented by a proxy resource transitions to being owned by the CloudFormation stack it is in. Abdication is the opposite, where a resource created by a stack becomes a proxy resource: the stack gives up control over it and allows it to be managed independently (potentially by being adopted into a different stack). Everyone's had times where the way their infrastructure was carved up into stacks is later found to be suboptimal, but once there's data stored in it, there's nothing you can do about it. Adoption/abdication would change that.

benbridts commented 5 years ago

This may be better as two separate issues?

Adoption/abdication makes a lot of sense with proxy resources but is much more broadly usable, e.g. to import things set up outside of CloudFormation, or to convert custom resources to native resources.

benkehoe commented 5 years ago

I guess I wanted to encourage discussion as a single topic: how to better represent and manage resource ownership. I think adoption/abdication probably requires proxies.

benbridts commented 5 years ago

That's interesting, in my head it was the other way around. But I may be misunderstanding the proxy resources or the level of control you would like over the process.

Short example: let's assume I have a stack right now that is deployed with this resource in the template:

ALogicalName:
  Type: AWS::SSM::Parameter
  Properties:
    Type: String
    Value: SomeValue

That creates a parameter with a physical resource ID (in the case of a parameter this is not an ARN, but the idea is the same). Let's say AABBCCDDEE.

Giving up control of that resource can be done by deploying two updates to the template. Once with

ALogicalName:
  Type: AWS::SSM::Parameter
  Properties:
    Type: String
    Value: SomeValue
  DeletionPolicy: Retain

and once with ALogicalName completely removed.

It would be nice if that could be done in one step, but I'm not sure that it's required.
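The two-update sequence above can be sketched in Python as a function that derives both templates from the current one. The name `abdication_templates` and the second resource in the sample template are assumptions for illustration; the sketch only manipulates template dictionaries and does not deploy anything.

```python
import copy
from typing import Dict, Tuple


def abdication_templates(template: Dict, logical_id: str) -> Tuple[Dict, Dict]:
    """Return the two templates to deploy, in order, to release a resource."""
    if logical_id not in template.get("Resources", {}):
        raise KeyError(f"{logical_id} is not a resource in this template")
    # Update 1: same resource, but marked to be retained when removed.
    retain = copy.deepcopy(template)
    retain["Resources"][logical_id]["DeletionPolicy"] = "Retain"
    # Update 2: the resource is dropped from the template entirely.
    removed = copy.deepcopy(retain)
    del removed["Resources"][logical_id]
    return retain, removed


template = {
    "Resources": {
        "ALogicalName": {
            "Type": "AWS::SSM::Parameter",
            "Properties": {"Type": "String", "Value": "SomeValue"},
        },
        # Hypothetical second resource: a template needs at least one
        # resource left after ALogicalName is removed.
        "AnotherResource": {"Type": "AWS::SNS::Topic"},
    }
}

step_one, step_two = abdication_templates(template, "ALogicalName")
print(sorted(step_one["Resources"]), sorted(step_two["Resources"]))
```

Deploying `step_one` and then `step_two` leaves the parameter in place but outside the stack. The sketch does not check whether other resources still reference the removed logical ID.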

At this point AABBCCDDEE is unmanaged. Importing it into CloudFormation is currently not supported, but could take the reverse form.

A deploy with

TheSameOrAnotherLogicalName:
  Type: AWS::SSM::Parameter
  Properties:
    Type: String
    Value: SomeValue
  PhysicalResourceId: AABBCCDDEE

and, because it is now under CloudFormation control again, a deploy with

TheSameOrAnotherLogicalName:
  Type: AWS::SSM::Parameter
  Properties:
    Type: String
    Value: SomeValue

I do admit that there are some edge cases that this may not work with. But as long as CloudFormation only touches the underlying resources if their properties change (and the import does not count as a change), most of those can be worked around.

rafaelsales commented 5 years ago

It seems that this might be the biggest barrier to migrating from other infrastructure-as-code tools to AWS CDK / CloudFormation. Many apps in production can't afford a destroy/create step (especially for data sources like RDS) just to migrate infrastructure tooling within the same cloud service. Is this really an enhancement ticket?

zbintliff commented 5 years ago

It also makes managing incidents really hard. Take the Labor Day weekend outage as an example. We had a single instance with an EBS volume; it was stateful and not part of an ASG. The EBS volume was corrupted during the us-east-1 incident. We had to:

  1. stop instance
  2. detach volume
  3. create new volume from snapshot
  4. attach volume
  5. start instance again.

Trying to do all of those steps via CFN would have been brutal, so the engineer performed the steps using the CLI. But because step 3 was done from the CLI, we have no way to reconcile the stack and the "actual state of the world".

lmunro commented 5 years ago

This will also facilitate adopting a DynamoDB table that was restored from a backup. Currently there's no way to continue managing a restored DynamoDB table in CloudFormation.

benkehoe commented 5 years ago

In @ikben's example of adoption:

TheSameOrAnotherLogicalName:
  Type: AWS::SSM::Parameter
  Properties:
    Type: String
    Value: SomeValue
  PhysicalResourceId: AABBCCDDEE

What properties would you expect to be required? What would happen if they differ from the actual state of the resource?

For me, one version is when you want a proxy, in which no properties would be required, and any provided properties would be checked against the state of the resource, and the stack operation would fail if they differ. So the properties are like assertions. CloudFormation never changes the actual state of the resource.
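The "properties as assertions" behavior described here could look roughly like the following Python sketch. The function name `check_proxy_properties` and the flat property dictionaries are assumptions for illustration; how CloudFormation would actually read the live state of a resource is not specified anywhere in this thread.

```python
from typing import Any, Dict


def check_proxy_properties(declared: Dict[str, Any],
                           actual: Dict[str, Any]) -> None:
    """Fail if any declared property differs from the resource's real state.

    Nothing is ever written to the resource: declared properties are
    treated purely as assertions, and an empty `declared` always passes.
    """
    mismatches = {
        name: (expected, actual.get(name))
        for name, expected in declared.items()
        if actual.get(name) != expected
    }
    if mismatches:
        details = ", ".join(
            f"{name}: declared {expected!r}, actual {found!r}"
            for name, (expected, found) in sorted(mismatches.items())
        )
        raise ValueError(f"proxy assertion failed: {details}")


actual_state = {"Type": "String", "Value": "SomeValue"}

# No properties declared: the proxy is accepted as-is.
check_proxy_properties({}, actual_state)

# A matching property passes; a differing one would fail the stack operation.
check_proxy_properties({"Type": "String"}, actual_state)
try:
    check_proxy_properties({"Value": "OtherValue"}, actual_state)
except ValueError as error:
    print(error)
```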

The other version is adoption, where properties are required as with a normal resource, and during adoption those properties are set on the resource.

How would you expect to differentiate between a proxy and the first step in adoption?

I might expect adoption to not happen in the template itself, but perhaps as a mapping in the stack operation API call. e.g., in UpdateStack there would be an Adoptions parameter where I would have a mapping of logical id of a new resource in the template to the physical resource id I want to adopt for it, like TheSameOrAnotherLogicalName: AABBCCDDEE

For a proxy I would think it would be in the template itself, more like @ikben's example above.

benbridts commented 5 years ago

What properties would you expect to be required? What would happen if they differ from the actual state of the resource?

For me, one version is when you want a proxy, in which no properties would be required, and any provided properties would be checked against the state of the resource [...]

The other version is adoption, where properties are required as with a normal resource, and during adoption those properties are set on the resource.

How would you expect to differentiate between a proxy and the first step in adoption?

I would expect CloudFormation to do nothing during adoption. In my head it's the reverse of changing resources that are managed by CloudFormation manually. If you don't write your template to 100% match the actual state, you should be able to detect this with drift detection and then either change the resource, or remove it from the stack and try the adoption again.

In either case the next time you update the stack, CloudFormation could assume that its template is the source of truth to calculate the needed actions (just like right now).

If drift detection and adoption could be combined, that would be great, but in that case I think that should be as something you can do with every operation (not only for adopting resources).

If you look further into the feature, you might be right that proxies and adoption could end up looking very much alike. But if you consider them as smaller improvements (proxies as a new type of resource, adoption as a stack-level operation as you describe below), they're harder to combine.

I might expect adoption to not happen in the template itself, but perhaps as a mapping in the stack operation API call. e.g., in UpdateStack there would be an Adoptions parameter where I would have a mapping of logical id of a new resource in the template to the physical resource id I want to adopt for it, like TheSameOrAnotherLogicalName: AABBCCDDEE

For a proxy I would think it would be in the template itself, more like @ikben's example above.

Yes, that does make sense. If you do it in the template, you can't reuse the same template with different stacks. Whereas proxy resources are more like read-only shortcuts.

benkehoe commented 5 years ago

Adoption is now supported, called "resource import". It requires two steps to ensure the properties you've set for the imported resource are correct (run drift detection after import).

You can't import a resource into more than one stack; proxy resources would assist with this.
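For reference, a resource import is requested through a change set of type IMPORT, which takes a list mapping logical IDs to existing physical identifiers in the API call rather than in the template, much like the Adoptions mapping suggested earlier in the thread. Below is a minimal Python sketch that only builds the request arguments. The helper name `import_change_set_request`, the stack name, the change set name, and the sample table template are all assumptions for illustration; the boto3 calls in the trailing comments are shown but not executed.

```python
from typing import Dict

# Sample template (an assumption for this sketch). Resources being imported
# need a DeletionPolicy, and the properties should match the real table.
TEMPLATE_BODY = """\
Resources:
  RestoredTable:
    Type: AWS::DynamoDB::Table
    DeletionPolicy: Retain
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
"""


def import_change_set_request(stack_name: str, template_body: str,
                              resources: Dict[str, Dict]) -> Dict:
    """Build keyword arguments for CloudFormation's CreateChangeSet."""
    resources_to_import = [
        {
            "ResourceType": spec["ResourceType"],
            "LogicalResourceId": logical_id,
            "ResourceIdentifier": spec["ResourceIdentifier"],
        }
        for logical_id, spec in resources.items()
    ]
    return {
        "StackName": stack_name,
        "ChangeSetName": "import-existing-resources",  # arbitrary name
        "ChangeSetType": "IMPORT",
        "TemplateBody": template_body,
        "ResourcesToImport": resources_to_import,
    }


request = import_change_set_request(
    "my-stack",  # hypothetical stack name
    TEMPLATE_BODY,
    {"RestoredTable": {
        "ResourceType": "AWS::DynamoDB::Table",
        "ResourceIdentifier": {"TableName": "orders-restored"},
    }},
)
print(request["ResourcesToImport"])

# With credentials configured, the request would be used along these lines:
#   cfn = boto3.client("cloudformation")
#   cfn.create_change_set(**request)
#   cfn.execute_change_set(StackName="my-stack",
#                          ChangeSetName="import-existing-resources")
#   cfn.detect_stack_drift(StackName="my-stack")  # second step: verify properties
```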

shantgup commented 2 years ago

This has the potential to be something really cool!

It could be something like the FROM instruction in a Dockerfile, where you build on top of an existing resource type, using it as a starting point. So you would say something like FROM AWS::DynamoDB::Table and then write out your modifications/additions.