aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

Error when adding s3ImportBucket to RDS Aurora Postgresql #8201

Closed simon-dk closed 4 years ago

simon-dk commented 4 years ago

When adding an s3ImportBucket to a standard (non-serverless) RDS Aurora PostgreSQL cluster, an error occurs.

When the `s3ImportBuckets` line (the last property in the snippet below) is omitted, the cluster deploys as expected.

    const importBucket = new s3.Bucket(this, 'importbucket');
    const cluster = new rds.DatabaseCluster(this, "Database", {
      engine: rds.DatabaseClusterEngine.AURORA_POSTGRESQL,
      masterUser: { username: "a-username-here" },
      instanceProps: {
        instanceType: ec2.InstanceType.of(
          ec2.InstanceClass.T3,
          ec2.InstanceSize.MEDIUM
        ),
        vpc: props?.vpc!,
        vpcSubnets: { subnetType: ec2.SubnetType.ISOLATED },
      },
      defaultDatabaseName: "a-db-name-here",

      parameterGroup: rds.ParameterGroup.fromParameterGroupName(
        this,
        "ParameterGroup",
        "default.aurora-postgresql11"
      ),
      instances: 1,
      s3ImportBuckets: [importBucket], // <- this creates an error
    });

The error log shows an error regarding the feature name:

    The feature-name parameter must be provided with the current operation for the Aurora (PostgreSQL) engine. (Service: AmazonRDS; Status Code: 400; Error Code: InvalidParameterValue; Request ID: XXX)
    new DatabaseCluster (/cdkpath/node_modules/@aws-cdk/aws-rds/lib/cluster.ts:438:21)

This is :bug: Bug Report

nija-at commented 4 years ago

The bug is coming from somewhere in here - https://github.com/aws/aws-cdk/blob/0028778c0f00f2faa8dad25345cd17f311fad5da/packages/%40aws-cdk/aws-rds/lib/cluster.ts#L407-L412

We're only setting the roleArn in the AssociatedRoles property. It's possible that Postgres requires the FeatureName property to be set as well.
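To make the suspected gap concrete, here is a rough sketch of the `AssociatedRoles` entry shape and a check for entries that carry no `FeatureName`. The interface and function names are made up for illustration (they are not CDK types), and the ARN is a placeholder.

```typescript
// Shape of one entry in the AWS::RDS::DBCluster "AssociatedRoles" list.
// FeatureName is optional in the CloudFormation schema.
interface DBClusterRole {
  RoleArn: string;
  FeatureName?: string;
}

// Hypothetical helper: returns the entries that carry no FeatureName,
// which is what the Aurora PostgreSQL error message appears to reject.
function rolesMissingFeatureName(roles: DBClusterRole[]): DBClusterRole[] {
  return roles.filter((role) => !role.FeatureName);
}

// Placeholder ARN, for illustration only. This mirrors what the cluster
// construct seems to emit today: a RoleArn and nothing else.
const emitted: DBClusterRole[] = [
  { RoleArn: "arn:aws:iam::123456789012:role/example-import-role" },
];

console.log(rolesMissingFeatureName(emitted).length);
```

If the theory holds, every entry returned by the check would need a `FeatureName` before RDS accepts the request for this engine.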

simon-dk commented 4 years ago

I think you are right. I found a similar Terraform issue that mentions that featureName must be present for PostgreSQL, although the CloudFormation documentation doesn't mark it as a required parameter: https://github.com/terraform-providers/terraform-provider-aws/issues/9552

jonny-rimek commented 4 years ago

Is there a workaround? I've never tried manipulating the underlying CFN resource before, like here, and I don't get how I can access the variable.

    const cfn = auroraPostgres.node.defaultChild as CfnDBCluster;
    cfn.associatedRoles

As far as I can tell it's inside associatedRoles, but I have no idea how to access it.

Importing CSV files is a central part of the project I'm building, so this is quite a bummer for me. Maybe it will be fixed in the next release or two, since the issue is marked as in progress.
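One thing that might be worth trying is a raw property override on the L1 resource, so that `AssociatedRoles` carries both the role ARN and a feature name. This is only a sketch and has not been verified against this bug. It assumes `s3Import` is the feature name RDS expects for S3 import on Aurora PostgreSQL, that you create the IAM role yourself and leave `s3ImportBuckets` off the cluster, and that the override replaces the whole list. `associateS3ImportRole` and `Overridable` are hypothetical names; `Overridable` only stands in for the `addPropertyOverride` method of `CfnDBCluster` so the snippet compiles without CDK installed.

```typescript
// One entry of the AWS::RDS::DBCluster "AssociatedRoles" list.
interface DBClusterRole {
  RoleArn: string;
  FeatureName: string;
}

// Minimal structural stand-in for the single CfnResource method used below.
interface Overridable {
  addPropertyOverride(propertyPath: string, value: unknown): void;
}

// Hypothetical helper: writes AssociatedRoles with a FeatureName directly
// into the synthesized CloudFormation properties.
function associateS3ImportRole(
  cfnCluster: Overridable,
  roleArn: string
): DBClusterRole[] {
  // "s3Import" is an assumption about the value RDS expects here.
  const roles: DBClusterRole[] = [{ RoleArn: roleArn, FeatureName: "s3Import" }];
  cfnCluster.addPropertyOverride("AssociatedRoles", roles);
  return roles;
}

// Inside a stack it would be used roughly like this:
//   const cfn = auroraPostgres.node.defaultChild as rds.CfnDBCluster;
//   associateS3ImportRole(cfn, role.roleArn);
```

The override goes through the escape hatch rather than the typed `associatedRoles` property, so it does not depend on whether the generated L1 types already expose a feature-name field.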

simon-dk commented 4 years ago

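I ended up adding the role manually. My RDS stack contains the lines below, which create the bucket, the role and the cluster without associating the role. After deployment I go to the RDS console in AWS and add the role to the cluster by hand for the s3Import feature. It's still a bug, though.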

    const importBucket = new s3.Bucket(this, "importBucket", {});

    const role = new iam.Role(this, "Role", {
      assumedBy: new iam.ServicePrincipal("rds.amazonaws.com"), // required
    });

    role.addToPolicy(
      new iam.PolicyStatement({
        effect: iam.Effect.ALLOW,
        resources: [importBucket.bucketArn, `${importBucket.bucketArn}/*`],
        actions: ["s3:GetObject", "s3:ListBucket"],
      })
    );

    /* Database cluster */
    const cluster = new rds.DatabaseCluster(this, "Database", {
      engine: rds.DatabaseClusterEngine.AURORA_POSTGRESQL,

      masterUser: {
        username: "clusteradmin",
      },

      instanceProps: {
        instanceType: ec2.InstanceType.of(
          ec2.InstanceClass.T3,
          ec2.InstanceSize.MEDIUM
        ),
        vpc: props?.vpc!,
        vpcSubnets: {
          subnetType: ec2.SubnetType.ISOLATED,
        },
      },
      defaultDatabaseName: "main",

      parameterGroup: rds.ParameterGroup.fromParameterGroupName(
        this,
        "ParameterGroup",
        "default.aurora-postgresql11"
      ),
      instances: 1,
      removalPolicy: cdk.RemovalPolicy.RETAIN,
    });

jonny-rimek commented 4 years ago

Thanks for sharing your workaround, @Simon-SDK.

I don't understand why you can't assign the role in CDK. Can you elaborate, please?

simon-dk commented 4 years ago

I believe the "import-role" feature name is a special IAM role that RDS Postgres uses to access S3, so although you can create the role, you can't assign it from CDK. And because CDK doesn't "know" that it should set the feature name when you add an import role or import buckets to your cluster, the creation simply fails. So it's just a bug :-)

jonny-rimek commented 4 years ago

@Simon-SDK It works like a charm, thanks for your help. Pretty impressive how fast the import is: I load a ~7 MB CSV with 100k lines into the DB in under a second, on the smallest instance.

Looks like we can expect the fix soon; a PR is open.

simon-dk commented 4 years ago

@jonny-rimek Glad I could help :-) The S3 import is wicked fast, you can import several gigabytes in a minute or so on a medium instance.