aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

awsrds: NewDatabaseInstanceReadReplica fails with "DBInstance ... not found" #28998

Open advdv opened 5 months ago

advdv commented 5 months ago

Describe the bug

I'm trying to use the AWS CDK to create a read replica (in Singapore) for a database instance in another region (Frankfurt). Since awsrds.NewDatabaseInstanceReadReplica requires an awsrds.IDatabaseInstance for SourceDatabaseInstance, I'm using DatabaseInstance_FromDatabaseInstanceAttributes to "import" the database instance from the other region. This uses cross-region references to provide the attributes. But when I do this, the deploy fails with "DBInstance clcorefra-postgresinstance95a4e08e-v383fggzzyhn not found" when creating the replica instance.

I've tested that this is possible by doing it manually in the AWS console; there is nothing wrong with the source instance.

Expected Behavior

It should create a replica instance in the destination region (Singapore).

Current Behavior

11:47:00 AM | CREATE_FAILED        | AWS::RDS::DBInstance                          | PostgresReadReplicaDD87954B
Resource handler returned message: "DBInstance clcorefra-postgresinstance95a4e08e-v383fggzzyhn not found. (Service: Rds, Status Code: 404, Request ID: 407c73ab-c968-4079-bf88-47490e6e1ab8)" (RequestToken: d4f19c88-d1ec-1caa-8682-32d317c35894, HandlerErrorCode: NotFound)

 ❌  ClCoreSin failed: Error: The stack named ClCoreSin failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "DBInstance clcorefra-postgresinstance95a4e08e-v383fggzzyhn not found. (Service: Rds, Status Code: 404, Request ID: 407c73ab-c968-4079-bf88-47490e6e1ab8)" (RequestToken: d4f19c88-d1ec-1caa-8682-32d317c35894, HandlerErrorCode: NotFound)
    at FullCloudFormationDeployment.monitorDeployment (/Users/adam/node_modules/aws-cdk/lib/index.js:428:10615)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Object.deployStack2 [as deployStack] (/Users/adam/node_modules/aws-cdk/lib/index.js:431:196745)
    at async /Users/adam/node_modules/aws-cdk/lib/index.js:431:178714

 ❌ Deployment failed: Error: The stack named ClCoreSin failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "DBInstance clcorefra-postgresinstance95a4e08e-v383fggzzyhn not found. (Service: Rds, Status Code: 404, Request ID: 407c73ab-c968-4079-bf88-47490e6e1ab8)" (RequestToken: d4f19c88-d1ec-1caa-8682-32d317c35894, HandlerErrorCode: NotFound)
    at FullCloudFormationDeployment.monitorDeployment (/Users/adam/node_modules/aws-cdk/lib/index.js:428:10615)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Object.deployStack2 [as deployStack] (/Users/adam/node_modules/aws-cdk/lib/index.js:431:196745)
    at async /Users/adam/node_modules/aws-cdk/lib/index.js:431:178714

The stack named ClCoreSin failed to deploy: UPDATE_ROLLBACK_COMPLETE: Resource handler returned message: "DBInstance clcorefra-postgresinstance95a4e08e-v383fggzzyhn not found. (Service: Rds, Status Code: 404, Request ID: 407c73ab-c968-4079-bf88-47490e6e1ab8)" (RequestToken: d4f19c88-d1ec-1caa-8682-32d317c35894, HandlerErrorCode: NotFound)

Reproduction Steps

This project uses some of our internal CDK libraries, which are public: github.com/crewlinker/clgo

/infra/infra.go

// Package main will run the CDK synthesis.
package main

import (
    "github.com/aws/aws-cdk-go/awscdk/v2"
    "github.com/aws/jsii-runtime-go"
    "github.com/crewlinker/clgo/clcdk"
    "github.com/crewlinker/core/infra/infracon"
)

func main() {
    defer jsii.Close()

    app := awscdk.NewApp(nil)

    main := infracon.NewRoot(clcdk.NewRegionalSingletonStack(app, "eu-central-1", "Fra"), nil) // contains source instance
    infracon.NewRoot(clcdk.NewRegionalSingletonStack(app, "ap-southeast-1", "Sin"), main) // should create read replica

    app.Synth(nil)
}

infra/infracon/root.go

package infracon

import (
    "github.com/aws/aws-cdk-go/awscdk/v2/awsec2"
    "github.com/aws/constructs-go/constructs/v10"
    "github.com/aws/jsii-runtime-go"
)

type root struct {
    network             Network
    cluster             Cluster
    instanceA           Instance
    postgresInstance    PostgresInstance
    postgresReadReplica PostgresReadReplica
}

// Root construct's interface.
type Root interface {
    PostgresInstance() PostgresInstance
}

// NewRoot inits the root construct for our regional stack.
func NewRoot(scope constructs.Construct, main Root) Root {
    con := root{}
    con.network = NewNetwork(scope)
    con.cluster = NewCluster(scope, con.network.VPC())
    con.instanceA = NewInstance(scope, con.network.VPC(), con.cluster.Cluster(),
        awsec2.NewInstanceType(jsii.String(`t4g.small`)), "A")

    if main == nil {
        con.postgresInstance = NewPostgresInstance(scope, con.network.VPC())
    } else {
        con.postgresInstance = main.PostgresInstance()
        con.postgresReadReplica = NewPostgresReadReplica(scope, con.network.VPC(),
            con.postgresInstance.DatabaseInstanceFromCrossRegionRefs(scope))
    }

    return con
}

// PostgresInstance implements instance providing.
func (con root) PostgresInstance() PostgresInstance {
    return con.postgresInstance
}

infra/infracon/postgres.go

package infracon

import (
    "github.com/aws/aws-cdk-go/awscdk/v2"
    "github.com/aws/aws-cdk-go/awscdk/v2/awsec2"
    "github.com/aws/aws-cdk-go/awscdk/v2/awslogs"
    "github.com/aws/aws-cdk-go/awscdk/v2/awsrds"
    "github.com/aws/constructs-go/constructs/v10"
    "github.com/aws/jsii-runtime-go"
)

// primary postgres construct.
type postgresInstance struct {
    secret        awsrds.DatabaseSecret
    securityGroup awsec2.ISecurityGroup
    parameters    awsrds.ParameterGroup
    instance      awsrds.DatabaseInstance
}

const (
    // port that postgres will be served on.
    postgresPort = 5432
    // pretty fine-grained monitoring by default.
    monitorIntervalSeconds = 15
    // keep backups for 7 days.
    backupRetentionDays = 7
)

// postgresEngine declares the instance engine.
func postgresEngine() awsrds.IInstanceEngine {
    return awsrds.DatabaseInstanceEngine_Postgres(&awsrds.PostgresInstanceEngineProps{
        Version: awsrds.PostgresEngineVersion_VER_16_1(),
    })
}

// PostgresInstance construct.
type PostgresInstance interface {
    DatabaseInstanceFromCrossRegionRefs(scope constructs.Construct) awsrds.IDatabaseInstance
}

// DatabaseInstanceFromCrossRegionRefs returns instance reference from cross-region references.
func (con postgresInstance) DatabaseInstanceFromCrossRegionRefs(scope constructs.Construct) awsrds.IDatabaseInstance {
    return awsrds.DatabaseInstance_FromDatabaseInstanceAttributes(scope, jsii.String("ImportedPostgresInstance"),
        &awsrds.DatabaseInstanceAttributes{
            InstanceEndpointAddress: con.instance.DbInstanceEndpointAddress(),
            InstanceIdentifier:      con.instance.InstanceIdentifier(),
            InstanceResourceId:      con.instance.InstanceResourceId(),
            Port:                    jsii.Number(postgresPort),
            Engine:                  postgresEngine(),
            SecurityGroups:          con.instance.Connections().SecurityGroups(),
        })
}

// NewPostgresInstance provides the primary instance.
func NewPostgresInstance(scope constructs.Construct, vpc awsec2.IVpc) PostgresInstance {
    scope, con := constructs.NewConstruct(scope, jsii.String("PostgresInstance")),
        postgresInstance{}

    con.secret = awsrds.NewDatabaseSecret(scope, jsii.String("Secret"), &awsrds.DatabaseSecretProps{
        Username: jsii.String("postgres"),
    })

    con.securityGroup = awsec2.NewSecurityGroup(scope, jsii.String("SecurityGroup"), &awsec2.SecurityGroupProps{
        Vpc:              vpc,
        AllowAllOutbound: jsii.Bool(true),
    })

    con.securityGroup.AddIngressRule(
        awsec2.Peer_AnyIpv4(),
        awsec2.Port_Tcp(jsii.Number(postgresPort)),
        jsii.String("allow all inbound access to postgres"), jsii.Bool(false))

    con.parameters = awsrds.NewParameterGroup(scope, jsii.String("ParameterGroup"), &awsrds.ParameterGroupProps{
        Engine: postgresEngine(),
        Parameters: &map[string]*string{
            "rds.force_ssl":           jsii.String("1"),
            "rds.logical_replication": jsii.String("1"),
        },
    })

    con.instance = awsrds.NewDatabaseInstance(scope, jsii.String("Instance"), &awsrds.DatabaseInstanceProps{
        RemovalPolicy:      awscdk.RemovalPolicy_SNAPSHOT,
        DeletionProtection: jsii.Bool(true),

        Engine:              postgresEngine(),
        InstanceType:        awsec2.InstanceType_Of(awsec2.InstanceClass_BURSTABLE4_GRAVITON, awsec2.InstanceSize_MICRO),
        Vpc:                 vpc,
        AllocatedStorage:    jsii.Number(10), // GiB
        MaxAllocatedStorage: jsii.Number(20), // GiB

        // We use a reference to a secret we create ourselves so we can easily look it up in other stacks (by name)
        Credentials: awsrds.Credentials_FromSecret(con.secret, jsii.String("postgres")),

        // Each instance will have performance insight enabled and is publicly accessible
        IamAuthentication:           jsii.Bool(true),
        AutoMinorVersionUpgrade:     jsii.Bool(true),
        VpcSubnets:                  &awsec2.SubnetSelection{SubnetType: awsec2.SubnetType_PUBLIC},
        PubliclyAccessible:          jsii.Bool(true),
        SecurityGroups:              con.securityGroup.Connections().SecurityGroups(),
        EnablePerformanceInsights:   jsii.Bool(true),
        PerformanceInsightRetention: awsrds.PerformanceInsightRetention_DEFAULT,

        // enables enhanced monitoring
        MonitoringInterval: awscdk.Duration_Seconds(jsii.Number(monitorIntervalSeconds)),
        // update to higher-security RSA certificate that doesn't expire in 2024
        CaCertificate: awsrds.CaCertificate_RDS_CA_RDS4096_G1(),

        // We export postgres logs to cloudwatch so we can add alarms if we want to.
        CloudwatchLogsExports:   jsii.Strings("postgresql"),
        CloudwatchLogsRetention: awslogs.RetentionDays_TWO_WEEKS,
        // backups for disaster recovery
        BackupRetention: awscdk.Duration_Days(jsii.Number(backupRetentionDays)),
        // we only allow tls connections since the password will travel over the public internet
        ParameterGroup: con.parameters,
    })

    return con
}

// postgresReadReplica data.
type postgresReadReplica struct {
    securityGroup awsec2.ISecurityGroup
    parameters    awsrds.ParameterGroup
    replica       awsrds.DatabaseInstanceReadReplica
}

// PostgresReadReplica construct.
type PostgresReadReplica interface{}

// NewPostgresReadReplica will setup a read replica instance.
func NewPostgresReadReplica(
    scope constructs.Construct,
    vpc awsec2.IVpc,
    sourceInstance awsrds.IDatabaseInstance,
) PostgresReadReplica {
    scope, con := constructs.NewConstruct(scope, jsii.String("PostgresReadReplica")), postgresReadReplica{}

    con.securityGroup = awsec2.NewSecurityGroup(scope, jsii.String("SecurityGroup"), &awsec2.SecurityGroupProps{
        Vpc:              vpc,
        AllowAllOutbound: jsii.Bool(true),
    })

    con.securityGroup.AddIngressRule(
        awsec2.Peer_AnyIpv4(),
        awsec2.Port_Tcp(jsii.Number(postgresPort)),
        jsii.String("allow all inbound access to postgres"), jsii.Bool(false))

    con.parameters = awsrds.NewParameterGroup(scope, jsii.String("ParameterGroup"), &awsrds.ParameterGroupProps{
        Engine: postgresEngine(),
        Parameters: &map[string]*string{
            "rds.force_ssl":           jsii.String("1"),
            "rds.logical_replication": jsii.String("1"),
        },
    })
    // NOTE: This will fail with "instance not found"
    con.replica = awsrds.NewDatabaseInstanceReadReplica(scope, jsii.String("Replica"),
        &awsrds.DatabaseInstanceReadReplicaProps{
            SourceDatabaseInstance: sourceInstance,
            InstanceType:           awsec2.InstanceType_Of(awsec2.InstanceClass_BURSTABLE4_GRAVITON, awsec2.InstanceSize_MICRO),
            Vpc:                    vpc,

            // Each instance will have performance insight enabled and is publicly accessible
            IamAuthentication:           jsii.Bool(true),
            AutoMinorVersionUpgrade:     jsii.Bool(true),
            VpcSubnets:                  &awsec2.SubnetSelection{SubnetType: awsec2.SubnetType_PUBLIC},
            PubliclyAccessible:          jsii.Bool(true),
            SecurityGroups:              con.securityGroup.Connections().SecurityGroups(),
            EnablePerformanceInsights:   jsii.Bool(true),
            PerformanceInsightRetention: awsrds.PerformanceInsightRetention_DEFAULT,

            // enables enhanced monitoring
            MonitoringInterval: awscdk.Duration_Seconds(jsii.Number(monitorIntervalSeconds)),
            // update to higher-security RSA certificate that doesn't expire in 2024
            CaCertificate: awsrds.CaCertificate_RDS_CA_RDS4096_G1(),

            // We export postgres logs to cloudwatch so we can add alarms if we want to.
            CloudwatchLogsExports:   jsii.Strings("postgresql"),
            CloudwatchLogsRetention: awslogs.RetentionDays_TWO_WEEKS,
            // we only allow tls connections since the password will travel over the public internet
            ParameterGroup: con.parameters,
        })

    return con
}

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.126.0 (build fb74c41)

Framework Version

No response

Node.js Version

Node.js v20.10.0

OS

Apple M1 Sonoma 14.1

Language

Go

Language Version

go version go1.21.6 darwin/arm64

Other information

No response

advdv commented 5 months ago

It seems that a workaround for this is to set the attribute on the underlying Cfn construct, as shown below. For a cross-region read replica this needs to be a fully formed ARN.

    con.replica = awsrds.NewDatabaseInstanceReadReplica(scope, jsii.String("Replica"),
        &awsrds.DatabaseInstanceReadReplicaProps{
            SourceDatabaseInstance: sourceInstance,
            InstanceType: awsec2.InstanceType_Of(
                awsec2.InstanceClass_BURSTABLE4_GRAVITON, awsec2.InstanceSize_MICRO),
            Vpc:                             vpc,
            StorageEncryptionKey:            con.key,
            PerformanceInsightEncryptionKey: con.key,

            // Each instance will have performance insight enabled and is publicly accessible
            IamAuthentication:           jsii.Bool(true),
            AutoMinorVersionUpgrade:     jsii.Bool(true),
            VpcSubnets:                  &awsec2.SubnetSelection{SubnetType: awsec2.SubnetType_PUBLIC},
            PubliclyAccessible:          jsii.Bool(true),
            SecurityGroups:              con.securityGroup.Connections().SecurityGroups(),
            EnablePerformanceInsights:   jsii.Bool(true),
            PerformanceInsightRetention: awsrds.PerformanceInsightRetention_DEFAULT,

            // enables enhanced monitoring
            MonitoringInterval: awscdk.Duration_Seconds(jsii.Number(monitorIntervalSeconds)),
            // update to higher-security RSA certificate that doesn't expire in 2024
            CaCertificate: awsrds.CaCertificate_RDS_CA_RDS4096_G1(),

            // We export postgres logs to cloudwatch so we can add alarms if we want to.
            CloudwatchLogsExports:   jsii.Strings("postgresql"),
            CloudwatchLogsRetention: awslogs.RetentionDays_TWO_WEEKS,
            // we only allow tls connections since the password will travel over the public internet
            ParameterGroup: con.parameters,
        })

    dbi, ok := con.replica.Node().DefaultChild().(awsrds.CfnDBInstance)
    if !ok {
        panic("replica is not awsrds.CfnDBInstance")
    }

    // we need to make sure to pass an arn, and the source region. See:
    //nolint:lll
    // https://stackoverflow.com/questions/46639969/how-can-we-create-cross-region-rds-read-replica-using-aws-cloud-formation-templa
    // and: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-rds-dbinstance.html#cfn-rds-dbinstance-sourcedbinstanceidentifier
    // dbi.SetSourceDbInstanceIdentifier(jsii.String("arn:aws:rds:eu-central-1:860345245734:db:clcorefra-postgresinstance95a4e08e-v383fggzzyhn"))
    // dbi.SetSourceDbInstanceIdentifier(dbi.SourceDbInstanceIdentifier())
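    // sourceStack is not defined in this snippet; it is presumably the awscdk.Stack
    // that contains the source instance (the Frankfurt stack).
    dbi.SetSourceDbInstanceIdentifier(jsii.Sprintf(
        "arn:aws:rds:%s:%s:db:%s",
        *sourceStack.Region(),
        *sourceStack.Account(),
        *sourceInstance.InstanceIdentifier()))

pahud commented 5 months ago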

Can you share your synthesized template for the DBInstance resource?

According to the doc:

If the source DB instance is in a different region than the read replica, specify the source region in SourceRegion, and specify an ARN for a valid DB instance in SourceDBInstanceIdentifier. For more information, see Constructing a Amazon RDS Amazon Resource Name (ARN) in the Amazon RDS User Guide.

I think you have to meet both requirements, and it looks like you might be missing SourceRegion.
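The two values named in the doc passage are related, since an RDS instance ARN already carries the region in its fourth colon-separated field. As a rough sketch only (the helper name `regionFromARN` and the account ID `123456789012` are made up for illustration, and nothing here confirms how CloudFormation itself treats a missing SourceRegion), the region could be read back out of the ARN like this:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// regionFromARN is a hypothetical helper: it extracts the region field
// from an RDS ARN of the form arn:<partition>:rds:<region>:<account>:db:<name>.
func regionFromARN(arn string) (string, error) {
	// Split into at most 6 parts so the resource part ("db:<name>") stays intact.
	parts := strings.SplitN(arn, ":", 6)
	if len(parts) != 6 || parts[0] != "arn" || parts[2] != "rds" || parts[3] == "" {
		return "", errors.New("not an RDS ARN with a region: " + arn)
	}
	return parts[3], nil
}

func main() {
	// Placeholder account ID and instance name, not taken from the issue.
	region, err := regionFromARN("arn:aws:rds:eu-central-1:123456789012:db:mydb")
	if err != nil {
		panic(err)
	}
	fmt.Println(region)
}
```

A value obtained this way could then be used for SourceRegion alongside the ARN used for SourceDBInstanceIdentifier, if both turn out to be needed.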

github-actions[bot] commented 5 months ago

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

advdv commented 5 months ago

In my reply I've illustrated how this is solved. I did NOT have to provide the SourceRegion. I did have to provide a full ARN to the source instance. The ARN includes the region so maybe it does some trickery to fill in the SourceRegion itself.

I cannot provide the full synthesised template for security reasons but I can provide certain attributes maybe. What do you wanna look at?

In short, I think the NewDatabaseInstanceReadReplica function should look at the SourceInstance and determine through its identifier whether it's cross-region or not. If it is, it should set SourceDbInstanceIdentifier to the full ARN; if not, it can use just the identifier.
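The proposed behaviour can be sketched in plain Go, outside of the CDK. This is only an illustration of the decision, not the CDK's implementation: the function name `sourceIdentifier`, its parameters, and the account ID are hypothetical, and it assumes both regions are concrete strings. In a real CDK app a region may be an unresolved token, in which case a plain string comparison like this is not reliable.

```go
package main

import "fmt"

// sourceIdentifier is a hypothetical helper that picks the value for
// SourceDBInstanceIdentifier: the bare identifier for a same-region replica,
// or a full ARN when the source instance lives in another region.
func sourceIdentifier(partition, sourceRegion, sourceAccount, replicaRegion, instanceID string) string {
	if sourceRegion == replicaRegion {
		return instanceID
	}
	return fmt.Sprintf("arn:%s:rds:%s:%s:db:%s", partition, sourceRegion, sourceAccount, instanceID)
}

func main() {
	// Same region: the identifier alone is used.
	fmt.Println(sourceIdentifier("aws", "eu-central-1", "123456789012", "eu-central-1", "mydb"))
	// Cross-region (Frankfurt to Singapore): a full ARN is used.
	fmt.Println(sourceIdentifier("aws", "eu-central-1", "123456789012", "ap-southeast-1", "mydb"))
}
```

Taking the partition as a parameter avoids hard-coding `aws`, which would not hold in other AWS partitions.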