aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

rds: cannot upgrade rds minor engine version with DatabaseInstanceReadReplica #26755

Open rafzei opened 1 year ago

rafzei commented 1 year ago

Describe the bug

I got an error when trying to upgrade the minor version of an RDS Postgres deployment consisting of a primary and a replica instance.

6:27:42 PM | UPDATE_FAILED        | AWS::RDS::DBInstance                        | rds
One or more of the DB Instance's read replicas need to be upgraded: rds-read (Service: Rds, Status Code: 400, Request ID: xxx)

It looks like CDK tries to upgrade only the primary instance (or tries to upgrade it first).

Expected Behavior

Either upgrade the replicas automatically along with the primary, in the right order, or allow an engine version to be specified on aws_rds.DatabaseInstanceReadReplica.

rds = aws_rds.DatabaseInstance(
            ...
            engine=aws_rds.DatabaseInstanceEngine.postgres(version=RDS_VERSION),
            ...
)

rds_read = aws_rds.DatabaseInstanceReadReplica(
            ...
            source_database_instance=rds,
            engine=aws_rds.DatabaseInstanceEngine.postgres(version=RDS_VERSION),
            ...
)

Current Behavior

The error occurs.

Reproduction Steps

Prepare an RDS stack with at least one replica, deploy it, increase the minor version, and deploy again.

RDS_VERSION = aws_rds.PostgresEngineVersion.VER_11_19

rds = aws_rds.DatabaseInstance(
    self,
    "rds",
    credentials=aws_rds.Credentials.from_generated_secret("postgres"),
    engine=aws_rds.DatabaseInstanceEngine.postgres(version=RDS_VERSION),
    instance_type=aws_ec2.InstanceType(instance_type),
    multi_az=True,
    vpc=vpc,
    vpc_subnets=aws_ec2.SubnetSelection(
        subnet_group_name=shared.SUBNETS_GENERAL_ISOLATED,
    ),
    publicly_accessible=False,
    security_groups=[rds_sg],
    storage_encrypted=True,
    allocated_storage=RDS_INITIAL_STORAGE,
    max_allocated_storage=RDS_MAX_STORAGE,
    performance_insight_retention=aws_rds.PerformanceInsightRetention.DEFAULT,
    cloudwatch_logs_exports=["postgresql", "upgrade"],
    backup_retention=Duration.days(RDS_BACKUP_RETENTION),
    preferred_maintenance_window="sat:08:10-sat:10:10",
    preferred_backup_window="03:10-04:10",
)

rds_read = aws_rds.DatabaseInstanceReadReplica(
    self,
    "rds-read",
    source_database_instance=rds,
    instance_type=aws_ec2.InstanceType(replica_instance_type),
    publicly_accessible=False,
    multi_az=False,
    storage_encrypted=True,
    vpc=vpc,
    vpc_subnets=aws_ec2.SubnetSelection(
        subnet_group_name=shared.SUBNETS_GENERAL_ISOLATED,
    ),
    security_groups=[rds_sg],
    performance_insight_retention=aws_rds.PerformanceInsightRetention.DEFAULT,
    max_allocated_storage=RDS_MAX_STORAGE,
    preferred_maintenance_window="sat:08:10-sat:10:10",
)

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.91.0

Framework Version

No response

Node.js Version

18

OS

Ubuntu

Language

Python

Language Version

3.11.4

Other information

No response

pahud commented 1 year ago

According to the docs, the read replicas have to be upgraded before the primary, and CloudFormation has no knowledge of that ordering.

Minor version upgrades

In contrast, minor version upgrades include only changes that are backward-compatible with existing applications. You can initiate a minor version upgrade manually by modifying your DB instance. Or you can enable the Auto minor version upgrade option when creating or modifying a DB instance. Doing so means that your DB instance is automatically upgraded after Amazon RDS tests and approves the new version. If your PostgreSQL DB instance is using read replicas, you must first upgrade all of the read replicas before upgrading the primary instance. If your DB instance is in a Multi-AZ deployment, then the writer and any standby replicas are upgraded simultaneously. Therefore, your DB instance might not be available until the upgrade is complete. For more details, see Automatic minor version upgrades for PostgreSQL. For information about manually performing a minor version upgrade, see Manually upgrading the engine version.

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_UpgradeDBInstance.PostgreSQL.html
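The ordering rule quoted above (replicas first, then the primary) can be sketched in plain Python. This is only an illustration of the rule, not part of CDK or any AWS SDK: the DbInstance record and both helper functions are hypothetical names, and the version check assumes PostgreSQL 10+ numbering, where the first component is the major version.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DbInstance:
    """Hypothetical record for one RDS instance (not a CDK or boto3 type)."""
    identifier: str
    engine_version: str  # e.g. "11.19"
    is_replica: bool


def is_minor_upgrade(current: str, target: str) -> bool:
    """True when target is newer than current within the same major version.

    Assumes PostgreSQL 10+ numbering: "11.19" means major 11, minor 19.
    """
    cur = tuple(int(part) for part in current.split("."))
    tgt = tuple(int(part) for part in target.split("."))
    return cur[0] == tgt[0] and tgt > cur


def plan_minor_upgrade(instances: list[DbInstance], target: str) -> list[str]:
    """Return identifiers in upgrade order: read replicas before primaries."""
    pending = [i for i in instances if is_minor_upgrade(i.engine_version, target)]
    # False sorts before True, so replicas (not is_replica == False) come first.
    pending.sort(key=lambda i: not i.is_replica)
    return [i.identifier for i in pending]


instances = [
    DbInstance("rds", "11.19", is_replica=False),
    DbInstance("rds-read", "11.19", is_replica=True),
]
print(plan_minor_upgrade(instances, "11.20"))  # ['rds-read', 'rds']
```

Each step in the returned list would correspond to a separate deployment, since a single CloudFormation update does not enforce this order.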

You probably have two options here:

  1. Update your CDK code to set the replica's minor version. Deploy to upgrade the replica, then change the primary's minor version and redeploy to upgrade the primary.

  2. Add a dependency such as primary.node.add_dependency(replica) so that, hopefully, the replica upgrades before the primary. Remove the dependency once the upgrade is complete.

Please note I haven't tested any of the options above. Make sure you test them in your testing environment.

pahud commented 1 year ago

OK I ended up using the first option with escape hatches and it works for me.

from aws_cdk import Stack, aws_ec2 as ec2, aws_rds as rds
from constructs import Construct


class DemoStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        RDS_VERSION = rds.PostgresEngineVersion.VER_11_19
        vpc = ec2.Vpc.from_lookup(self, 'Vpc', is_default=True)

        primary = rds.DatabaseInstance(
            self,
            "rds",
            credentials=rds.Credentials.from_generated_secret("postgres"),
            engine=rds.DatabaseInstanceEngine.postgres(version=RDS_VERSION),
            instance_type=ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.SMALL),
            multi_az=True,
            vpc=vpc,
            publicly_accessible=False,
        )

        replica = rds.DatabaseInstanceReadReplica(
            self,
            "rds-read",
            source_database_instance=primary,
            instance_type=ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.SMALL),
            publicly_accessible=False,
            multi_az=False,
            storage_encrypted=True,
            vpc=vpc,
        )

Add the following at the bottom of the stack above. It overrides the engine and engine version of the replica:

# let's upgrade the read replica first.
cfninstance = replica.node.default_child
cfninstance.engine = 'postgres'
cfninstance.engine_version = '11.20'

Check the synth output:

$ npx cdk synth

Make sure you see this in the replica instance:

"Engine": "postgres",
"EngineVersion": "11.20",

# make sure cdk diff only shows the modification on the replica
$ npx cdk diff
# run cdk deploy if everything looks good to you
$ npx cdk deploy
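The manual check of the synth output can also be scripted. The sketch below is a minimal illustration using only the standard library: it reads a CloudFormation template (as a dict, for example loaded from the JSON file that cdk synth writes under cdk.out) and lists the EngineVersion of every AWS::RDS::DBInstance. The function name and the logical IDs in the sample are made up for illustration; real logical IDs carry a generated hash suffix.

```python
import json


def db_engine_versions(template: dict) -> dict:
    """Map each AWS::RDS::DBInstance logical ID to its EngineVersion (or None)."""
    versions = {}
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") == "AWS::RDS::DBInstance":
            versions[logical_id] = resource.get("Properties", {}).get("EngineVersion")
    return versions


# Trimmed stand-in for a synthesized template; the logical IDs are hypothetical.
sample = json.loads("""
{
  "Resources": {
    "rdsPrimary": {
      "Type": "AWS::RDS::DBInstance",
      "Properties": {"Engine": "postgres", "EngineVersion": "11.19"}
    },
    "rdsReadReplica": {
      "Type": "AWS::RDS::DBInstance",
      "Properties": {"Engine": "postgres", "EngineVersion": "11.20"}
    },
    "rdsSecret": {"Type": "AWS::SecretsManager::Secret", "Properties": {}}
  }
}
""")

print(db_engine_versions(sample))
```

At this stage the replica should report 11.20 while the primary still reports 11.19. A replica with no override would show None, because the L2 construct does not set EngineVersion on it.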

Now go to the RDS console, where a new replica will be created. The newly created replica starts at version 11.19 and is then upgraded to 11.20, after which the old 11.19 replica is removed.

Your primary should remain at 11.19 while the new replica is at 11.20.

Now update your primary engine version.

class DemoStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        RDS_VERSION = rds.PostgresEngineVersion.VER_11_20

        vpc = ec2.Vpc.from_lookup(self, 'Vpc', is_default=True)

        primary = rds.DatabaseInstance(
            self,
            "rds",
            credentials=rds.Credentials.from_generated_secret("postgres"),
            engine=rds.DatabaseInstanceEngine.postgres(version=RDS_VERSION),
            instance_type=ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.SMALL),
            multi_az=True,
            vpc=vpc,
            publicly_accessible=False,
        )

        replica = rds.DatabaseInstanceReadReplica(
            self,
            "rds-read",
            source_database_instance=primary,
            instance_type=ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.SMALL),
            publicly_accessible=False,
            multi_az=False,
            storage_encrypted=True,
            vpc=vpc,
        )

Run cdk diff again and make sure only the EngineVersion changes on the primary. Run cdk deploy if it looks good to you.

Resources
[~] AWS::RDS::DBInstance rds rds3515897F 
 └─ [~] EngineVersion
     ├─ [-] 11.19
     └─ [+] 11.20

Your primary should be in Upgrading status.

Your primary and replica should now both be at 11.20.

Last but not least, remove the escape hatch block:

# let's upgrade the read replica first.
cfninstance = replica.node.default_child
cfninstance.engine = 'postgres'
cfninstance.engine_version = '11.20'

Run cdk diff and cdk deploy again. This creates another new replica at 11.20 that replaces the existing one.

You are all set!

Let me know if it works for you. Please note that I believe RDS encourages automatic minor version upgrades for PostgreSQL whenever possible, rather than manual upgrades. The approach above is just for your reference and I only tested it once in my account, so make sure you evaluate it in your testing environment if you really have to upgrade from CDK.

rafzei commented 1 year ago

Thank you @pahud for your time. Yes, changing the node property is a solution that works. The

cfninstance = replica.node.default_child
cfninstance.engine_version = '11.20'

is sufficient to upgrade the replica in place, without recreating it. However, I believe it should be possible to set the replica version in the L2 construct, so I propose converting this issue to a feature request and keeping it open.

pahud commented 1 year ago

@rafzei Do you mean to keep the Engine and EngineVersion properties for the replica? Makes sense to me. I am making it a p2 feature request and welcome community PRs for this.