pulumi / pulumi-aws

An Amazon Web Services (AWS) Pulumi resource package, providing multi-language access to AWS
Apache License 2.0

DMS ReplicationTaskSettings bug from upstream subtly worse in bridged provider #3867

Closed oeuftete closed 5 months ago

oeuftete commented 5 months ago

What happened?

We updated to the latest AWS provider, and our DMS updates started hitting this:

  aws:dms:ReplicationTask (foo-dms):
    error: aws:dms/replicationTask:ReplicationTask resource 'foo-dms' has a problem: Invalid value. The parameter Logging.CloudWatchLogGroup is read-only and cannot be set.. Examine values at 'foo-dms.replicationTaskSettings'.

There is an upstream bug: https://github.com/hashicorp/terraform-provider-aws/issues/36997

I understand that a bug in the upstream provider is also a bug here, and that's unavoidable. However, and this is why I'm opening this issue, the bug is worse here: the Terraform issue appears to affect only users who explicitly set these values to null, exacerbated by AWS's docs possibly showing them as set.

In the Pulumi provider, this error occurs without the values being explicitly set. AFAICT, there is no workaround besides downgrading to a version without the bug, such as v6.31.1 or earlier.

If the upstream decides not to "fix" this, i.e. they say the solution is "don't set these attributes", then this would need some sort of fix on this side of the provider?
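If "don't set these attributes" does end up being the answer, one possible user-side mitigation is to strip the read-only logging fields from the settings JSON before handing it to the resource. A minimal sketch in TypeScript; the helper name is made up, and treating both `CloudWatchLogGroup` and `CloudWatchLogStream` as read-only is an assumption based on the error message and the upstream issue:

```typescript
// Strip read-only logging fields from DMS replication task settings before
// passing them to aws.dms.ReplicationTask. CloudWatchLogGroup is the field
// named in the error; removing CloudWatchLogStream as well is an assumption.
function sanitizeTaskSettings(settingsJson: string): string {
  const settings = JSON.parse(settingsJson);
  if (settings.Logging) {
    delete settings.Logging.CloudWatchLogGroup;
    delete settings.Logging.CloudWatchLogStream;
  }
  return JSON.stringify(settings);
}

// Example: settings echoed back by AWS may include the read-only fields.
const raw = JSON.stringify({
  Logging: {
    EnableLogging: true,
    CloudWatchLogGroup: "dms-tasks-replication-instance",
    CloudWatchLogStream: null,
  },
});

console.log(sanitizeTaskSettings(raw));
```

This only helps where the program constructs the settings itself; it would not help if the provider injects the values during diffing.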

Example

A dms.NewReplicationTask with ReplicationTaskSettings that does not include any value for Logging.CloudWatchLogGroup, created with a plugin version prior to v6.32.0.

Output of pulumi about

Details

```text
CLI
Version      3.113.3
Go Version   go1.22.2
Go Compiler  gc

Plugins
KIND      NAME        VERSION
resource  aws         6.32.0
resource  aws-native  0.102.0
resource  cloudflare  5.26.0
resource  command     0.9.2
resource  datadog     4.27.0
resource  github      6.2.0
language  go          unknown
resource  kubernetes  4.11.0
resource  postgresql  3.11.0
resource  random      4.16.1
resource  snowflake   0.52.0
resource  std         1.6.2

Host
OS       darwin
Version  14.4.1
Arch     x86_64

This project is written in go: executable='/Users/ken/.asdf/shims/go' version='go version go1.22.2 darwin/amd64'

Dependencies:
NAME                                                 VERSION
github.com/DATA-DOG/go-sqlmock                       v1.5.2
github.com/DataDog/datadog-api-client-go/v2          v2.25.0
github.com/aws/aws-sdk-go                            v1.51.25
github.com/aws/aws-sdk-go-v2/config                  v1.27.11
github.com/aws/aws-sdk-go-v2/service/secretsmanager  v1.28.6
github.com/go-faster/errors                          v0.7.1
github.com/go-faster/jx                              v1.1.0
github.com/lib/pq                                    v1.10.9
github.com/ogen-go/ogen                              v1.1.0
github.com/pulumi/pulumi-aws-native/sdk              v0.102.0
github.com/pulumi/pulumi-aws/sdk/v6                  v6.32.0
github.com/pulumi/pulumi-cloudflare/sdk/v5           v5.26.0
github.com/pulumi/pulumi-command/sdk                 v0.9.2
github.com/pulumi/pulumi-datadog/sdk/v4              v4.27.0
github.com/pulumi/pulumi-github/sdk/v6               v6.2.0
github.com/pulumi/pulumi-kubernetes/sdk/v4           v4.11.0
github.com/pulumi/pulumi-postgresql/sdk/v3           v3.11.0
github.com/pulumi/pulumi-random/sdk/v4               v4.16.1
github.com/pulumi/pulumi-snowflake/sdk               v0.52.0
github.com/pulumi/pulumi-std/sdk                     v1.6.2
github.com/pulumi/pulumi/sdk/v3                      v3.113.3
github.com/r3labs/diff/v3                            v3.0.1
github.com/rs/zerolog                                v1.32.0
github.com/stretchr/testify                          v1.9.0
go.opentelemetry.io/otel                             v1.25.0
go.opentelemetry.io/otel/metric                      v1.25.0
go.opentelemetry.io/otel/trace                       v1.25.0
go.uber.org/multierr                                 v1.11.0
golang.org/x/exp                                     v0.0.0-20240416160154-fe59bbe5cc7f
gopkg.in/yaml.v3                                     v3.0.1
gorm.io/driver/postgres                              v1.5.7
gorm.io/gorm                                         v1.25.9
```

Additional context

There may be an opportunity here to improve the bridging process?

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

corymhall commented 5 months ago

@oeuftete thanks for creating this ticket! I'll take a look today and see what we can do. If we can't come up with a workaround then we may have to patch upstream to fix it.

corymhall commented 5 months ago

@oeuftete do you happen to have an example program that we can use to reproduce?

corymhall commented 5 months ago

@oeuftete I have tried to reproduce this issue with the application below, and I do not get any error message.

```typescript
import * as aws from "@pulumi/aws";
import { ServicePrincipal } from '@pulumi/aws/iam/documents';

const dmsPrincipal: ServicePrincipal = {
  Service: 'dms.amazonaws.com'
};

const vpc = new aws.ec2.Vpc("Vpc", {
  cidrBlock: "10.0.0.0/16",
});

const subnet1 = new aws.ec2.Subnet('Subnet1', {
  vpcId: vpc.id,
  cidrBlock: '10.0.160.0/20',
  availabilityZone: 'us-east-2b',
})
const subnet2 = new aws.ec2.Subnet('Subnet2', {
  vpcId: vpc.id,
  cidrBlock: '10.0.96.0/20',
  availabilityZone: 'us-east-2a',
})

const role = new aws.iam.Role('dms-vpc-role', {
  name: 'dms-vpc-role',
  assumeRolePolicy: aws.iam.assumeRolePolicyForPrincipal(dmsPrincipal),
  managedPolicyArns: ['arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole'],
});
const subnetGroup = new aws.dms.ReplicationSubnetGroup('subnet-group', {
  subnetIds: [subnet1.id, subnet2.id],
  replicationSubnetGroupId: 'subnet-group',
  replicationSubnetGroupDescription: 'replication group',
});

const instance = new aws.dms.ReplicationInstance('instance', {
  replicationInstanceId: 'replication-instance',
  replicationInstanceClass: 'dms.t2.micro',
  replicationSubnetGroupId: subnetGroup.id,
  allocatedStorage: 5,
  autoMinorVersionUpgrade: true,
}, { dependsOn: [role]});

const source = new aws.dms.Endpoint('source', {
  endpointId: 'source',
  engineName: 'aurora',
  password: 'test',
  username: 'chall',
  serverName: 'test',
  sslMode: 'none',
  port: 3306,
  endpointType: 'source',
});
const target = new aws.dms.Endpoint('target', {
  endpointId: 'target',
  engineName: 'aurora',
  username: 'chall',
  serverName: 'test',
  sslMode: 'none',
  port: 3306,
  password: 'test',
  endpointType: 'target',
});

const test = new aws.dms.ReplicationTask("test", {
  replicationTaskId: 'replication-task',
  replicationInstanceArn: instance.replicationInstanceArn,
  migrationType: "full-load",
  sourceEndpointArn: source.endpointArn,
  replicationTaskSettings: JSON.stringify(
    {
      "Logging": {
        "EnableLogging": true,
        "LogComponents": [
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "DATA_STRUCTURE"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "COMMUNICATION"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "IO"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "COMMON"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "FILE_FACTORY"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "FILE_TRANSFER"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "REST_SERVER"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "ADDONS"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "TARGET_LOAD"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "TARGET_APPLY"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "SOURCE_UNLOAD"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "SOURCE_CAPTURE"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "TRANSFORMATION"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "SORTER"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "TASK_MANAGER"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "TABLES_MANAGER"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "METADATA_MANAGER"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "PERFORMANCE"
          },
          {
            "Severity": "LOGGER_SEVERITY_DEFAULT",
            "Id": "VALIDATOR_EXT"
          }
        ],
      },
      "StreamBufferSettings": {
        "StreamBufferCount": 3,
        "CtrlStreamBufferSizeInMB": 5,
        "StreamBufferSizeInMB": 8
      },
      "ErrorBehavior": {
        "FailOnNoTablesCaptured": true,
        "ApplyErrorUpdatePolicy": "LOG_ERROR",
        "FailOnTransactionConsistencyBreached": false,
        "RecoverableErrorThrottlingMax": 1800,
        "DataErrorEscalationPolicy": "SUSPEND_TABLE",
        "ApplyErrorEscalationCount": 0,
        "RecoverableErrorStopRetryAfterThrottlingMax": true,
        "RecoverableErrorThrottling": true,
        "ApplyErrorFailOnTruncationDdl": false,
        "DataTruncationErrorPolicy": "LOG_ERROR",
        "ApplyErrorInsertPolicy": "LOG_ERROR",
        "EventErrorPolicy": "IGNORE",
        "ApplyErrorEscalationPolicy": "LOG_ERROR",
        "RecoverableErrorCount": -1,
        "DataErrorEscalationCount": 0,
        "TableErrorEscalationPolicy": "STOP_TASK",
        "RecoverableErrorInterval": 5,
        "ApplyErrorDeletePolicy": "IGNORE_RECORD",
        "TableErrorEscalationCount": 0,
        "FullLoadIgnoreConflicts": true,
        "DataErrorPolicy": "LOG_ERROR",
        "TableErrorPolicy": "SUSPEND_TABLE"
      },
      "TTSettings": null,
      "FullLoadSettings": {
        "CommitRate": 10000,
        "StopTaskCachedChangesApplied": false,
        "StopTaskCachedChangesNotApplied": false,
        "MaxFullLoadSubTasks": 8,
        "TransactionConsistencyTimeout": 600,
        "CreatePkAfterFullLoad": false,
        "TargetTablePrepMode": "DROP_AND_CREATE"
      },
      "TargetMetadata": {
        "ParallelApplyBufferSize": 0,
        "ParallelApplyQueuesPerThread": 0,
        "ParallelApplyThreads": 0,
        "TargetSchema": "",
        "InlineLobMaxSize": 0,
        "ParallelLoadQueuesPerThread": 0,
        "SupportLobs": true,
        "LobChunkSize": 64,
        "TaskRecoveryTableEnabled": false,
        "ParallelLoadThreads": 0,
        "LobMaxSize": 32,
        "BatchApplyEnabled": false,
        "FullLobMode": false,
        "LimitedSizeLobMode": true,
        "LoadMaxFileSize": 0,
        "ParallelLoadBufferSize": 0
      },
      "BeforeImageSettings": null,
      "ControlTablesSettings": {
        "historyTimeslotInMinutes": 5,
        "HistoryTimeslotInMinutes": 5,
        "StatusTableEnabled": false,
        "SuspendedTablesTableEnabled": false,
        "HistoryTableEnabled": false,
        "ControlSchema": "",
        "FullLoadExceptionTableEnabled": false
      },
      "LoopbackPreventionSettings": null,
      "CharacterSetSettings": null,
      "FailTaskWhenCleanTaskResourceFailed": false,
      "ChangeProcessingTuning": {
        "StatementCacheSize": 50,
        "CommitTimeout": 1,
        "BatchApplyPreserveTransaction": true,
        "BatchApplyTimeoutMin": 1,
        "BatchSplitSize": 0,
        "BatchApplyTimeoutMax": 30,
        "MinTransactionSize": 1000,
        "MemoryKeepTime": 60,
        "BatchApplyMemoryLimit": 500,
        "MemoryLimitTotal": 1024
      },
      "ChangeProcessingDdlHandlingPolicy": {
        "HandleSourceTableDropped": true,
        "HandleSourceTableTruncated": true,
        "HandleSourceTableAltered": true
      },
      "PostProcessingRules": null
    }
  ),
  tableMappings: "{\"rules\":[{\"rule-type\":\"selection\",\"rule-id\":\"1\",\"rule-name\":\"1\",\"object-locator\":{\"schema-name\":\"%\",\"table-name\":\"%\"},\"rule-action\":\"include\"}]}",
  targetEndpointArn: target.endpointArn,
});
```

oeuftete commented 5 months ago

@corymhall Thanks for this, and sorry about not providing a clean reproduction case. I'll try to extract the relevant part from our application to give a better demo. We are using the Go SDK, by the way; I'm not sure whether that might be the crucial difference.

oeuftete commented 5 months ago

@corymhall Thanks for your investigation, and apologies again for not providing a clear reproduction case. In fact, I'm now struggling to reproduce this cleanly myself. I did see it again in an environment using the Automation API with Go: my go.mod had the plugin at v6.31.1 and I was explicitly installing the plugin as v6.31.1, which was very strange, since this check only exists in v6.32.0+. When I cleaned my Go package cache, which did have v6.32.0 downloaded, the problem went away.

When I start cleanly with any of the involved versions (v6.31.1, v6.32.0, v6.33.0), I don't see the problem. I tried several combinations of upgrading and downgrading to recreate the earlier broken environment, without success.

I've since updated our Automation API project to explicitly remove any plugins from the workspace that we don't explicitly request by version; that may protect development environments from this going forward, but I still don't fully understand how it could happen.

In any case, this does not seem to be a Pulumi problem after all, unless Pulumi is doing something subtly wrong when multiple package versions are available in an environment. Hopefully this comment helps anyone else with a similarly polluted local environment.
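The pruning described above boils down to a set difference between installed and explicitly requested plugins. A hedged sketch of just that decision logic (the `PluginRef` shape and `pluginsToRemove` helper are illustrative, not part of the Pulumi SDK; in a real Automation API program the installed list would come from the workspace's `listPlugins()` and removal would go through `removePlugin()`):

```typescript
// Decide which installed plugin versions to remove because they were not
// explicitly requested. Illustrative types; not the Pulumi SDK's own.
interface PluginRef {
  name: string;
  version: string;
}

function pluginsToRemove(installed: PluginRef[], requested: PluginRef[]): PluginRef[] {
  const wanted = new Set(requested.map((p) => `${p.name}@${p.version}`));
  return installed.filter((p) => !wanted.has(`${p.name}@${p.version}`));
}

// Example mirroring the report: v6.32.0 lingers in the cache alongside the
// explicitly requested v6.31.1, so it gets flagged for removal.
const installed = [
  { name: "aws", version: "6.31.1" },
  { name: "aws", version: "6.32.0" },
];
const requested = [{ name: "aws", version: "6.31.1" }];
console.log(pluginsToRemove(installed, requested));
```

This keeps the workspace's plugin set exactly what go.mod asks for, instead of whatever accumulated in the local cache.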

mjeffryes commented 5 months ago

Thanks for following up @oeuftete. I will close this for now, since we're not sure there's a bug here yet, but if you do find a way to reproduce this with some regularity, feel free to reopen!