aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.54k stars 3.86k forks source link

(elasticsearch) instance type and ebs validation doesn't distinguish data and master node during domain creation. #11898

Closed AlvinMengCao closed 2 years ago

AlvinMengCao commented 3 years ago

When trying to create Elasticsearch domain with master node type c5.2xlarge.elasticsearch, data node type i3.2xlarge.elasticsearch, with ebs disabled, cdk throws exception says

throw new Error('EBS volumes are required when using instance types other than r3 or i3.');

If I enable ebs, it throws exception:

throw new Error('I3 instance types do not support EBS storage volumes.');

The version I am using is 1.76.

From API reference, the ebs suppose to be only attached to master node in my case, however the validation couldn't distinguish data node and master node. With my instance config, it thinks c5 should have ebs, but i3 shouldn't.

c5 is recommended for master node, and i3 for data node. With this bug, there is no way to use CDK to create Elasticsearch domain with master nodes.

iliapolo commented 3 years ago

@AlvinMengCao Its true the code doesn't distinguish between master and data nodes, but I can't seem to find a distinction in the AWS docs as well.

Can you share the API reference you mentioned?

Also, I disabled those validations locally and tested the deployment. When I configure an i3 instance with EBS, I get the following error:

    new elastic.Domain(this, 'Domain', {
      version: elastic.ElasticsearchVersion.V7_7,
      capacity: {
        masterNodes: 3,
        masterNodeInstanceType: 'c5.2xlarge.elasticsearch',
        dataNodeInstanceType: 'i3.2xlarge.elasticsearch'
      }
    });
Issue11898: deploying...
Issue11898: creating CloudFormation changeset...
5:05:11 PM | CREATE_FAILED        | AWS::Elasticsearch::Domain | Domain66AC69E0
Instance type i3.2xlarge.elasticsearch does not support EBS storage (Service: AWSElasticsearch; Status Code: 400; Error Code: ValidationException; Request ID: 2ac4b900-b51b-4614-a0c1-8c69eb882564; Proxy: null)

When I try to use c5 without EBS, I get the following error:

    new elastic.Domain(this, 'Domain', {
      version: elastic.ElasticsearchVersion.V7_7,
      capacity: {
        masterNodes: 3,
        masterNodeInstanceType: 'c5.2xlarge.elasticsearch',
        dataNodeInstanceType: 'c5.xlarge.elasticsearch'
      },
      ebs: {
        enabled: false,
      }
    });
Issue11898: deploying...
Issue11898: creating CloudFormation changeset...
6:10:18 PM | CREATE_FAILED        | AWS::Elasticsearch::Domain | Domain66AC69E0
EBS storage must be selected for c5.xlarge.elasticsearch (Service: AWSElasticsearch; Status Code: 400; Error Code: ValidationException; Request ID: 30a31c0e-6a3b-45b3-8683-4b714b7d4ec0; Proxy: null)

So it seems that our current validations are behaving as expected.

Can you also share the exact code that configures your domain?

Thanks

AlvinMengCao commented 3 years ago

Hi, @iliapolo

Before using CDK, I used CLI, with the following config:

{
  "AccessPolicies": "",
  "AdvancedOptions":{
    "rest.action.multi.allow_explicit_index":"true"
  },
  "DomainName":"foo-bar-dummy-name-hello-world",
  "DomainEndpointOptions": {
    "EnforceHTTPS" : true,
    "TLSSecurityPolicy" : "Policy-Min-TLS-1-2-2019-07"
  },
  "ElasticsearchVersion": "7.8",
  "ElasticsearchClusterConfig":{
    "DedicatedMasterEnabled":true,
    "InstanceCount":3,
    "ZoneAwarenessEnabled":false,
    "InstanceType":"i3.large.elasticsearch",
    "DedicatedMasterType":"c5.large.elasticsearch",
    "DedicatedMasterCount":3
  },
  "NodeToNodeEncryptionOptions": {
    "Enabled": true
  },
  "SnapshotOptions":{
    "AutomatedSnapshotStartHour":0
  }
}

From the output you pasted, it seems like the same validation exist on CFN side as well. If this is the case, then there is nothing we can do from CDK.

iliapolo commented 3 years ago

@AlvinMengCao Im not sure its CloudFormation, since the validation might be coming from the service api itself. Also when creating the domain from the aws console, the same restrictions apply.

Are you able to create a domain with this configuration from either the CLI or the console now?

AlvinMengCao commented 3 years ago

@iliapolo Yes, just tried with the same config, and here is the output.

{
    "DomainStatus": {
        "DomainId": "????????????/test-domain",
        "DomainName": "test-domain",
        "ARN": "arn:aws:es:us-east-2:"????????????/:domain/test-domain",
        "Created": true,
        "Deleted": false,
        "Processing": true,
        "UpgradeProcessing": false,
        "ElasticsearchVersion": "7.8",
        "ElasticsearchClusterConfig": {
            "InstanceType": "i3.2xlarge.elasticsearch",
            "InstanceCount": 15,
            "DedicatedMasterEnabled": true,
            "ZoneAwarenessEnabled": false,
            "DedicatedMasterType": "c5.2xlarge.elasticsearch",
            "DedicatedMasterCount": 3,
            "WarmEnabled": false
        },
        "EBSOptions": {
            "EBSEnabled": false
        },
        "AccessPolicies": "",
        "SnapshotOptions": {
            "AutomatedSnapshotStartHour": 0
        },
        "CognitoOptions": {
            "Enabled": false
        },
        "EncryptionAtRestOptions": {
            "Enabled": false
        },
        "NodeToNodeEncryptionOptions": {
            "Enabled": true
        },
        "AdvancedOptions": {
            "rest.action.multi.allow_explicit_index": "true"
        },
        "ServiceSoftwareOptions": {
            "CurrentVersion": "",
            "NewVersion": "",
            "UpdateAvailable": false,
            "Cancellable": false,
            "UpdateStatus": "COMPLETED",
            "Description": "There is no software update available for this domain.",
            "AutomatedUpdateDate": "1969-12-31T19:00:00-05:00",
            "OptionalDeployment": true
        },
        "DomainEndpointOptions": {
            "EnforceHTTPS": true,
            "TLSSecurityPolicy": "Policy-Min-TLS-1-2-2019-07"
        },
        "AdvancedSecurityOptions": {
            "Enabled": false,
            "InternalUserDatabaseEnabled": false
        }
    }
}
iliapolo commented 3 years ago

@AlvinMengCao Thanks.

You're right, seems like the console does make this distinction and your configuration is valid:

Screen Shot 2020-12-15 at 6 53 27 PM

We will look into it, might indeed be a CloudFormation only validation.

Thanks for reporting!

smashedtoatoms commented 3 years ago

Anyone have workaround ideas for this? I'd really like to be able to use i3.

AlvinMengCao commented 3 years ago

Our workaround is to take the ES creation out of CDK. ES created by CLI doesn't have this issue.

If your ES doesn't need any vpc configuration, then you can probably also explore the CFN customized resource. I didn't use it before, but I heard a customized resource allow you to write a script to tell CFN how to create the resource. Upon resource creating, CFN just execute the script.

DoronGanel commented 3 years ago

Any expectation when a fix will be released?

iliapolo commented 3 years ago

Hi. This isn't currently prioritized for our short-term plans.

Actually, since the validation happens at construction time, you can bypass it and workaround the issue like so:

const domain = new es.Domain(this, 'Domain', {
  version: es.ElasticsearchVersion.V7_9,
  capacity: {
    masterNodes: 3,
    masterNodeInstanceType: 'c5.large.elasticsearch',
    // dataNodeInstanceType: 'i3.2xlarge.elasticsearch' // comment it out to satisfy the constructor
  }
});

// bring it back after the validation passed.
const cfnDomain = domain.node.defaultChild as es.CfnDomain;
cfnDomain.elasticsearchClusterConfig = {
  ...cfnDomain.elasticsearchClusterConfig,
  instanceType: 'i3.2xlarge.elasticsearch'
}
jamiepeloquin commented 3 years ago

If there is a way to +1 this issue, I would like to add mine. We are using i3 for data nodes, and m5 for masters. The current ES Domain construct does not allow for this mix – and unfortunately it is one we already have in production. This was originally created without issue using the cfnDomain construct, so I am not at all sure how this construct missed that you can define data and master nodes separately. @iliapolo Thank you for your workaround, but this bug has now been around for several months, is there any update?

jamiepeloquin commented 3 years ago

Follow-up. After trying @iliapolo ’s suggestion (code below), I am continuing to get the following error from CloudFormation:

Instance type i3.large.elasticsearch does not support EBS storage (Service: AWSElasticsearch; Status Code: 400; Error Code: ValidationException; Request ID: 0dad9072-8baa-4390-a519-0c3233656a56; Proxy: null)

const esDomain = new Domain(this, "ElasticsearchCluster", {
    advancedOptions: {rest.action.multi.allow_explicit_index: "true"},
    capacity: {
        dataNodes: 4,
        masterNodeInstanceType: "m5.xlarge.elasticsearch",
        masterNodes: 3
    },
    ebs: {enabled: true},
    enableVersionUpgrade: true,
    encryptionAtRest: {enabled: true},
    enforceHttps: true,
    version: ElasticsearchVersion.V7.7
})
nom3ad commented 2 years ago

Workaround mentioned in https://github.com/aws/aws-cdk/issues/11898#issuecomment-800102733 wasn't working when ebs: { enabled: false, }, is set.

Following is a valid setup but fails to work.

new es.Domain(stack, 'Domain', {
  version: es.ElasticsearchVersion.V7_7,
  ebs: {
    enabled: false,
  },
  capacity: {
    masterNodes: 3,
    masterNodeInstanceType: 'c6g.large.elasticsearch',
    dataNodeInstanceType: 'r6gd.large.elasticsearch'
  }
});

So I ended up using this workaround.

// TODO(mj): move to `opensearchservice` module
class PatchedElasticSearchDomain extends elasticsearch.Domain {
    constructor(scope: cdk.Construct, id: string, props: elasticsearch.DomainProps) {
        if (props.ebs?.enabled === false && props.capacity?.masterNodeInstanceType && !['r3', 'i3', 'r6gd'].some(t => props.capacity?.masterNodeInstanceType?.startsWith(t))) {
            // XXXX: Due to the bug - https://github.com/aws/aws-cdk/issues/11898, following error would be thrown from constructor.
            //    """Error: EBS volumes are required when using instance types other than r3, i3 or r6gd."""
            // Source: https://github.com/aws/aws-cdk/blob/7b64abf51c52cd2f6f585d7fd9201030fdba8163/packages/%40aws-cdk/aws-elasticsearch/lib/domain.ts#L1472
            // When EBS is not enabled, CDK wrongly expects master nodes to be of an instance type having builtin storage, ie one of r3,i3,r6gd types. But I think such checks only required for data nodes. Need to look up some docs.
            super(scope, id, { ...props, capacity: { ...props.capacity, masterNodeInstanceType: 'i3xxxx.elasticsearch' } }); // just to make validation pass in base class constructor
            cdk.Annotations.of(this).addWarning("ElasticSearch construct was patched to work around bug: https://github.com/aws/aws-cdk/issues/11898");
            // Put back correct master configuration
            Object.assign((this.node.defaultChild as elasticsearch.CfnDomain).elasticsearchClusterConfig, { dedicatedMasterType: props.capacity.masterNodeInstanceType });
            return;
        }
        super(scope, id, props);
    }
}
github-actions[bot] commented 2 years ago

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.