aws-solutions / aws-data-lake-solution

A deployable reference implementation that addresses pain points around conceptualizing data lake architectures. It automatically configures the core AWS services needed to tag, search, share, and govern specific subsets of data across a business or with external businesses.
https://aws.amazon.com/solutions/implementations/data-lake-solution/
Apache License 2.0
401 stars, 160 forks

Add support for VPCOptions to Elasticsearch cluster deployment configuration #18

Open dave-malone opened 6 years ago

dave-malone commented 6 years ago

To support private deployments of the data lake solution, allow the Elasticsearch cluster to be configured for deployment inside a VPC. I'm willing to collaborate and contribute on this change request.
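For illustration, here is a minimal sketch of what the request might amount to at the CloudFormation level, written in Python so that a template loaded as a dict can be patched programmatically. `VPCOptions`, `SubnetIds`, and `SecurityGroupIds` are properties of the `AWS::Elasticsearch::Domain` resource type. The function name, the logical ID `ElasticsearchDomain`, the version value, and the subnet and security group IDs are hypothetical and not taken from the solution's templates.

```python
import copy

ES_TYPE = "AWS::Elasticsearch::Domain"


def add_vpc_options(template, subnet_ids, security_group_ids):
    """Return a copy of a CloudFormation template (as a dict) in which every
    Elasticsearch domain resource carries a VPCOptions block."""
    patched = copy.deepcopy(template)  # leave the caller's template untouched
    for resource in patched.get("Resources", {}).values():
        if resource.get("Type") == ES_TYPE:
            props = resource.setdefault("Properties", {})
            props["VPCOptions"] = {
                "SubnetIds": list(subnet_ids),
                "SecurityGroupIds": list(security_group_ids),
            }
    return patched


# Hypothetical template fragment; the real solution templates may differ.
template = {
    "Resources": {
        "ElasticsearchDomain": {
            "Type": ES_TYPE,
            "Properties": {"ElasticsearchVersion": "6.3"},
        }
    }
}

# Placeholder IDs; in practice these would come from template parameters.
patched = add_vpc_options(template, ["subnet-0example"], ["sg-0example"])
```

In a real template the IDs would more likely be exposed as stack parameters so that an existing VPC can be reused.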

hvital commented 6 years ago

Thanks for your feedback!

We’re currently working to publish Active Directory integration and ES authentication via Cognito (https://aws.amazon.com/blogs/database/get-started-with-amazon-elasticsearch-service-use-amazon-cognito-for-kibana-access-control/).

Because VPC support also requires reviewing all other components (e.g., Lambda, ES, DynamoDB ...) and offering the option to create a new VPC or reuse an existing one, I’ll put this item in the solution’s backlog.

If you already have something that is OK (in terms of the solution’s license) to share or include in the repo, please send a PR.

jgc234 commented 5 years ago

If you adjust the CFN templates to put all the Lambda functions (including the helper) and the ES cluster into a VPC, it seems to work fine.
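As a rough sketch of the Lambda side of that adjustment, the Python snippet below attaches a `VpcConfig` block to every `AWS::Lambda::Function` resource in a template dict. `VpcConfig`, `SubnetIds`, and `SecurityGroupIds` are real CloudFormation properties. The function name, the logical IDs, and the placeholder subnet and security group IDs are assumptions made for illustration, not names from the solution's templates.

```python
import copy

LAMBDA_TYPE = "AWS::Lambda::Function"


def put_lambdas_in_vpc(template, subnet_ids, security_group_ids):
    """Return (patched_template, touched_ids): a copy of the template in which
    every Lambda function has a VpcConfig, plus the logical IDs that changed."""
    patched = copy.deepcopy(template)
    touched = []
    for logical_id, resource in patched.get("Resources", {}).items():
        if resource.get("Type") != LAMBDA_TYPE:
            continue  # skip tables, buckets, and other resource types
        resource.setdefault("Properties", {})["VpcConfig"] = {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        }
        touched.append(logical_id)
    return patched, touched


# Hypothetical resources standing in for the solution's functions and tables.
lambda_template = {
    "Resources": {
        "HelperFunction": {"Type": LAMBDA_TYPE, "Properties": {"Handler": "index.handler"}},
        "SearchFunction": {"Type": LAMBDA_TYPE, "Properties": {"Handler": "index.handler"}},
        "SettingsTable": {"Type": "AWS::DynamoDB::Table", "Properties": {}},
    }
}

patched_lambdas, touched = put_lambdas_in_vpc(
    lambda_template, ["subnet-0example"], ["sg-0example"]
)
```

Functions attached to a VPC also need permission to manage network interfaces, for example through the `AWSLambdaVPCAccessExecutionRole` managed policy, so the execution roles in the templates would likely need a matching change.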

knihit commented 4 years ago

Hi @jgc234, unfortunately, even if you put the Lambda functions and ES into a VPC, they would still require a NAT gateway and an internet gateway (IGW) to communicate with S3 and DynamoDB.

@dave-malone I will file this as a feature request and try to plan it for a future release. In the meantime, you are welcome to submit a PR for this feature.