aws-solutions / workload-discovery-on-aws

Workload Discovery on AWS is a solution to visualize AWS Cloud workloads. With it you can build, customize, and share architecture diagrams of your workloads based on live data from AWS. The solution maintains an inventory of the AWS resources across your accounts and regions, mapping their relationships and displaying them in the user interface.
https://aws.amazon.com/solutions/implementations/workload-discovery-on-aws/
Apache License 2.0

Gremlin lambda can't DNS resolve the Neptune endpoint #554

Open mmigliari opened 1 month ago

mmigliari commented 1 month ago

Describe the bug: The Gremlin discovery Lambda cannot resolve the Neptune DNS endpoint and fails with a getaddrinfo EAI_AGAIN <endpoint_address> error.

To Reproduce: Launch the stacks as described in the documentation and wait for the ECS scheduled task to invoke the Lambda. The Lambda's CloudWatch logs show a timeout along with the getaddrinfo EAI_AGAIN <endpoint_address> error.
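To confirm that the failure is DNS resolution rather than something Neptune-specific, a small probe run from inside the same VPC can help. The following is a minimal sketch in Python, not the solution's own code (the error string in the logs suggests the Lambda runs on Node.js). The function name `check_dns` and the injectable `resolver` parameter are hypothetical, added only so the probe can be exercised without a network.

```python
import socket


def check_dns(hostname, resolver=socket.getaddrinfo):
    """Try to resolve `hostname` and return a tuple (ok, detail).

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    that the function can be exercised with a stub (an assumption of this
    sketch, not part of the solution).
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror as exc:
        # EAI_AGAIN is a temporary resolution failure, which is what you
        # typically see when the configured DNS server cannot be reached.
        if exc.errno == socket.EAI_AGAIN:
            return False, f"EAI_AGAIN: DNS server unreachable while resolving {hostname}"
        return False, f"resolution failed for {hostname}: {exc}"
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    addresses = sorted({info[4][0] for info in results})
    return True, ", ".join(addresses)
```

Calling `check_dns("<endpoint_address>")` from a function or host attached to the same subnets and security group should return `(False, "EAI_AGAIN: ...")` while outbound DNS is blocked, and the resolved addresses once it is allowed.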

Expected behavior: The Lambda, which runs inside the VPC, should be able to resolve the endpoint using the DNS servers defined in the VPC DHCP option set.

Additional context: The fix below may be necessary in VPC setups with non-standard DNS settings.

Solution: Allow outbound UDP port 53 (DNS resolution) from the Lambda to the VPC CIDR range. This covers DHCP option sets whose DNS servers are hosted inside the VPC.
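As an illustration of the rule described above, here is a minimal Python sketch that builds an egress entry in the shape of the CloudFormation `AWS::EC2::SecurityGroup` egress properties (`IpProtocol`, `FromPort`, `ToPort`, `CidrIp`, `Description`). The helper name `dns_egress_rule` and the example CIDR are hypothetical; how the solution's templates actually define the Gremlin Lambda security group is not shown in this thread.

```python
import ipaddress

DNS_PORT = 53


def dns_egress_rule(cidr, protocol="udp"):
    """Return a security group egress entry allowing DNS traffic to `cidr`.

    Hypothetical helper for illustration. The keys follow the CloudFormation
    security group egress property names.
    """
    # IPv4Network raises ValueError for malformed input or a CIDR with
    # host bits set (for example 10.0.0.1/16).
    network = ipaddress.IPv4Network(cidr)
    return {
        "IpProtocol": protocol,
        "FromPort": DNS_PORT,
        "ToPort": DNS_PORT,
        "CidrIp": str(network),
        "Description": "Allow DNS resolution within the VPC",
    }
```

For example, `dns_egress_rule("10.0.0.0/16")` yields a UDP rule for port 53 scoped to that range. DNS can fall back to TCP for large responses, so `dns_egress_rule(cidr, protocol="tcp")` may also be needed in some environments; this thread only reports UDP.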

svozza commented 1 month ago

Thanks for raising this so we can track it! At the very least we should document this in the troubleshooting guide.

mmigliari commented 1 month ago

I made a PR that adds an outbound rule to the Gremlin Lambda security group, allowing UDP port 53 access to the VPC CIDR range. This assumes that any DNS servers in the VPC DHCP option set are within the VPC CIDR range.

One alternative would be to ask, during the CloudFormation template launch, which DNS servers to use if they are not standard. If any are supplied, then just add outbound UDP port 53 access to those.
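A rough Python sketch of that alternative: if custom DNS servers are supplied, scope the rule to each server as a /32; otherwise fall back to the whole VPC CIDR. The second helper checks the assumption from the PR, namely that the DNS servers sit inside the VPC CIDR range. Both function names and the idea of passing the servers as a list are hypothetical; no such template parameter exists in this thread.

```python
import ipaddress


def dns_egress_cidrs(vpc_cidr, dns_servers=None):
    """Return the CIDRs that need outbound port 53 access.

    With no custom servers, fall back to the whole VPC CIDR.
    With custom servers, return one /32 per server.
    """
    if not dns_servers:
        return [str(ipaddress.IPv4Network(vpc_cidr))]
    # IPv4Address validates each entry and raises ValueError if malformed.
    return [f"{ipaddress.IPv4Address(server)}/32" for server in dns_servers]


def servers_outside_vpc(vpc_cidr, dns_servers):
    """Return the DNS servers that a VPC-CIDR-wide rule would not cover."""
    network = ipaddress.IPv4Network(vpc_cidr)
    return [s for s in dns_servers if ipaddress.IPv4Address(s) not in network]
```

With a VPC CIDR of 10.0.0.0/16 and servers 10.0.0.2 and 192.168.1.10, `servers_outside_vpc` flags 192.168.1.10, which is the case where a rule limited to the VPC CIDR range would still leave resolution blocked.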