cisagov / cyhy-system

Cyber Hygiene system and overall documentation/issue tracking
Creative Commons Zero v1.0 Universal

DNS lookups fail in Trustworthy Email scans #22

Open jsf9k opened 3 years ago

jsf9k commented 3 years ago

🐛 Summary

Some DNS lookups are failing in Trustworthy Email scans, presumably because we are hitting the 1024 packets per second per network interface throttling limit that AWS applies to DNS queries in VPCs, or another similar limit.

I noticed this in this week's BOD 18-01 scanning run, since the Trustworthy Email report for National Labor Relations Board failed to generate. When I investigated, I saw that this is because the Trustworthy Email reporting process thought there were no active domains for that agency; however, a quick dig MX nlrb.gov shows this not to be the case. Digging further, I can see many lines of this form in the file /var/cyhy/orchestrator/output/archive/latest/results/trustymail.csv:

nlrb.gov,nlrb.gov,False,False,,,,,,,,,,False,,,False,,,,False,,,,,,,,False,False,False,,[MX] In mx_scan at /var/task/trustymail/trustymail.py:95: All nameservers failed to answer the query nlrb.gov. IN MX: Server 169.254.78.1 TCP port 53 answered REFUSED,,2021-07-31T05:44:51.178821Z,2021-07-31T05:44:55.975408Z,4.796587,e044ef93-a14d-401f-981c-a9df21f3b2bb,/aws/lambda/task_trustymail,2021/07/31/[$LATEST]072fd2c0e8244d76ae42df66fcc48b74,2021-07-31T05:44:51.191Z,2021-07-31T05:44:55.971385Z,128,4.780385

Note the error message All nameservers failed to answer the query nlrb.gov. IN MX: Server 169.254.78.1 TCP port 53 answered REFUSED, indicating that the AWS DNS queries are being throttled. I thought DNS was at 169.254.169.253, but it appears that something different is happening with DNS for Lambdas that live in a VPC: see here, here, and especially here. Note that the last link indicates pretty degraded performance for non-cacheable DNS requests in AWS Lambda.
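For anyone triaging these rows, here is a minimal sketch that pulls the interesting pieces (query name, record type, answering server, and response code) out of an error message of the form shown above. The function name parse_refusal and the regular expression are illustrative assumptions based only on the one message quoted in this issue; other failure messages may be worded differently and would simply not match.

```python
import re
from typing import Optional

# Assumption: the pattern mirrors the single error message quoted in this
# issue; it is not taken from trustymail or from any documented format.
REFUSAL_PATTERN = re.compile(
    r"All nameservers failed to answer the query (?P<qname>\S+) "
    r"IN (?P<rdtype>[A-Z0-9]+): "
    r"Server (?P<server>\S+) (?P<protocol>TCP|UDP) "
    r"port (?P<port>\d+) answered (?P<rcode>[A-Z]+)"
)


def parse_refusal(message: str) -> Optional[dict]:
    """Return the parsed fields of a failed-lookup message, or None if the
    message does not look like the one quoted in this issue."""
    match = REFUSAL_PATTERN.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    # The port is the only numeric field; convert it for convenience.
    fields["port"] = int(fields["port"])
    return fields
```

Grouping the parsed results by server and rcode should make it easier to tell whether every failure comes from the same resolver address.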

This is exactly the problem that is discussed in this comment.

To reproduce

Steps to reproduce the behavior:

  1. Run grep -F REFUSED /var/cyhy/orchestrator/output/archive/latest/results/trustymail.csv on the BOD 18-01 reporting instance.
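If a per-domain breakdown is more useful than raw grep output, the same check can be sketched in Python. This assumes the first CSV column holds the scanned domain, as it appears to in the row quoted above; the function names refused_domains and refused_domains_in_file are hypothetical and not part of any cyhy tooling.

```python
import csv
from collections import Counter
from typing import Iterable


def refused_domains(lines: Iterable[str]) -> Counter:
    """Count, per domain, the CSV rows that mention a REFUSED DNS answer.

    Assumption: the domain is the first field of each row.
    """
    counts: Counter = Counter()
    for row in csv.reader(lines):
        # Skip blank rows; flag a row if any field contains REFUSED.
        if row and any("REFUSED" in field for field in row):
            counts[row[0]] += 1
    return counts


def refused_domains_in_file(path: str) -> Counter:
    """Convenience wrapper that reads the results file from disk."""
    with open(path, newline="", encoding="utf-8") as handle:
        return refused_domains(handle)
```

On the BOD 18-01 reporting instance this would be called as refused_domains_in_file("/var/cyhy/orchestrator/output/archive/latest/results/trustymail.csv"); an empty Counter corresponds to the expected behavior described below.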

Expected behavior

The Trustworthy Email results should not include any DNS refusals.

jsf9k commented 3 years ago

I will get an email out to our AWS contacts tomorrow morning.

jsf9k commented 3 years ago

I will get an email out to our AWS contacts tomorrow morning.

I sent an email to Adrian on Friday.

jsf9k commented 3 years ago

Adrian got back to us on August 20:

I looked through the code in the cisagov/trustymail and 18F/domain-scan repos, along with your description. It’s a sticky wicket. Here is what I see as the challenges and path forward:

Challenges:

The tl;dr is what you already know:

Options: