blechschmidt / massdns

A high-performance DNS stub resolver for bulk lookups and reconnaissance (subdomain enumeration)
GNU General Public License v3.0

Surprisingly high number of lost lookups: about 20% false negatives #117

Open ghost opened 3 years ago

ghost commented 3 years ago

Hey there,

I noticed quite a lot of lost lookups with massdns. In the example below, it misses roughly 1900 names out of 10k domains.

I am not sure where these false negatives come from though. Apologies for not trying to find where the bug is in massdns.

I made a screen recording https://asciinema.org/a/415235 of the reproduction steps below:

$ docker run --rm -it --entrypoint sh blechschmidt/massdns
/massdns # apk --no-cache add git alpine-sdk curl ipython py3-dnspython
/massdns # git pull
/massdns # make
/massdns # cat > /resolvers.txt <<EOF # See https://gist.github.com/seb-elttam/af28008b092eb5bcdfede8565c55147e#file-mtresolver-py-L18
1.1.1.1
1.0.0.1
8.8.8.8
8.8.4.4
9.9.9.10
149.112.112.10
94.140.14.140
94.140.14.141
64.6.64.6
64.6.65.6
77.88.8.8
77.88.8.1
74.82.42.42
EOF

/massdns # curl -s http://s3.amazonaws.com/alexa-static/top-1m.csv.zip | unzip -p - | cut -d, -f2- \
 | head -n 10k  > /domains.txt

/massdns # time ./bin/massdns -s 10000 -q -r /resolvers.txt -o S -w /out.txt /domains.txt; \
 cat /out.txt | grep ' A [0-9]' | cut -d' ' -f1 | sort -u | wc -l
real    0m 7.07s
user    0m 0.09s
sys     0m 0.21s
7534

/massdns # time ./bin/massdns -s 1000 -q -r /resolvers.txt -o S -w /out.txt /domains.txt; \
 cat /out.txt | grep ' A [0-9]' | cut -d' ' -f1 | sort -u | wc -l
real    0m 5.03s
user    0m 0.10s
sys     0m 0.18s
7911

/massdns # git clone https://gist.github.com/seb-elttam/af28008b092eb5bcdfede8565c55147e /gist && cd /gist
/gist # ipython
Python 3.9.5 (default, May 12 2021, 20:44:22) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.23.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from mtresolver import *
   ...: r = resolve_hostnames('/domains.txt', 1000)
   ...: len(r)
06:38:03 DEBUG enter
06:38:21 DEBUG exit
Out[1]: 9847
In [2]: from mtresolver import *
   ...: r = resolve_hostnames('/domains.txt', 10000)
   ...: len(r)
06:39:07 DEBUG enter
06:39:25 DEBUG exit
Out[2]: 9850
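
To see which names are actually missing rather than only how many, the counting pipeline above can be turned into a set difference. The following is a minimal sketch, not part of massdns: the function names are hypothetical, and it assumes each line of the `-o S` output has the form `name TYPE data`, which is what the `grep ' A [0-9]' | cut -d' ' -f1` pipeline relies on. It also assumes the owner name of the A record matches the queried name, which may not hold when an answer arrives through a CNAME.

```python
def names_with_a_record(massdns_lines):
    """Collect names that have at least one A record in `-o S` style lines."""
    found = set()
    for line in massdns_lines:
        parts = line.split()
        # Mirrors grep ' A [0-9]': type A followed by data starting with a digit.
        if len(parts) >= 3 and parts[1] == "A" and parts[2][:1].isdigit():
            # Normalize: drop the trailing dot and compare case-insensitively.
            found.add(parts[0].rstrip(".").lower())
    return found


def missing_names(input_names, massdns_lines):
    """Return the input names for which no A record line was seen, sorted."""
    wanted = {n.strip().rstrip(".").lower() for n in input_names if n.strip()}
    return sorted(wanted - names_with_a_record(massdns_lines))


# Illustrative data only; in practice pass open("/domains.txt") and open("/out.txt").
sample_output = [
    "example.com. A 192.0.2.10",
    "example.org. CNAME alias.example.net.",
]
print(missing_names(["example.com", "example.org"], sample_output))
```

Feeding the resulting list back into a second run would show whether the same names fail every time or whether the losses are random.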

youradds commented 2 years ago

I get quite a lot as well. I ended up writing my own script that goes through the list of domains multiple times. If a domain still hasn't got an IP after 10 passes, then chances are it's dead. Not very efficient though :(
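
The multi-pass approach described here can be sketched as follows. This is not the commenter's script: `resolve_in_passes` and the `resolve_batch` callback are hypothetical names, and the callback is assumed to wrap one massdns run and return a dict mapping each resolved name to its addresses. Only the names still unresolved are re-queried on each pass, which avoids repeating lookups that already succeeded.

```python
def resolve_in_passes(names, resolve_batch, max_passes=10):
    """Re-query unresolved names for up to max_passes passes.

    resolve_batch is a hypothetical callback: it takes a list of names and
    returns a dict of name -> addresses for the names it managed to resolve.
    Returns (resolved, still_unresolved).
    """
    pending = list(dict.fromkeys(names))  # de-duplicate, keep input order
    resolved = {}
    for _ in range(max_passes):
        if not pending:
            break
        resolved.update(resolve_batch(pending))
        # Keep only the names that still have no answer for the next pass.
        pending = [name for name in pending if name not in resolved]
    return resolved, pending
```

Names left in the second return value after the final pass are the ones that are probably dead, in the sense used above.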

ko2sec commented 2 years ago

I'm having the same problem, and I also see inconsistent results between scans on my DigitalOcean VPS.

blechschmidt commented 2 years ago

To debug the issue, I suggest the following:

  1. Clone the latest massdns version. Commit 2b394082ea8b45b850718861185194920604e49d fixes an issue, though it is a minor one and only affects mixed resolver lists. Commit 352187ce86b1ffa4038057a77460f4f7473ec038 changes the default response codes for which queries are retried.
  2. Use the -o Je output option and --error-log /tmp/error.log. This will log all input as well as output failures.
  3. The number of lines inside the MassDNS NDJSON output and the number of lines returned by grep -E '^Illegal|^Duplicate' /tmp/error.log should add up exactly to the number of supplied input domains. If they don't, there is a bug in MassDNS.
  4. Run jq '. | select(.error != null)' on the MassDNS output. This will show all output failures (e.g. due to timeouts or when the last packet received has an unacceptable return code). If you see many TIMEOUT and MAXRETRIES errors, you are hitting network congestion, resolver rate limits or both.
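
Steps 3 and 4 can be combined into one check. The sketch below is an illustration under stated assumptions, not part of MassDNS: the function name is hypothetical, the only NDJSON field it relies on is `error` (the one used by the jq filter above), and rejected inputs are assumed to be the error log lines starting with Illegal or Duplicate, as in the grep pattern.

```python
import json
from collections import Counter


def account_for_lookups(ndjson_lines, error_log_lines, input_count):
    """Check that output lines plus rejected inputs add up to the input count."""
    # One JSON object per non-empty line of the -o Je output.
    results = [json.loads(line) for line in ndjson_lines if line.strip()]
    # Same selection as grep -E '^Illegal|^Duplicate' on the error log.
    rejected = sum(
        1 for line in error_log_lines if line.startswith(("Illegal", "Duplicate"))
    )
    # Same selection as jq 'select(.error != null)', tallied by error value.
    errors = Counter(r["error"] for r in results if r.get("error") is not None)
    return {
        "results": len(results),
        "rejected": rejected,
        "balanced": len(results) + rejected == input_count,
        "errors": errors,
    }
```

If `balanced` is false, lookups were lost without being reported, which would point to a bug as described in step 3. If it is true and the `errors` tally is dominated by TIMEOUT and MAXRETRIES, the losses are more likely caused by the network or the resolvers.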

In addition, I suggest performing reconnaissance scans for single domains directly against the authoritative nameservers, without going through third-party resolvers, like so:

./bin/massdns -r <(./scripts/auth-addrs.sh example.com) --norecurse -o Je --error-log /tmp/error.log /tmp/names.txt