izhekov opened this issue 2 years ago
I've had to run the scan multiple times on a huge domain list because of this. I ended up using WARP VPN to work around it, but even then it still misses a couple of endpoints on some domains.
There was another ticket on this topic (#221), but since I'm unable to reopen it I had to create a new one. Simply put, running zgrab2 with a large list of domains seems to overload the DNS server, even though the machine I'm running zgrab2 on has ulimit set to unlimited.
Which value are you setting to unlimited? Just to confirm, the relevant setting is 'nofile'.
FWIW, I previously had an issue using 'unlimited' in 'limits.conf' for the soft and hard limits for 'nofile', despite 'unlimited' working for all of the other rlimit settings. I ended up using a large integer value instead; 1 << 18 (256k) should be plenty.
this link may be helpful in trying this out. Ensure you have 'nofile' set to a large integer in 'sysctl', 'limits.conf', and your session, and then try again. You'll need to log out and back in again, depending on where the change was necessary. If it's only a 'sysctl' issue you won't need to log out; it will take effect instantly. For 'limits.conf'/PAM changes, you'll need a new session.
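For reference, a minimal sketch of the knobs involved (the values here are illustrative, not recommendations; note that the 'nofile' value in 'limits.conf' cannot exceed the kernel's 'fs.nr_open' ceiling):

```shell
# Kernel-wide ceiling for per-process open files (sysctl; applies immediately)
sudo sysctl -w fs.nr_open=1048576

# Per-user rlimits via PAM (/etc/security/limits.conf); requires a new session:
#   *   soft   nofile   262144
#   *   hard   nofile   262144

# Verify in the session that will actually run zgrab2
ulimit -n
```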
Related to this, I've also encountered (on some systems) an issue where 'systemd' was effectively undoing 'limits.conf'. I had to make changes as described here
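For the systemd case, the relevant knob is 'DefaultLimitNOFILE' (again, the value is illustrative; user-level sessions read 'user.conf' rather than 'system.conf'):

```shell
# /etc/systemd/system.conf (and /etc/systemd/user.conf for user sessions):
#   [Manager]
#   DefaultLimitNOFILE=262144
#
# Then re-execute systemd and start a fresh session/service:
sudo systemctl daemon-reexec
```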
```
{"domain":"asnpart.net","data":{"http":{"status":"connection-timeout","protocol":"http","result":{},"timestamp":"2022-01-29T22:44:12+01:00","error":"dial tcp: lookup asnpart.net on 185.12.64.1:53: dial udp 185.12.64.1:53: socket: too many open files"}}}
```
Alternatively, you can approach this differently; it may add value to your data as well, particularly because you're using the HTTP module.
I don't delegate DNS resolution to 'zgrab2', except where necessary - specifically where the HTTP module encounters a 301/302 to an FQDN, which requires a runtime lookup.
I prefer that DNS be done separately, to avoid additional cycles and network traffic during the zgrab2 tx loop. It also has the effect of reducing the number of FDs used.
What I do specifically is pre-resolve all of the FQDNs and then use the CSV-style targets format supported by 'zgrab2':
```
<fqdn>, <ip>, <tag/trigger>
```
Doing it this way has benefits in terms of comprehensiveness, especially for the HTTP module, which uses the FQDN from the target line both as the SNI server name value (if the target is using TLS) and as the HTTP Host header.
For example, by pre-resolving each FQDN and generating a comprehensive targets CSV for each A response with a script (I use 'massdns' and a small Python script), you can ensure 'zgrab2' probes ALL of the logical targets (IP addresses) for FQDNs that have multiple A record values. It also ensures that each target is probed with an invalid/default virtual host (Host header) value - in this case, the IP address.
This allows you to identify things like inconsistencies in DNS load-balanced sites as well as issues associated with poorly configured name-based virtual hosting configurations
In the case of name-based virtual hosting (where the request may be routed differently based on the SNI name as well as the HTTP Host header), you will likely come across something "interesting" on a non-negligible number of endpoints when using the IP in place of the FQDN (a common one in my experience is administrative consoles), as opposed to the expected web property you see when using the proper FQDN value.
Here's an example of the targets CSV I produce, which uses pre-resolved data as the input. This provides the perks mentioned above and also happens to work around your issue, since the name/IP are already specified.
This example is for a single FQDN with two unique IP addresses in an A lookup. You'll see it generates 4 requests, rather than just 1, to get comprehensive coverage:
```
bah.com, 1.2.3.3, http80
bah.com, 1.2.3.4, http80
1.2.3.3, 1.2.3.3, http80
1.2.3.4, 1.2.3.4, http80
```
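A small Python sketch of that expansion step (this is not the author's script; it assumes the ndjson shape massdns emits with '-o J', i.e. records with a "name" field and A answers under "data.answers"):

```python
import json


def targets_from_massdns(ndjson_lines, trigger="http80"):
    """Expand massdns A-record answers into zgrab2 CSV target rows.

    Emits one "<fqdn>, <ip>, <trigger>" row per A answer, plus one
    "<ip>, <ip>, <trigger>" row per unique IP so every address is also
    probed with a default (invalid) virtual host value.
    """
    rows = []
    seen_ips = set()
    for line in ndjson_lines:
        rec = json.loads(line)
        if rec.get("status") != "NOERROR":
            continue
        fqdn = rec["name"].rstrip(".")
        for ans in rec.get("data", {}).get("answers", []):
            if ans.get("type") != "A":
                continue
            ip = ans["data"]
            rows.append(f"{fqdn}, {ip}, {trigger}")
            if ip not in seen_ips:
                seen_ips.add(ip)
                rows.append(f"{ip}, {ip}, {trigger}")
    return rows


# One hypothetical massdns record for bah.com with two A answers
sample = ['{"name":"bah.com.","status":"NOERROR","data":{"answers":'
          '[{"type":"A","name":"bah.com.","data":"1.2.3.3"},'
          '{"type":"A","name":"bah.com.","data":"1.2.3.4"}]}}']

for row in targets_from_massdns(sample):
    print(row)
```

For the single FQDN with two A records, this produces the same four target rows as the example above (fqdn+ip pairs plus ip+ip pairs), just in interleaved order.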
I realize this isn't necessarily a direct fix for your issue, but I figured it's an accidental workaround that (IMO) enhances the comprehensiveness of your output.
The drawback to this approach is the increase in total number of requests, the magnitude of which will depend entirely on your targets
If you're only concerned with reducing DNS lookups, then you could use the following:
bah.com, 1.2.3.4, http80
Some of this you may already be familiar with, but hopefully something here was helpful. Sorry for typos, this was written on mobile :>
I'm interested in your workaround involving massdns.
Could you assist me with a sample? I've been using the default STEWS script, but it seems to bloat the DNS queries and misses most of the endpoints from a huge domain list. @mzpqnxow
I can at least point you in the right direction to save you some time. Unfortunately what I have to do this is very tightly integrated into a project I'm not able to share, but I'll see what I can provide that may be helpful. In a nutshell it's a few steps and assumes your only input is a list of DNS names. It depends on massdns, jq, masscan, zgrab2, groupcidr and python3 with pandas
If all you really need to know is the format of the target list and ini files, that's easy and can be found below. I tried to briefly outline the end-to-end process, though...

The summary:

- Resolve all of the input DNS names with 'massdns' (emitting ndjson)
- Aggregate the resolved IPs into CIDR groups with 'groupcidr'
- Port-scan those ranges with 'masscan', producing ndjson output ('-oD out.ndjson', or if converting from '-oB' format, '--readscan out.bin -oD out.ndjson')
- Join the open ports back to the FQDN/IP pairs (python3 with pandas) and tag each target with a per-port trigger ('port<port>')

The end result of this is a set of file pairs, one pair for each port: a target file and an ini file. Example below. The massdns step is nothing special, just make sure it's emitting ndjson:
```shell
#!/bin/bash
declare -r RESOLVERS_PATH=~/resolvers.lst
declare -r TIMESTAMP="$(date +%Y%m%d.%S)"
declare -r infile="$1"
declare -r outpath="${infile}.out.${TIMESTAMP}"
declare -r errfile="${outpath}/errors"
declare -r outfile="${outpath}/resolved"
declare -r statsfile="${outpath}/stats"
declare -r socket_count=2
declare -r hashmap_size=4000
declare -r interval=250
declare -r processes=1
declare -r resolve_count=30

mkdir -p "$outpath"  # ensure the output directory exists before massdns writes to it
...
massdns \
  --interval "$interval" \
  --retry REFUSED \
  --retry SERVFAIL \
  --hashmap-size "$hashmap_size" \
  --socket-count "$socket_count" \
  --error-log "$errfile" \
  --resolve-count "$resolve_count" \
  -o J \
  -w "$outfile" \
  -r "$RESOLVERS_PATH" \
  "$infile"
```
```
$ cat targets/http-retry-https-8080.csv
1.2.3.4, bah.com, http:port:8080
4.5.6.7, lol.net, http:port:8080
...
```
```
$ cat ini/http-retry-https-8080.ini
[http]
name = http:port:8080
port = 80
trigger = "http:port:8080"
endpoint = "/"
user-agent = "Mozilla/5.0 (MSIE 10.0; Windows NT 6.1; Trident/5.0)"
timeout = 15
max-size = 16
cipher-suite = portable
redirects-succeed = True
fail-http-to-https = True
with-body-size = True
max-redirects = 5
retry-https = True
```
It's possible to put all of the configs and targets into a single file each, but as I think I mentioned, at some point zgrab2 blows up as it wasn't intended to be used quite like that. It's really designed to operate on a single port per-run
Assuming you have all of those file pairs generated, you just iterate through them and invoke zgrab2 via something like the command below (set SENDERS to whatever you prefer, depending on your target and network). You'll have one command like this for each port (and one target and ini file for each port found open).
Notice that it uses the 'multiple' module, not the 'http' module; the 'http' module is specified inside of the ini file.
```shell
SENDERS=500
zgrab2 -f "targets/http-retry-https-8080.csv" -m "metadata/http-retry-https-8080.meta" -o "output/http-retry-https-8080.ndjson" -s "$SENDERS" -l "log/http-retry-https-8080.log" multiple -c "ini/http-retry-https-8080.ini"
```
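The per-port iteration described above can be sketched as a dry-run loop like the following (the directory layout 'targets/', 'ini/', 'output/', 'log/' is assumed from the examples; the 'mkdir'/'touch' lines exist only so the sketch runs standalone, and removing the 'echo' would actually launch the scans):

```shell
#!/bin/bash
set -u
SENDERS=500

# Illustration only: fabricate the layout so the loop below has something to find.
# In real use, these files come from the pre-resolution/port-scan pipeline.
mkdir -p targets ini output log
touch targets/http-retry-https-8080.csv ini/http-retry-https-8080.ini

# One zgrab2 invocation per targets/ini file pair (dry run: just print it)
for csv in targets/*.csv; do
  name="$(basename "$csv" .csv)"
  echo zgrab2 -f "$csv" -o "output/${name}.ndjson" \
       -s "$SENDERS" -l "log/${name}.log" multiple -c "ini/${name}.ini"
done
```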
Unfortunately I'm a bit pressed for time, but if it's helpful and I find some time later I can try to put the key parts of the code into a gist or repo. I autogenerate everything programmatically, including a convenient filesystem structure for the zgrab2 inputs/outputs, and a few scripts to manipulate them to allow running successive but slightly reconfigured zgrab2 sessions against them.
@mzpqnxow Is there a specific reason to set senders to 500? I am trying to increase the level of concurrency to 1000, 2000, 5000, etc. However, I am not seeing an increase in my traffic volume. I have eliminated all DNS queries when doing the zgrab2 scan, so it is not a DNS issue. (I configured the number of FDs as well, and neither CPU nor memory is throttling.)
There's a senders flag to zgrab2; you can try using that. The SENDERS environment variable may be something specific to a fork I have, I don't recall.
@mzpqnxow I am aware of the flag. The issue is that when I increase the number of senders to more than 1000, I cannot see an increase in network traffic volume. I am wondering what the cause of this is.
How are you measuring it? When you say volume, do you mean packets, connections, or bytes?
EDIT: You can try stracing it to see if there are any obvious failures, I guess? Watch the output of htop and see if all of your CPUs are spinning at the max? All I can think of is basic stuff like this that you may have already considered.
> @mzpqnxow I am aware of the flag. This issue is when I increase the number of senders to more than 1000, I cannot see an increase in network traffic volume. I am wondering what is the cause of this.
I don't use this many senders typically (and it would of course depend on your CPU) but is it possible you've just hit the ceiling of what your hardware can do?