Rate limiting - Githubissues

izhekov commented 2 years ago

There was another ticket on this topic (#221), but since I'm unable to reopen it I had to create a new one. Simply put, running zgrab2 with a large list of domains seems to overload the DNS server, even though the server I'm running zgrab2 on has ulimit set to unlimited.

{"domain":"asnpart.net","data":{"http":{"status":"connection-timeout","protocol":"http","result":{},"timestamp":"2022-01-29T22:44:12+01:00","error":"dial tcp: lookup asnpart.net on 185.12.64.1:53: dial udp 185.12.64.1:53: socket: too many open files"}}}

MC874 commented 2 years ago

I've had to run the scan multiple times on a huge domain list because of this. I ended up using WARP VPN to work around it, but it still misses a couple of endpoints on some domains.

mzpqnxow commented 2 years ago

Which value are you setting to unlimited? Just to confirm, the relevant setting is 'nofile'

FWIW, I previously had an issue using 'unlimited' in 'limits.conf' for the soft and hard limits for 'nofile', despite 'unlimited' working for all of the other rlimit settings. I ended up using a large integer value instead. 1 << 18 (262144, i.e. 256k) should be plenty

This link may be helpful in trying this out. Ensure you have 'nofile' set to a large integer in 'sysctl', 'limits.conf', and your session, and then try again. You'll need to log out and back in again depending on where the change was necessary. If it's only a 'sysctl' issue you won't need to log out; the change takes effect immediately. For 'limits.conf'/PAM changes, you'll need a new session

Related to this, I've also encountered (on some systems) an issue where 'systemd' was effectively undoing 'limits.conf'. I had to make changes as described here
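
To confirm which limit a process actually sees (as opposed to what 'limits.conf' says), a quick check along these lines can help. This is a minimal sketch using Python's standard resource module on Linux; the function name and the 256k target are illustrative assumptions, not anything zgrab2 provides.

```python
import resource

# Illustrative target: 256k file descriptors (1 << 18 == 262144).
TARGET_NOFILE = 1 << 18


def describe_nofile(target=TARGET_NOFILE):
    """Return (soft, hard, ok) for this process's RLIMIT_NOFILE."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "unlimited"; treat it as satisfying any target.
    ok = soft == resource.RLIM_INFINITY or soft >= target
    return soft, hard, ok


if __name__ == "__main__":
    soft, hard, ok = describe_nofile()
    print(f"nofile soft={soft} hard={hard} meets_target={ok}")
```

Run it from the same shell (or service unit) that launches zgrab2, since the limit is inherited per session.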

Alternately, you can approach this differently. It may add value to your data as well, particularly because you're using the HTTP module

I don't delegate DNS resolution to 'zgrab2' except where necessary, specifically where the HTTP module encounters a 301/302 to an FQDN, which requires a runtime lookup

I prefer that DNS be done separately, to avoid additional cycles and network traffic during the zgrab2 tx loop. It also has the effect of reducing the number of FDs used

What I do specifically is pre-resolve all of the FQDNs and then use the CSV-style targets format supported by 'zgrab2':

...
<ip>, <fqdn>, <tag/trigger>
...

Doing it this way has benefits in terms of comprehensiveness, especially for the HTTP module, which uses the FQDN from the target line for both the SNI server name value (if the target is using TLS) and the HTTP Host header

For example, by pre-resolving each FQDN and generating a comprehensive targets CSV for each A response with a script (I use 'massdns' and a small Python script) you can ensure 'zgrab2' probes ALL of the logical targets (IP addresses) for those FQDNs that have multiple A record values. It also ensures that the target is probed with an invalid/default virtual host (Host header) value, in this case the IP address

This allows you to identify things like inconsistencies in DNS load-balanced sites as well as issues associated with poorly configured name-based virtual hosting configurations

In the case of name-based virtual hosting (where the request may be routed differently based on the SNI name as well as the HTTP Host header) you will likely come across something "interesting" on a non-negligible number of endpoints when using the IP for the FQDN (a common one in my experience is administrative consoles) as opposed to the expected web property you will see when using the proper FQDN value

Here's an example of the targets CSV I produce which uses pre-resolved data as the input. This provides the perks mentioned above and also happens to work around your issue, since the name/IP are already specified

This example is for a single FQDN with two unique IP addresses in an A lookup. You'll see it generates 4 requests, rather than just 1, to get comprehensive coverage:

1.2.3.3, bah.com, http80
1.2.3.4, bah.com, http80
1.2.3.3, 1.2.3.3, http80
1.2.3.4, 1.2.3.4, http80

I realize this isn't necessarily a direct fix for your issue, but I figured it's an accidental workaround that (IMO) enhances the comprehensiveness of your output.

The drawback to this approach is the increase in total number of requests, the magnitude of which will depend entirely on your targets
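
The expansion described above can be sketched roughly as follows. This is only an illustration: the function name and the shape of the input dict are assumptions, and rows are emitted in the IP, DOMAIN, TAG column order that zgrab2 documents for its input files.

```python
def expand_targets(resolved, tag):
    """Build zgrab2 target rows for every (FQDN, IP) pair plus bare-IP rows.

    resolved: dict mapping FQDN -> list of A-record IPs (hypothetical input,
              e.g. produced by a separate DNS pre-resolution step).
    tag:      the tag/trigger string to attach to every row.
    """
    rows = []
    unique_ips = {}  # dict used as an ordered set
    for fqdn, ips in resolved.items():
        for ip in ips:
            # Probe this IP with the real name as SNI / Host header.
            rows.append(f"{ip}, {fqdn}, {tag}")
            unique_ips[ip] = None
    for ip in unique_ips:
        # Probe the same IP again with the IP itself as the "name".
        rows.append(f"{ip}, {ip}, {tag}")
    return rows


if __name__ == "__main__":
    for row in expand_targets({"bah.com": ["1.2.3.3", "1.2.3.4"]}, "http80"):
        print(row)
```

The row count is the number of (FQDN, IP) pairs plus the number of unique IPs, which is where the extra requests come from.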

If you're only concerned with reducing DNS lookups, then you could use the following:

1.2.3.4, bah.com, http80

Some of this you may already be familiar with, but hopefully something here was helpful. Sorry for typos, this was written on mobile :>

MC874 commented 2 years ago

I'm interested in your workaround involving massdns. Could you help me with a sample? I've been using the default STEWS script, but it seems to bloat the DNS queries and misses most of the endpoints from a huge domain list. @mzpqnxow

mzpqnxow commented 2 years ago

I can at least point you in the right direction to save you some time. Unfortunately what I have to do this is very tightly integrated into a project I'm not able to share, but I'll see what I can provide that may be helpful. In a nutshell it's a few steps and assumes your only input is a list of DNS names. It depends on massdns, jq, masscan, zgrab2, groupcidr and python3 with pandas

If all you really need to know is the format of the target list and ini files, that's easy and can be found below. I tried to briefly outline the end to end process, though...

The summary

1. resolve DNS names with massdns, use ndjson format for output
2. extract the IP addresses from the massdns results and supernet/coalesce them with groupcidr
3. perform a masscan on those IPs, ultimately to masscan ndjson format (via -oD out.ndjson, or if converting from -oB format, --readscan out.bin -oD out.ndjson)
4. merge the massdns results into the masscan results as a new field in each of the NDJSON rows (fqdn), optionally exploding FQDNs into multiple rows to get full coverage of FQDNs with multiple corresponding IP addresses; this bit is all pandas and is tricky if you're not familiar with pandas or dask; you could do it with loops and dicts but it's a lot more code and slow with really, really large datasets
5. once merged into NDJSON rows with the added fqdn, you have fqdn, ip and port all available in each row, making it easy to process to produce the zgrab2 ini and targets files
6. process the merged file and produce an ini and target list file for each port. For tag/trigger, use something like port<port>. The end result of this is a set of file pairs, one pair for each port: a target file and an ini file. Example below
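
The merge step can be sketched with plain loops and dicts, which is fine for small inputs (the pandas route mentioned above scales better). This is not the code from the project described here; the function name is made up, and the masscan field names ("ip", "ports", "port", "status") are assumptions about its NDJSON output that you should check against your own files.

```python
import json
from collections import defaultdict


def merge_rows(fqdn_to_ips, masscan_lines):
    """Return one {"fqdn", "ip", "port"} dict per FQDN/IP/open-port combination.

    fqdn_to_ips:   dict of FQDN -> list of resolved IPs (from the DNS step).
    masscan_lines: iterable of NDJSON strings, one masscan record per line
                   (record layout is an assumption, see lead-in).
    """
    # Invert the DNS mapping so each scanned IP can be "exploded" into a row
    # for every FQDN that resolved to it.
    ip_to_fqdns = defaultdict(list)
    for fqdn, ips in fqdn_to_ips.items():
        for ip in ips:
            ip_to_fqdns[ip].append(fqdn)

    merged = []
    for line in masscan_lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        for port_info in rec.get("ports", []):
            if port_info.get("status") != "open":
                continue
            for fqdn in ip_to_fqdns.get(rec["ip"], []):
                merged.append(
                    {"fqdn": fqdn, "ip": rec["ip"], "port": port_info["port"]}
                )
    return merged
```

Each returned row carries fqdn, ip and port, so grouping by port afterwards gives the per-port target lists.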

massdns

this is nothing special, just make sure it's emitting ndjson

#!/bin/bash
declare -r RESOLVERS_PATH=~/resolvers.lst
declare -r TIMESTAMP=$(date +%Y%m%d.%H%M%S)
declare -r infile="$1"
declare -r outpath="${infile}.out.$TIMESTAMP"
declare -r errfile="${outpath}/errors"
declare -r outfile="${outpath}/resolved"
declare -r statsfile="${outpath}/stats"
declare -r socket_count=2
declare -r hashmap_size=4000
declare -r interval=250
declare -r processes=1
declare -r resolve_count=30
...
massdns \
  --interval $interval \
  --retry REFUSED \
  --retry SERVFAIL \
  --hashmap-size $hashmap_size \
  --socket-count $socket_count \
  --error-log "$errfile" \
  --resolve-count $resolve_count \
  -o J \
  -w "$outfile" \
  -r "$RESOLVERS_PATH" \
  "$infile"
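
To turn that output into an FQDN to IP mapping (step 2 of the summary), something like the following may work as a starting point. It is a sketch only: the function name is hypothetical, and the record layout ("name", "status", "data.answers[].type", "data.answers[].data") is my assumption about what massdns writes with -o J, so verify it against a few lines of your own output first.

```python
import json


def parse_massdns(lines):
    """Map each queried FQDN to its A-record IPs from massdns NDJSON lines."""
    resolved = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        # Skip failed lookups (NXDOMAIN, SERVFAIL, ...).
        if rec.get("status") != "NOERROR":
            continue
        # massdns reports names with a trailing dot; drop it.
        fqdn = rec["name"].rstrip(".")
        for answer in rec.get("data", {}).get("answers", []):
            # Attribute every A answer to the queried name, which also
            # covers answers reached through a CNAME chain.
            if answer.get("type") == "A":
                ips = resolved.setdefault(fqdn, [])
                if answer["data"] not in ips:
                    ips.append(answer["data"])
    return resolved
```

The set of all values in the returned dict is the IP list that would then go to groupcidr and masscan.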

zgrab2 target and ini file pairs

$ cat targets/http-retry-https-8080.csv 
1.2.3.4, bah.com, http:port:8080
4.5.6.7, lol.net, http:port:8080
...
$ cat ini/http-retry-https-8080.ini
[http]
name = http:port:8080
port = 8080
trigger = "http:port:8080"
endpoint = "/"
user-agent = "Mozilla/5.0 (MSIE 10.0; Windows NT 6.1; Trident/5.0)"
timeout = 15
max-size = 16
cipher-suite = portable
redirects-succeed = True
fail-http-to-https = True
with-body-size = True
max-redirects = 5
retry-https = True
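
Generating one such pair per port from merged rows could look roughly like this. It is a sketch under assumptions: the function name and the row dict shape (fqdn, ip, port) are hypothetical, only a few of the ini options shown above are emitted, and the ini's port value is assumed to match the port in the tag.

```python
def render_pair(port, rows, module="http"):
    """Return (targets_text, ini_text) for a single port.

    rows: iterable of dicts with "fqdn", "ip" and "port" keys (hypothetical
          shape, e.g. the output of a DNS/port-scan merge step).
    """
    tag = f"{module}:port:{port}"
    # Target lines use the IP, DOMAIN, TAG column order.
    targets = "".join(
        f"{row['ip']}, {row['fqdn']}, {tag}\n"
        for row in rows
        if row["port"] == port
    )
    ini_lines = [
        f"[{module}]",
        f"name = {tag}",
        f"port = {port}",
        f'trigger = "{tag}"',
        'endpoint = "/"',
        "timeout = 15",
        "max-redirects = 5",
        "retry-https = True",
    ]
    return targets, "\n".join(ini_lines) + "\n"


if __name__ == "__main__":
    demo_rows = [{"fqdn": "bah.com", "ip": "1.2.3.4", "port": 8080}]
    targets_text, ini_text = render_pair(8080, demo_rows)
    print(targets_text, ini_text, sep="\n")
```

Writing the two strings to targets/<name>.csv and ini/<name>.ini for each distinct port gives the set of file pairs.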

It's possible to put all of the configs and targets into a single file each, but as I think I mentioned, at some point zgrab2 blows up as it wasn't intended to be used quite like that. It's really designed to operate on a single port per-run

Assuming you have all of those file pairs generated, you just iterate through them and invoke zgrab2 on each one with something like this. Set SENDERS to whatever you prefer, depending on your targets and network. You'll have one command like this for each port (and one target file and ini file for each port found open)

Notice that it uses the multiple module, not the http module. The http module is specified inside of the ini file

SENDERS=500
zgrab2 -f "targets/http-retry-https-8080.csv" -m "metadata/http-retry-https-8080.meta" -o "output/http-retry-https-8080.ndjson" -s "$SENDERS" -l "log/http-retry-https-8080.log" multiple -c "ini/http-retry-https-8080.ini"
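
Iterating over the pairs can be sketched as below. The directory layout (ini/, targets/, metadata/, output/, log/) follows the command above, the helper names are hypothetical, and only flags that appear in that command are used.

```python
import subprocess
from pathlib import Path


def zgrab2_command(ini_path, senders=500):
    """Build the zgrab2 argv for one ini/targets pair such as ini/NAME.ini."""
    name = Path(ini_path).stem
    return [
        "zgrab2",
        "-f", f"targets/{name}.csv",
        "-m", f"metadata/{name}.meta",
        "-o", f"output/{name}.ndjson",
        "-s", str(senders),
        "-l", f"log/{name}.log",
        "multiple", "-c", str(ini_path),
    ]


def run_all(ini_dir="ini", senders=500):
    """Run zgrab2 once per ini file; returns the exit code of each run."""
    codes = {}
    for ini in sorted(Path(ini_dir).glob("*.ini")):
        # One zgrab2 process per port-specific pair, run sequentially.
        codes[ini.name] = subprocess.run(zgrab2_command(ini, senders)).returncode
    return codes
```

Passing the command as a list avoids shell quoting problems with the file names.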

Unfortunately I'm a bit pressed for time, but if it's helpful and I find some time later I can try to put the key parts of the code into a gist or repo. I autogenerate everything programmatically, including a convenient fs structure for the zgrab2 inputs/outputs, and a few scripts to manipulate them to allow running successive but slightly reconfigured zgrab2 sessions against them

huanchen-stack commented 1 year ago

@mzpqnxow Is there a specific reason to set senders to 500? I am trying to increase the level of concurrency to 1000, 2000, 5000, etc. However, I am not seeing an increase in my traffic volume. I have eliminated all DNS queries when doing the zgrab scan, so it is not a DNS issue. (I configured the file descriptor limit as well, and neither CPU nor memory is the bottleneck.)

mzpqnxow commented 1 year ago

There's a senders flag to zgrab2, you can try using that. The SENDERS environment variable may be something specific to a fork I have, I don't recall

huanchen-stack commented 1 year ago

@mzpqnxow I am aware of the flag. The issue is that when I increase the number of senders beyond 1000, I don't see an increase in network traffic volume. I am wondering what the cause of this is.

mzpqnxow commented 1 year ago

How are you measuring it? When you say volume do you mean packets, connections or volume (bytes)

EDIT: You can try stracing it to see if there are any obvious failures, I guess? Watch the output of htop and see if all of your CPUs are spinning at the max. All I can think of is basic stuff like this, which you may have already considered

mzpqnxow commented 1 year ago

I don't use this many senders typically (and it would of course depend on your CPU) but is it possible you've just hit the ceiling of what your hardware can do?

zmap / zgrab2

Rate limiting #342