SolomonSklash / chomp-scan

A scripted pipeline of tools to streamline the bug bounty/penetration test reconnaissance phase, so you can focus on chomping bugs.
https://www.solomonsklash.io/chomp-scan-update.html
GNU General Public License v3.0

Suggestion to minimize false positive subdomains #50

Closed: Sy3Omda closed this issue 5 years ago

Sy3Omda commented 5 years ago

I think this script needs some tool or bash script to filter all_resolved_domains.txt before nikto scans it, because it sometimes generates false positive or wildcard subdomains that are not actually running any web server, which makes nikto take a very long time scanning non-existent or wildcard subdomains.

SolomonSklash commented 5 years ago

Initially there was no all resolved domains list, which was causing all the tools to hang on non-existent domains. So I added the resolved list to prevent exactly this problem. Domains only get added to the list if they have been successfully resolved, so I don't see a good way to further refine the list. Last I checked, nikto didn't have a good way of detecting unreachable domains and quitting early, but I need to look again.

Sy3Omda commented 5 years ago

We could insert the following script inside the massdns function to filter for live web servers based on the HTTP status code, by curling each subdomain URL and keeping only the URLs that return a 200:

```bash
# Request each URL and keep only the hosts that answer with a 200
while read -r LINE; do
  curl -o /dev/null --connect-timeout 5 --silent --head --write-out "%{http_code} $LINE\n" "$LINE"
done < all_resolved_domains.txt | grep '^200' | cut -d' ' -f2 | sed 's=.*://==' | tee all_active_domains.txt
```

BUT all_resolved_domains.txt would have to include full URLs like https://www.google.com, NOT www.google.com. So if you could figure out a solution for this it would be great, because then it would fit into your script, especially since the all list currently does not include the http:// prefix.
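As a rough sketch (assuming all_resolved_domains.txt holds bare hostnames, one per line), we could instead try both schemes for each entry, so full URLs would not be required:

```bash
# Try http:// and https:// for each bare hostname; keep hosts that answer with a 200
while read -r domain; do
  for scheme in http https; do
    curl -o /dev/null --connect-timeout 5 --silent --head \
      --write-out "%{http_code} ${scheme}://${domain}\n" "${scheme}://${domain}"
  done
done < all_resolved_domains.txt | grep '^200' | cut -d' ' -f2 | sed 's=.*://==' | sort -u | tee all_active_domains.txt
```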

AND BTW, this all_active_domains.txt list would replace all_resolved_domains.txt in all the content discovery tools, which would save a lot of time since we would only brute force existing subdomains.

SolomonSklash commented 5 years ago

So an issue I see is that many of the domains that are found are not necessarily false positives; they just may not have port 80 or 443 open. I don't want to exclude successfully resolved domains just because they don't have an HTTP port open. Your script above will only find domains with port 443 open that return a 200 response code, while excluding potentially many other domains and ports. Another issue is bad DNS resolvers/results, which is why I use a different list of resolvers than the one that comes with massdns. But I don't know a good way to make 100% sure a result is good, which inevitably leads to false positives. Is your issue mainly with nikto? Because I can add a max scan time flag to prevent non-HTTP domains from hanging forever or for a long time.
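Nikto does have a -maxtime option for capping the per-host testing time; a minimal sketch of what I have in mind (the 30-minute cap is just an arbitrary illustration, not a tested value):

```bash
# Cap each nikto scan so a non-HTTP or unresponsive host cannot hang forever;
# -maxtime limits the testing time per host (30m is an arbitrary example value)
while read -r domain; do
  nikto -host "$domain" -maxtime 30m -output "nikto-${domain}.txt"
done < all_resolved_domains.txt
```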

Sy3Omda commented 5 years ago

Your suggestion is appreciated, BUT the question is: would the max scan time flag only skip domains that error out because they are not HTTP servers, OR would it also cut short the scans of potentially interesting servers? Some servers take a very long time to scan because they have a lot of vulnerabilities or bugs, so I do not want to skip those! I hope you get what I mean.

SolomonSklash commented 5 years ago

I know what you mean. I haven't added the max scan time flag to nikto yet because I don't want it to miss anything on good HTTP domains. It's a tradeoff at this point. The total Chomp Scan run time will be longer because of false positives, but to me that is better than missing out on nikto results.

Overall, my intent with Chomp Scan was never to make it the fastest tool possible. I did test many of the individual component tools in order to find the fastest ones; there is no reason to make it take longer than necessary. But given all the options Chomp Scan supports, you can make it take days if you enable every tool. I think the best way to use it is to run it multiple times, even concurrently, with each set of options you want. That way you can minimize the scan time of each run while still getting the best possible results. I will update the wiki with some suggested scanning strategies to reflect this. I also plan on seeing if I can get some tools to run in parallel to reduce scan times, as sketched below.
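As a rough illustration of the parallel idea (a sketch using GNU xargs, assuming one domain per line; the concurrency level of 4 and the 30m cap are arbitrary):

```bash
# Run up to four nikto scans concurrently; -P sets the number of parallel
# processes and -I {} substitutes each domain into the command line
xargs -P 4 -I {} nikto -host {} -maxtime 30m -output "nikto-{}.txt" < all_resolved_domains.txt
```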

As always, thanks for your input!

Sy3Omda commented 5 years ago

Thanks for the clarification. I appreciate your time developing this amazing script and sharing your knowledge with the community.