hakluke / hakrawler

Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application
https://hakluke.com
GNU General Public License v3.0

Hakrawler returns URLs of out-of-scope domains #110

Closed by brabbit10 2 years ago

brabbit10 commented 2 years ago

When I run hakrawler on a site, it returns URLs from out-of-scope domains such as Facebook, Google, and YouTube. I saw in another issue that this was fixed, but it still seems to happen.

For example, running hakrawler on "https://ynet.co.il" like this:

echo https://ynet.co.il | hakrawler -u -subs -insecure -d 2 -h "User-Agent: ${DEFAULT_UA}"

will return URLs from google.com, instagram.com, and others.

hakluke commented 2 years ago

Hey! The -subs option determines what gets crawled, not which URLs are returned. If a page contains URLs that point to out-of-scope domains, hakrawler will still print them, but it will not navigate to them.
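If you only want in-scope URLs in the output, one option is to filter what hakrawler prints after the fact. The sketch below is not a hakrawler feature: in_scope is a hypothetical helper name, and it assumes the output is one bare URL per line, as in the command above.

```shell
# Hypothetical helper (not part of hakrawler): read URLs on stdin and keep
# only those whose host is the given domain or one of its subdomains.
in_scope() {
  # Escape dots so the domain is matched literally inside the regex.
  domain_re=$(printf '%s' "$1" | sed 's/\./\\./g')
  # Optional "sub." prefix, then the domain, then an optional port,
  # followed by the start of a path, query, fragment, or end of line.
  grep -E "^https?://([^/?#@]*\.)?${domain_re}(:[0-9]+)?(/|\?|#|$)"
}
```

Usage, reusing the flags from the original report: echo https://ynet.co.il | hakrawler -u -subs -insecure -d 2 | in_scope ynet.co.il

The pattern is anchored on the host part, so a URL that merely mentions the domain in its path or query string (or a lookalike such as notynet.co.il) is dropped.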