rverton / webanalyze

Port of Wappalyzer (uncovers technologies used on websites) to automate mass scanning.
MIT License

Crawling subdomains #18

Closed by H3JFC 5 years ago

H3JFC commented 5 years ago

https://github.com/rverton/webanalyze/blob/601271371582c173207cceff042a6699b87c715c/webanalyze.go#L204

It looks like the parseLinks function returns nil when the parsed URL's host differs from the base URL's host, which is great for cases like hostname.com vs. nothostname.com. This works for most cases, but it would be nice to add an option for crawling subdomains, such as app.hostname.com with a base URL of hostname.com. I have some thoughts on a PR and would be happy to open one if there is interest.

rverton commented 5 years ago

Hi @H3JFC, sure, this could indeed be useful. Any ideas and PRs are of course appreciated :)

Greetings

H3JFC commented 5 years ago

Great @rverton! Do we want to search subdomains by default, or should we add a searchSub boolean that we pass around in the Job struct (and consequently pass into the NewOfflineJob, NewOnlineJob, and Init funcs)?

rverton commented 5 years ago

I think we can do both here: add a searchSub boolean and default it to true. I think this is a sane default because it's often interesting to see what other technologies a specific domain makes use of.
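One way the "boolean that defaults to true" idea could look is sketched below. Go's zero value for a bool is false, so the default here is set where the option is parsed rather than in the struct. The `Job` fields, the `NewJob` constructor, and the `-search-sub` flag name are all hypothetical and are not taken from webanalyze or from the PR.

```go
package main

import "flag"

// Job is a simplified, hypothetical stand-in for a scan job.
type Job struct {
	URL             string
	SearchSubdomain bool // also follow links to subdomains while crawling
}

// NewJob builds a Job and carries the subdomain setting along with it.
// Hypothetical constructor for illustration only.
func NewJob(target string, searchSubdomain bool) *Job {
	return &Job{URL: target, SearchSubdomain: searchSubdomain}
}

// parseSearchSub reads an assumed -search-sub flag that defaults to true,
// so subdomain crawling is on unless the caller switches it off.
func parseSearchSub(args []string) (bool, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	searchSub := fs.Bool("search-sub", true, "also crawl subdomains of the target host")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *searchSub, nil
}
```

Passing no arguments leaves subdomain crawling enabled, while `-search-sub=false` disables it; the parsed value is then handed to `NewJob`.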

H3JFC commented 5 years ago

PR https://github.com/rverton/webanalyze/pull/19

rverton commented 5 years ago

Merged #19