rverton / webanalyze

Port of Wappalyzer (uncovers technologies used on websites) to automate mass scanning.
MIT License
955 stars, 137 forks

Doesn't detect all the available applications #5

przmv closed this issue 1 year ago

przmv commented 7 years ago

Looks like webanalyze (with this apps.json) doesn't detect as many applications as AliasIO/wappalyzer:

$ webanalyze -host="http://stackshare.io"
2017/04/19 13:44:32 Scanning with 4 workers.
2017/04/19 13:44:34 [+] http://stackshare.io (1.346715838s):
2017/04/19 13:44:34     - Google Font API        - [17]
2017/04/19 13:44:34     - Nginx  - [22]
2017/04/19 13:44:34     - Express        - [18 22]
2017/04/19 13:44:34     - Ruby on Rails  - [18]
$ docker run --rm wappalyzer/cli http://stackshare.io | jq '.applications | .[] | .name'
"Algolia Realtime Search"
"AngularJS"
"Express"
"Handlebars"
"Intercom"
"List.js"
"Mailchimp"
"Moment.js"
"New Relic"
"Nginx"
"React"
"Segment"
"Snap.svg"
"SweetAlert"
"Twitter Bootstrap"
"UserVoice"
"Varnish"
"jQuery"
"Node.js"

rverton commented 7 years ago

Wappalyzer makes use of a JavaScript environment to execute some JavaScript checks on the loaded page. We can't do this here without adding a bridge to PhantomJS or a headless browser. Maybe we can add an optional feature to include headless Chrome/Firefox in the future. I'll think it over.
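To illustrate the gap, here is a minimal Go sketch of what a static matcher can and cannot see. The `fingerprint` type, its fields, and the sample data are hypothetical simplifications for this example; they are not the real apps.json schema or webanalyze's actual code.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

// fingerprint is a hypothetical, simplified stand-in for one detection rule.
type fingerprint struct {
	Name     string
	HTML     *regexp.Regexp // matched against the raw response body
	Header   string         // name of the response header to inspect
	HeaderRe *regexp.Regexp // pattern applied to that header's value
	JSGlobal string         // JavaScript global; needs a JS runtime and DOM to check
}

// detectStatic applies only the checks that work without executing JavaScript.
func detectStatic(fps []fingerprint, headers http.Header, body string) []string {
	var found []string
	for _, fp := range fps {
		switch {
		case fp.HTML != nil && fp.HTML.MatchString(body):
			found = append(found, fp.Name)
		case fp.HeaderRe != nil && fp.HeaderRe.MatchString(headers.Get(fp.Header)):
			found = append(found, fp.Name)
		}
		// fp.JSGlobal is never evaluated: nothing here runs the page's scripts.
	}
	return found
}

// sample returns made-up rules and a made-up response for demonstration.
func sample() ([]fingerprint, http.Header, string) {
	fps := []fingerprint{
		{Name: "Nginx", Header: "Server", HeaderRe: regexp.MustCompile(`(?i)nginx`)},
		{Name: "Google Font API", HTML: regexp.MustCompile(`fonts\.googleapis\.com`)},
		{Name: "Moment.js", JSGlobal: "moment"}, // only visible after scripts run
	}
	headers := http.Header{}
	headers.Set("Server", "nginx/1.10.0")
	body := `<link href="https://fonts.googleapis.com/css?family=Lato" rel="stylesheet">` +
		`<script src="/bundle.js"></script>`
	return fps, headers, body
}

func main() {
	fps, headers, body := sample()
	fmt.Println(detectStatic(fps, headers, body))
}
```

In this sketch the rule that relies on a JavaScript global is silently skipped, which matches the kind of difference seen between the two outputs above: anything that only shows up after the page's scripts have run is invisible to a matcher that looks at headers and raw HTML alone.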

hbakhtiyor commented 7 years ago

Maybe use a more lightweight option, like https://github.com/scrapinghub/splash, which https://github.com/spectresearch/detectem uses.

przmv commented 7 years ago

@rverton I'd like to help you with this issue. I'm interested in PhantomJS integration, since it's easier to install on servers. Let's discuss how it could be implemented, so I can start working on it and hopefully send a pull request in the near future.

hbakhtiyor commented 7 years ago

@pshevtsov PhantomJS is heavy and is no longer being maintained.

przmv commented 7 years ago

@hbakhtiyor what do you suggest instead? I need something that is easy to install for the end users and is cross-platform — just like static PhantomJS binaries.

hbakhtiyor commented 7 years ago

https://github.com/scrapinghub/splash, using Docker for easy installation, or headless Chrome/Firefox.

rverton commented 7 years ago

The problem I see here is that including an external tool will have a big impact on performance. So if we implement this, we need to make it optional.

I already included PhantomJS in a Go project some time ago (https://github.com/rverton/xssmap), but @hbakhtiyor may be right: it looks like PhantomJS will be discontinued in the (near) future in favor of Chrome/Firefox headless browser support.

Maybe it's worth doing a test run: implement Selenium and compare the performance results? What do you think?

przmv commented 7 years ago

@rverton @hbakhtiyor Using headless Chrome or Firefox seems like a decent solution for desktop users (since it's already there), but the Go application I'm currently working on is mostly targeted at servers, and having a huge GUI application like Chrome or Firefox as a CLI tool dependency looks like overkill.

rverton commented 7 years ago

The question we have to ask is whether it's worth implementing this and invoking a different renderer, because then we could also just build a wrapper around the already existing Wappalyzer PhantomJS driver: https://github.com/AliasIO/Wappalyzer/tree/master/src/drivers/phantomjs

It may be worth making a small test implementation to see whether we can still perform better, but I have to say I'm a bit skeptical.

hbakhtiyor commented 7 years ago

@rverton It would be nice to make it optional, using a CDP client. @pshevtsov You don't need any GUI to run Chrome or Firefox in headless mode.

rverton commented 7 years ago

@hbakhtiyor But you need to install a full Chrome/Firefox, which may require a lot of other stuff to be installed.

I don't currently have the time to test one of these approaches. If someone wants to, feel free to send me PRs.

j3ssie commented 5 years ago

@rverton Did you check out this awesome lib? https://github.com/chromedp/chromedp

5amu commented 3 years ago

Hello @rverton!

First of all, I really like this tool and I'd like to use it for work too.

I'm not very familiar with the Wappalyzer code, but if the problem is that you need to execute JavaScript, https://github.com/rogchap/v8go might be an easy enough fix. Otherwise, I'd be very glad if anyone could tell me where the JS execution is needed, so I can implement it myself and make a PR.

rverton commented 3 years ago

Hi @5amu,

sadly it's not that easy in this case, because it's not just JavaScript that is missing here; the whole (browser) DOM is missing when not using a browser. Maybe there is a way to emulate this with some libraries, but I don't know of any. If we use a headless browser, the speed we gain from the non-browser approach is gone.

If you want to go this route, I guess it's easier to wrap the Wappalyzer script or Docker container. This will be slower by a huge margin, but more precise, because it can detect client-side JavaScript technologies.

Greetings
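As a rough illustration of the wrapper idea, the following Go sketch shells out to the same `docker run --rm wappalyzer/cli <url>` command shown at the top of this thread and extracts the application names. The JSON shape is an assumption inferred from the jq filter used there (`.applications | .[] | .name`, assumed here to be an array of objects); `parseNames` and `scanWithDocker` are hypothetical helper names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// report mirrors only the fields the jq filter in this thread reads;
// the assumed shape is {"applications": [{"name": "..."}, ...]}.
type report struct {
	Applications []struct {
		Name string `json:"name"`
	} `json:"applications"`
}

// parseNames extracts application names from the CLI's JSON output.
func parseNames(out []byte) ([]string, error) {
	var r report
	if err := json.Unmarshal(out, &r); err != nil {
		return nil, fmt.Errorf("decode wappalyzer output: %w", err)
	}
	names := make([]string, 0, len(r.Applications))
	for _, a := range r.Applications {
		names = append(names, a.Name)
	}
	return names, nil
}

// scanWithDocker starts one container per URL, which is where the
// large slowdown compared to a plain HTTP fetch comes from.
func scanWithDocker(url string) ([]string, error) {
	out, err := exec.Command("docker", "run", "--rm", "wappalyzer/cli", url).Output()
	if err != nil {
		return nil, fmt.Errorf("run wappalyzer/cli: %w", err)
	}
	return parseNames(out)
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: wrap <url>")
		return
	}
	names, err := scanWithDocker(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

Keeping the parsing separate from the process invocation means the JSON handling can be checked without Docker installed, and the slow container step stays isolated in one function.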

bugbaba commented 3 years ago

Hi @5amu,

The Wappalyzer code is heavily documented. You can refer to https://github.com/AliasIO/wappalyzer/tree/master/src/drivers/npm to get started with a Node binary on your system if you don't want to use Docker. Also, besides the performance concerns, if browser support were added to the webanalyze project, it would just become Wappalyzer rewritten in Go, which wouldn't make a huge difference in performance.

-- Regards, @bugbaba

5amu commented 3 years ago

Thanks @bugbaba @rverton for the clarification,

I'll keep an eye on the project to see if someone, eventually, will come up with an idea to solve this issue without many performance penalties. For now, I'll keep using wappalyzer in docker.

Best of luck!