Open Phylu opened 3 years ago
Surprisingly, I was just thinking about how to add JavaScript library detection to WhatWeb. I'll just dump my thoughts here, so we can kick off a discussion.
We will need:
Things that make JavaScript unique:
Thoughts:
Some questions to consider:
I guess step one is to start collecting JS Library patterns. Ideally we could have patterns that would survive the minify process.
My thoughts here:
Should WhatWeb scan only same-site JS or also remote JS URLs? I suggest fetching both in order to check for:
Version numbers in the URL Path
Version numbers in the GET parameters
Version numbers in the JS Files themselves
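As a rough illustration of the first two checks, here is a minimal sketch that pulls version candidates out of a script URL. The `VERSION` pattern and the `versions_in_url` helper are assumptions made for this example, not existing WhatWeb code, and a real pattern would probably need to be stricter to avoid false positives.

```ruby
require 'uri'

# Assumed version shape: two or three dot-separated numeric parts.
VERSION = /\d+\.\d+(?:\.\d+)?/

# Hypothetical helper: returns version candidates found in a script URL.
def versions_in_url(url)
  uri = URI.parse(url)

  # Version in the path, e.g. /ajax/libs/jquery/3.6.0/jquery.min.js
  path_version = uri.path.to_s[VERSION]

  # Version in a GET parameter, e.g. jquery.js?ver=1.12.4
  query_version = URI.decode_www_form(uri.query || '')
                     .map { |_key, value| value[VERSION] }
                     .compact
                     .first

  { path: path_version, query: query_version }
end
```

For example, `versions_in_url('/wp-includes/js/jquery/jquery.js?ver=1.12.4')` yields a `:query` candidate and no `:path` candidate. The version alone does not identify the library, so it would still have to be combined with a name taken from the path or the file content.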
Should WhatWeb parse JS to discover URLs for other loaded or imported JS files?
I suggest not doing this (at least in the beginning). Of course there are techniques like Google Tag Manager, but as a first step (probably much easier and faster to implement and maintain), covering all the files that are included directly, such as the minified JS files from a vendor folder, may be fine.
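To make "files that are included directly" concrete, here is a minimal sketch that collects `<script src>` URLs from the HTML without executing any JavaScript, and marks each one as same-site or remote. The `SCRIPT_SRC` regex and the `included_scripts` helper are hypothetical; a regex over HTML is only a heuristic, and a proper HTML parser would be more robust.

```ruby
require 'uri'

# Heuristic: the src attribute of a <script> tag, single or double quoted.
SCRIPT_SRC = /<script\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/i

# Hypothetical helper: lists directly included scripts as absolute URLs.
def included_scripts(html, base_url)
  base = URI.parse(base_url)
  html.scan(SCRIPT_SRC).flatten.filter_map do |src|
    begin
      absolute = URI.join(base_url, src)
      { url: absolute.to_s, same_site: absolute.host == base.host }
    rescue URI::Error
      nil # skip src values that are not valid URLs
    end
  end
end
```

Inline scripts (no `src`) are ignored here, and nothing inside the fetched files is followed, which matches the "direct includes only" first step.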
- A headless browser like headless Chrome or Firefox would work to parse and discover JS URLs, but is it too resource heavy?
We have some experience here, and I totally agree about the resource issue. In addition, it would add huge third-party dependencies to WhatWeb.
I guess step one is to start collecting JS Library patterns. Ideally we could have patterns that would survive the minify process.
I would probably start with patterns that use version numbers, as they are a good way to get information about the libraries in use, independent of their names.
Possible license string & pattern (I will keep the eyes open for more):
* @license Angular v8.0.2\n --> /@license ([a-zA-Z]+) v?(\d+)\.(\d+)(?:\.(\d+))?/
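A minimal sketch of how such a license pattern could be applied to the content of a JS file. It uses a variant of the pattern that accepts zeros and multi-digit version components (so that `8.0.2` and `1.12.4` are captured in full). The `LICENSE_BANNER` constant and the `library_from_banner` helper are assumptions for illustration only.

```ruby
# Captures: 1 = library name, 2 = major, 3 = minor, 4 = patch (optional).
LICENSE_BANNER = /@license ([a-zA-Z]+) v?(\d+)\.(\d+)(?:\.(\d+))?/

# Hypothetical helper: returns name and version from a license comment,
# or nil when the file carries no matching banner.
def library_from_banner(js_source)
  match = LICENSE_BANNER.match(js_source)
  return nil unless match

  version = [match[2], match[3], match[4]].compact.join('.')
  { name: match[1], version: version }
end
```

Called on a file that starts with `/** @license Angular v8.0.2 */`, this returns the name `Angular` and the version `8.0.2`. Whether the banner is still present after minification depends on how the site's build is configured, so this can only be one signal among several.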
Within the WhatWeb plugins, I have multiple ways to detect frameworks with versions based on regexes in the code or based on the occurrence of certain files. What I would like to do is the following in addition to that:
Many times, these JavaScript files (which could be named main.js or vendor.js) contain comments like the following:
Is there a way to implement something like this within a plugin? Or for all existing plugins so that the regexes could be used "recursively" on js pages that are included?
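To illustrate what "recursively" could mean here, the following sketch applies one set of regexes first to the page body and then to the body of each included script. Everything in it is hypothetical: the `PATTERNS` table is not WhatWeb's plugin format, and `fetch` stands for any callable that maps a URL to a response body (or `nil` on failure).

```ruby
# Hypothetical pattern table: library name => regex capturing the version.
PATTERNS = {
  'Angular' => /@license Angular v(\d+\.\d+\.\d+)/,
  'jQuery'  => /jQuery v(\d+\.\d+\.\d+)/
}.freeze

# Applies every pattern to one body and returns { name => version }.
def detect(body)
  PATTERNS.each_with_object({}) do |(name, regex), hits|
    match = regex.match(body)
    hits[name] = match[1] if match
  end
end

# Runs the same patterns on the page and on each included script.
# Earlier hits win when a library is found more than once.
def detect_with_scripts(page_body, script_urls, fetch)
  script_urls.reduce(detect(page_body)) do |hits, url|
    body = fetch.call(url)
    body ? detect(body).merge(hits) : hits
  end
end
```

Only one level of includes is followed, so scripts loaded by other scripts are not discovered.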