urbanadventurer / WhatWeb

Next generation web scanner
https://www.morningstarsecurity.com/research/whatweb
GNU General Public License v2.0
5.58k stars 909 forks

Add IIS 8.5 "Under Construction" detection #353

themaxdavitt closed 3 years ago

themaxdavitt commented 3 years ago

Should be pretty self-explanatory. I'm pretty new to WhatWeb and its plugin development practices so please let me know if I can do anything to make reviewing PRs easier for you guys in the future!

urbanadventurer commented 3 years ago

Thanks @themaxdavitt.

Could you add your name to the authors section and bump the version number too so that you get credit for your contribution?

urbanadventurer commented 3 years ago

I noticed that the IIS 8.5 under construction pattern is very similar to the IIS 7.x under construction pattern. Would you be interested in researching what all the common versions of the IIS under construction pages look like?

themaxdavitt commented 3 years ago

Sure! I will add a commit making those changes + research all the different IIS under construction pages soon. :)

themaxdavitt commented 3 years ago

Alright, so I looked on Shodan and found a handful of sites that looked like they fit these and captured the HAR files for them from Chrome: iis_tests.tar.gz

$ ls
iis_10.har  iis_5_1.har  iis_5_1__404.har  iis_6.har  iis_6__404.har  iis_7_5.har  iis_8.har  iis_8_5.har

You can view them in this HAR analyzer.

What do you think we should do from here? Personally, I think comparing against MD5 hashes might be easy, so I wrote a Node.js-based tool to do that:

$ node md5_har.js iis_tests/*
# iis_tests/iis_10.har
https://[redacted]/ 242c23ea412530c7d94b77a7a978c176
https://[redacted]/iisstart.png 7558b529a6a427f886ec405a097ec6fe
# iis_tests/iis_5_1.har
http://[redacted]/  1aa5ac14bd1b37be8a5f6c67e77ba6ea
http://[redacted]/pagerror.gif  fec387a99c6f60e1d7ca801e975e5743
# iis_tests/iis_5_1__404.har
http://[redacted]/test  979b3d197cf71be7f98c9d9e9acb61c0
# iis_tests/iis_6.har
http://[redacted]/  d36ef6356fa2aa546f1da2bb003c17b1
http://[redacted]/pagerror.gif  fec387a99c6f60e1d7ca801e975e5743
# iis_tests/iis_6__404.har
http://[redacted]/test  23d6b92bc7eb100fc1294e6b124b7e75
# iis_tests/iis_7_5.har
http://[redacted]/  dfbd1ee66a4e792349591b88660c0956
http://[redacted]/welcome.png   5aace0054fe556c7d8d17c0af33d679c
# iis_tests/iis_8.har
http://[redacted]/  bc7e06d3d10fc8b8eb20ca44cf186557
http://[redacted]/iis-8.png 16513ac8af24f4970fca52d7e11fe37f
http://[redacted]/ws8-brand.png af8b1f275e79ca3944df1372c62e0e39
http://[redacted]/msweb-brand.png   4bd2d28772e3d0bd7746f435eea2f8c2
http://[redacted]/bkg-gry.jpg   c853d1851da4d0c0c5177013a9b9b094
# iis_tests/iis_8_5.har
http://[redacted]/  dea139153d780fdc018caefdbd600c1c
http://[redacted]/iis-85.png    7558b529a6a427f886ec405a097ec6fe

(I redacted the IP addresses in this comment to help keep them from being picked up by search engines, but they're still accessible in the HAR files.)
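The source of md5_har.js is not included in this thread. As a rough sketch of the same idea in Python, the snippet below walks every entry of a parsed HAR file and hashes the response body. The function names `body_bytes`, `md5_entries`, and `md5_har_file` are made up for illustration; the field names (`log.entries`, `request.url`, `response.content.text`, `response.content.encoding`) come from the HAR format itself.

```python
import base64
import hashlib
import json


def body_bytes(content):
    """Return the response body bytes for a HAR `content` object."""
    text = content.get("text", "")
    if content.get("encoding") == "base64":
        # Binary bodies (images etc.) are usually stored base64-encoded.
        return base64.b64decode(text)
    # Assumption: the original body was UTF-8. For other charsets the
    # re-encoded bytes (and therefore the hash) may differ from the wire bytes.
    return text.encode("utf-8")


def md5_entries(har):
    """Return a list of (url, md5 hex digest) for every entry in a HAR dict."""
    results = []
    for entry in har["log"]["entries"]:
        body = body_bytes(entry["response"]["content"])
        results.append((entry["request"]["url"], hashlib.md5(body).hexdigest()))
    return results


def md5_har_file(path):
    """Load a HAR file from disk and hash every response body in it."""
    with open(path, encoding="utf-8") as handle:
        return md5_entries(json.load(handle))
```

Calling `md5_har_file("iis_tests/iis_10.har")` would return one pair per captured request, although the digests are not guaranteed to match the Node.js tool's output if the two handle text encodings differently.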

urbanadventurer commented 3 years ago

Using HAR could be a good way to deal with testing of webpages, but I don't know what tools are available. I wouldn't advise using hashes for these pages, except for detecting the image files.
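To illustrate what image hashes can and cannot tell apart, here is a small Python sketch that indexes a subset of the image digests from the md5_har.js output above. The version labels are assumed from the HAR file names (iis_10.har, iis_5_1.har, and so on), and `versions_by_hash` is a hypothetical helper, not part of WhatWeb.

```python
from collections import defaultdict

# (assumed IIS version, image file name, MD5) copied from the output above.
# Only one image per capture is listed here.
IMAGE_HASHES = [
    ("10", "iisstart.png", "7558b529a6a427f886ec405a097ec6fe"),
    ("5.1", "pagerror.gif", "fec387a99c6f60e1d7ca801e975e5743"),
    ("6", "pagerror.gif", "fec387a99c6f60e1d7ca801e975e5743"),
    ("7.5", "welcome.png", "5aace0054fe556c7d8d17c0af33d679c"),
    ("8", "iis-8.png", "16513ac8af24f4970fca52d7e11fe37f"),
    ("8.5", "iis-85.png", "7558b529a6a427f886ec405a097ec6fe"),
]


def versions_by_hash(rows):
    """Group version labels by image digest, preserving input order."""
    index = defaultdict(list)
    for version, _name, digest in rows:
        index[digest].append(version)
    return dict(index)


INDEX = versions_by_hash(IMAGE_HASHES)

# Digests that map to more than one version cannot identify a version alone.
AMBIGUOUS = {digest: found for digest, found in INDEX.items() if len(found) > 1}
```

In this sample, the 10 and 8.5 captures share one image digest and the 5.1 and 6 captures share another, so an image hash on its own would narrow the version down but not always pin it.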

I just made up a bash script with jq to extract the HTML response of each.

for i in *.har; do
  # Write the first entry's response body to <name>.har.txt
  echo "$i"; jq -r '.log.entries[0].response.content.text' < "$i" > "$i.txt"
done

Next I used WhatWeb to test which pages would be detected by your updated plugin.

./whatweb -p microsoft-iis  ~/Downloads/tmp_iis/*txt
/Users/andrew/Downloads/tmp_iis/iis_10.har.txt [ Unassigned]
/Users/andrew/Downloads/tmp_iis/iis_6.har.txt [ Unassigned] Microsoft-IIS[Under Construction]
/Users/andrew/Downloads/tmp_iis/iis_5_1.har.txt [ Unassigned] Microsoft-IIS[Under Construction]
/Users/andrew/Downloads/tmp_iis/iis_5_1__404.har.txt [ Unassigned]
/Users/andrew/Downloads/tmp_iis/iis_6__404.har.txt [ Unassigned]
/Users/andrew/Downloads/tmp_iis/iis_7_5.har.txt [ Unassigned] Microsoft-IIS[Under Construction]
/Users/andrew/Downloads/tmp_iis/iis_8.har.txt [ Unassigned]
/Users/andrew/Downloads/tmp_iis/iis_8_5.har.txt [ Unassigned] Microsoft-IIS[Under Construction]

I can see that there is room for improvement in detecting the default IIS page, at least based on the response body alone, without taking headers into consideration. If I were serious about updating this plugin, I would also collect examples in other languages for each version.

themaxdavitt commented 3 years ago

I'll admit that I'm not super familiar with it, but it looks like there's a list of tools that have adopted the HAR format here (e.g. harx looks like it can extract files from them). That bash script should work fine as long as the entry's response isn't base64-encoded (IIRC you should be able to check .log.entries[0].response.content.encoding). Text formats are probably stored unencoded, though whether to encode them may be an implementation detail, IDK. Either way, this seems like a cool workflow for testing plugins, and I will definitely be using it. Thanks! 👍
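For what it's worth, the encoding check described here could look something like the following Python sketch. It mirrors the bash script by reading only the first entry, but decodes the body when `content.encoding` is `base64`. The names `first_response_text` and `extract_to_txt` are hypothetical, and decoding as UTF-8 is an assumption about the page's charset.

```python
import base64
import json


def first_response_text(har):
    """Return the first entry's response body as text, decoding base64 if set."""
    content = har["log"]["entries"][0]["response"]["content"]
    text = content.get("text", "")
    if content.get("encoding") == "base64":
        # Assumption: the decoded body is UTF-8 text; invalid bytes are replaced.
        return base64.b64decode(text).decode("utf-8", errors="replace")
    return text


def extract_to_txt(har_path):
    """Write the first response body next to the HAR file as <name>.har.txt."""
    with open(har_path, encoding="utf-8") as handle:
        har = json.load(handle)
    out_path = har_path + ".txt"
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(first_response_text(har))
    return out_path
```

The resulting .txt files should be usable with WhatWeb in the same way as the ones produced by the jq loop.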

You also make a great point about how pages in different languages will totally break the hashing idea; I hadn't considered that. I'm personally not interested in comparing the translations, but in the future I might go back and fix the matches for the HARs I sent.
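A tiny Python sketch of why a whole-page hash is fragile while a pattern match is not. The two page strings are invented placeholders, not real IIS output, and `IMAGE_PATTERN` is a hypothetical pattern keyed on the `iis-85.png` file name seen in the HAR capture.

```python
import hashlib
import re

# Invented stand-ins for two localized default pages; NOT real IIS output.
PAGE_EN = '<title>Placeholder</title><img src="iis-85.png" alt="Welcome">'
PAGE_DE = '<title>Placeholder</title><img src="iis-85.png" alt="Willkommen">'

# Hypothetical pattern that only looks at the image reference.
IMAGE_PATTERN = re.compile(r'<img src="iis-85\.png"')


def md5_hex(text):
    """MD5 hex digest of a string encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Any difference in the body, even one translated word, changes the digest.
same_hash = md5_hex(PAGE_EN) == md5_hex(PAGE_DE)

# The pattern ignores the translated text and matches both pages.
both_match = all(IMAGE_PATTERN.search(page) for page in (PAGE_EN, PAGE_DE))
```

Under these assumptions `same_hash` is false and `both_match` is true, which is the trade-off between hashing the page and matching on a stable fragment of it.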

urbanadventurer commented 3 years ago

To be fair, I don't actually know if Microsoft IIS has any regional or language differences in the HTML output.