joshuaburkhart / skipfish

Automatically exported from code.google.com/p/skipfish
Apache License 2.0

Add ability to exclude pages by content #199

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Hi. This is a feature request.

-X is not enough in some cases. I am testing a site that always responds with 
200 OK to any request. Skipfish is quite good at recognizing pseudo-404 pages, 
but in this case it fails: it treats some pages as real when they are actually 
just a 404 stub served with status code 200. For some reason it also doesn't 
crawl the links on the main page (probably its signature matches one of the 
pseudo-404 signatures skipfish found earlier, but I don't know for sure).

So it would be nice to be able to provide a regex for excluding such pages by 
content (as arachni does). I'm not sure about the performance impact, but a 
slower scan is still better than nothing.
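
To illustrate the idea, here is a minimal sketch in Python of what such a content-based exclusion check might look like. It is not skipfish code and does not reflect any existing skipfish option. The function names (`compile_exclusions`, `is_excluded`) and the sample patterns are hypothetical, chosen only to show the matching logic.

```python
import re

def compile_exclusions(patterns):
    """Compile user-supplied regexes once, so each response costs only a search."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]

def is_excluded(body, compiled):
    """Return True if the response body matches any exclusion pattern."""
    return any(rx.search(body) for rx in compiled)

# Hypothetical patterns describing the site's "200 OK" error stub.
rules = compile_exclusions([r"page\s+not\s+found", r"<title>\s*Error\s+404"])

stub = "<html><title>Error 404</title><body>Sorry, page not found.</body></html>"
real = "<html><title>Products</title><body><a href='/cart'>Cart</a></body></html>"

for name, body in (("stub", stub), ("real", real)):
    print(name, "excluded" if is_excluded(body, rules) else "kept")
```

Compiling the patterns up front keeps the per-response cost to one regex search per pattern, which is where the performance concern above would show up on large scans.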

Original issue reported on code.google.com by maxxa...@gmail.com on 29 Nov 2013 at 10:14