Add ability to exclude pages by content

Hi. This is a feature request.

-X is not enough in some cases. I am testing a site that responds with 200 OK 
to every request. Skipfish is usually quite good at recognizing pseudo-404 
pages, but in this case it fails: it treats some pages as real when they are 
actually just a 404 stub served with status 200. For some reason it also 
doesn't crawl the links on the main page (probably because its signature 
matches one of the pseudo-404 signatures skipfish found earlier, but I don't 
know for sure).

So it would be nice to be able to provide a regex for excluding such pages by 
content (like arachni does). I'm not sure about the performance impact, but 
slower is still better than nothing.
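
To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is plain Python, not skipfish code: the patterns, the `is_excluded` and `filter_pages` helpers, and the sample pages are all made up for illustration, and the real stub text on the site would differ.

```python
import re

# Hypothetical patterns describing the 404 stub that comes back with status 200.
# These are illustrative only; they are not an existing skipfish option.
EXCLUDE_PATTERNS = [
    re.compile(r"<title>\s*Page not found\s*</title>", re.IGNORECASE),
    re.compile(r"The page you requested does not exist", re.IGNORECASE),
]


def is_excluded(body):
    """Return True if the response body matches any exclusion pattern."""
    return any(pattern.search(body) for pattern in EXCLUDE_PATTERNS)


def filter_pages(pages):
    """Keep only (url, status, body) tuples whose body is not excluded.

    The status code is ignored on purpose, because the server answers
    200 OK for everything and so carries no information.
    """
    return [page for page in pages if not is_excluded(page[2])]


# Made-up sample responses: one real page and one 404 stub, both with 200.
sample_pages = [
    ("/", 200, "<html><title>Home</title><a href='/about'>About</a></html>"),
    ("/nope", 200, "<html><title>Page not found</title></html>"),
]

kept_urls = [url for url, _status, _body in filter_pages(sample_pages)]
```

With a check like this, only pages whose content does not match the stub would be crawled and reported, regardless of the status code.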

Original issue reported on code.google.com by maxxa...@gmail.com on 29 Nov 2013 at 10:14
