wet-boew / wet-boew-wpss

The Web and Open Data Validator (formerly the WPSS Validation Tool) lets web developers and quality assurance testers perform a number of web site, web page, and open data validation tasks at one time.
https://github.com/wet-boew/wet-boew-wpss/wiki

Provide a setting to adjust the speed of crawl per page #117

Open DavidMacDonald opened 4 years ago

DavidMacDonald commented 4 years ago

Sometimes I get flagged as a spam bot when I'm crawling a site, and this is reported to a central service, which then blocks my IP address from many sites.

I think we could solve this by slowing down the request rate. I think it is currently about 1-2 seconds per page at most. It would be good to be able to adjust it over a wide range.
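As a rough illustration of the kind of setting being asked for, here is a minimal sketch in Python of a crawl loop with an adjustable per-page delay. It is not taken from the tool's own code: `crawl_with_delay`, `delay_seconds`, and the injected `fetch`, `sleep`, and `clock` callables are all hypothetical names chosen for this example.

```python
import time
from typing import Callable, Iterable, List


def crawl_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Fetch each URL in order, leaving at least delay_seconds
    between the start of one request and the start of the next.

    All names here are illustrative assumptions, not the tool's API.
    """
    if delay_seconds < 0:
        raise ValueError("delay_seconds must be non-negative")

    results: List[str] = []
    last_start = None  # start time of the previous request, if any
    for url in urls:
        if last_start is not None:
            # Only wait for whatever part of the delay has not already
            # been used up by the previous fetch itself.
            remaining = delay_seconds - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(fetch(url))
    return results
```

With something like this, a user who keeps getting flagged could raise `delay_seconds` to 10 or 30, while someone crawling their own server could set it to 0. The delay is measured from the start of each request, so a slow page does not add extra waiting on top of the configured interval.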

DavidMacDonald commented 4 years ago

Some web sites have automated monitors that log the IP addresses of sources that fire a lot of web requests in a short amount of time. Large corporations subscribe to services where this information is shared. When an IP gets flagged, all the sites that subscribe to the service block it. I've found myself blocked for weeks from shopping sites and the like.