Closed softwarevamp closed 7 years ago
Do you have any documentation showing that scraping blockers like Cloudflare or Incapsula block per subdomain? I have never heard of that. I have even seen cases where many sites and subdomains sit behind a single blocker, so a ban for scraping one site too hard propagates to the other sites as well.
Subdomains don't necessarily live on different servers or use separate anti-scraping software. Regardless, since this project already uses the TLDExtract library, this could be implemented as an optional flag.
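For illustration, here is a rough sketch of how such an optional flag might choose the throttle key. This is not the project's actual code: `throttle_key` and `per_subdomain` are hypothetical names, and the naive fallback used when `tldextract` is not installed is only an assumption to keep the snippet runnable (it gets suffixes like `co.uk` wrong).

```python
from urllib.parse import urlsplit

try:
    import tldextract
    # Empty suffix_list_urls: rely on the bundled suffix snapshot, no network fetch.
    _extract = tldextract.TLDExtract(suffix_list_urls=())
except ImportError:
    _extract = None  # fall back to the naive heuristic below


def throttle_key(url, per_subdomain=False):
    """Return the key that requests to `url` are throttled under (hypothetical helper)."""
    host = (urlsplit(url).hostname or "").lower()
    if per_subdomain:
        # One throttle bucket per full hostname, e.g. "api.example.com".
        return host
    if _extract is not None:
        ext = _extract(url)
        if ext.domain and ext.suffix:
            # One bucket per registered domain, e.g. "example.com".
            return f"{ext.domain}.{ext.suffix}"
        return host  # IP addresses, "localhost", and similar
    # Naive fallback: last two labels. Wrong for multi-part suffixes such as "co.uk".
    return ".".join(host.split(".")[-2:])
```

With the flag off, `www.example.com` and `api.example.com` share one throttle bucket. With it on, each hostname gets its own.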
If you can provide an update here, that would be great. Otherwise I am going to close this soon due to lack of interest and the absence of documentation showing that anti-scraping providers block per subdomain.
Closing.
I have requests like these:
Currently the throttle is domain-based, but I want it to be subdomain-based, because large websites often serve their subdomains from separate servers.