Different solutions to test:
Add a random pause between each request. Make good use of sessions:
1) Keep the same session for a number of requests (30 to 60).
2) After 30 to 60 requests, clear your cookies and change the user agent. You can use this simple Python package: https://pypi.org/project/shadow-useragent/
3) If that still does not work, rotate your IP over time (every 30 to 60 requests, for instance) through a proxy provider, and rotate your user agent and clear your cookies at the same time.
You should now look random to most websites. If you still run into bot mitigation (reCAPTCHAs) or specialized anti-scraping services, things get trickier.
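The steps above can be sketched with `requests`. This is a minimal illustration, not a drop-in solution: the user-agent strings, the `scrape` helper, and the `proxy` parameter are assumptions for the example (in practice you might pull fresh user agents from the shadow-useragent package, and the proxy URL would come from your provider).

```python
import random
import time

import requests

# Hypothetical pool of user-agent strings (assumption for this sketch;
# in practice, pull current ones from e.g. the shadow-useragent package).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

def make_session(proxy=None):
    """Fresh session: empty cookie jar plus a randomly chosen user agent."""
    session = requests.Session()
    session.headers["User-Agent"] = random.choice(USER_AGENTS)
    if proxy:  # e.g. "http://user:pass@host:port" from a proxy provider
        session.proxies = {"http": proxy, "https": proxy}
    return session

def scrape(urls, proxy=None):
    """Reuse one session for 30-60 requests, then rotate everything."""
    session = make_session(proxy)
    budget = random.randint(30, 60)  # requests left before rotating
    responses = []
    for url in urls:
        if budget == 0:
            session.close()
            session = make_session(proxy)  # new cookies + new user agent
            budget = random.randint(30, 60)
        responses.append(session.get(url))
        budget -= 1
        time.sleep(random.uniform(1.0, 3.0))  # random pause between requests
    return responses
```

Rotating the cookie jar and the user agent together matters: a "new visitor" carrying last session's cookies, or the same browser fingerprint with fresh cookies, is an easy inconsistency for anti-bot services to spot.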
https://stackoverflow.com/questions/59408534/blocked-from-scraping-a-website-with-scrapy