aharris02 closed this issue 4 years ago
Clearly, the next step is a worker mode in which workers run on multiple machines and report back to a central node, which then posts the actual status. With 30 people all running the checker, you get good coverage, and if one person gets blocked, the others will still be OK.
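To make the worker idea a bit more concrete, here is a minimal in-memory sketch of the central node's aggregation logic. Everything in it is hypothetical and not taken from the project: the `CentralNode` class, the `WorkerReport` shape, the three status values, and the 60-second freshness window. The one design choice it illustrates is that a `blocked` report from one worker never overrides a real answer from another.

```typescript
// Hypothetical status values; "blocked" means the worker could not get an answer.
type WorkerStatus = "available" | "unavailable" | "blocked";

interface WorkerReport {
  workerId: string;     // which machine sent the report
  target: string;       // hypothetical identifier of the thing being checked
  status: WorkerStatus;
  reportedAt: number;   // milliseconds since epoch
}

class CentralNode {
  // target -> (workerId -> latest report from that worker)
  private latest: Map<string, Map<string, WorkerReport>> = new Map();

  // Store a worker's newest report, replacing its previous one for that target.
  report(r: WorkerReport): void {
    let byWorker = this.latest.get(r.target);
    if (!byWorker) {
      byWorker = new Map();
      this.latest.set(r.target, byWorker);
    }
    byWorker.set(r.workerId, r);
  }

  // Combine all fresh reports for a target into the one status to post.
  status(target: string, now: number, maxAgeMs: number = 60000): WorkerStatus | "unknown" {
    const byWorker = this.latest.get(target);
    if (!byWorker) return "unknown";
    const fresh = Array.from(byWorker.values()).filter(
      (r) => now - r.reportedAt <= maxAgeMs
    );
    if (fresh.length === 0) return "unknown";
    // Any real answer wins over "blocked"; "available" wins over "unavailable".
    if (fresh.some((r) => r.status === "available")) return "available";
    if (fresh.some((r) => r.status === "unavailable")) return "unavailable";
    return "blocked";
  }
}
```

In a real deployment the workers would send these reports over HTTP or a message queue; that transport is left out here so the aggregation rule stays readable.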
Hi, if you are interested, Scrapoxy 4 is out!
Scrapoxy is an open-source proxy aggregator that lets you manage all your proxies in one place, rather than spreading them across multiple scrapers.
Smartly designed for efficient traffic routing, Scrapoxy minimizes bans and boosts success rates.
The tech stack is built on the latest Node.js and TypeScript, using the NestJS and Angular frameworks.
Here are the key features:
Check out https://scrapoxy.io/!
I could play around with that today. Thanks for the update.
Description
The left image shows a server on AWS; the right shows PowerShell on my local computer. As far as I can tell, the AWS server's requests are blocked because the IP address ranges of AWS regions are easily discoverable.
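For context on why that is plausible: AWS publishes its address ranges as CIDR prefixes in a public `ip-ranges.json` file, so a site can test each client IP against that list. Below is a rough sketch of such a check for IPv4. The helper names (`ipv4ToInt`, `inCidr`, `looksLikeDatacenter`) are made up for illustration, the sample prefix is a reserved documentation range rather than a real AWS range, and I am only guessing that the target site does something like this.

```typescript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    throw new Error(`Invalid IPv4 address: ${ip}`);
  }
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// True if `ip` falls inside the CIDR block, e.g. "203.0.113.0/24".
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) {
    throw new Error(`Invalid CIDR block: ${cidr}`);
  }
  // Build the network mask; a /0 block matches every address.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Hypothetical blocklist check: does the IP match any known hosting prefix?
function looksLikeDatacenter(ip: string, prefixes: string[]): boolean {
  return prefixes.some((prefix) => inCidr(ip, prefix));
}

// Placeholder prefix (TEST-NET-3), standing in for a published cloud range.
const samplePrefixes = ["203.0.113.0/24"];
console.log(looksLikeDatacenter("203.0.113.7", samplePrefixes));
console.log(looksLikeDatacenter("198.51.100.7", samplePrefixes));
```

A residential IP like the one on my local machine would not match any such prefix, which would fit what the two screenshots show.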
Possible solution
I'm sure there are other solutions, but adding support for a rotating pool of proxied IP addresses would help reduce blocks. This can be done directly in the app by switching to a different proxy each time the browser session is stopped and restarted. Changing the IP address for each query would instead require a separate proxy server in front of the app.
https://scrapoxy.io/ https://zenscrape.com/how-to-build-a-simple-proxy-rotator-in-node-js/
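The per-session variant could look roughly like the sketch below: a round-robin rotator hands out the next proxy whenever a browser session is started. The `ProxyRotator` class, the `browserArgsFor` helper, and the proxy URLs are all hypothetical. The only external fact relied on is that Chromium accepts a `--proxy-server` command-line flag; how the app actually launches its browser is an assumption on my part.

```typescript
// Hands out proxies in round-robin order, one per browser session.
class ProxyRotator {
  private readonly proxies: string[];
  private index = 0;

  constructor(proxies: string[]) {
    if (proxies.length === 0) {
      throw new Error("At least one proxy is required");
    }
    this.proxies = proxies.slice(); // copy so later caller edits do not affect us
  }

  // Return the next proxy and advance, wrapping around at the end of the list.
  next(): string {
    const proxy = this.proxies[this.index];
    this.index = (this.index + 1) % this.proxies.length;
    return proxy;
  }
}

// Build the Chromium launch arguments for a session that uses the given proxy.
function browserArgsFor(proxy: string): string[] {
  return [`--proxy-server=${proxy}`];
}

// Placeholder proxy endpoints; replace with real ones.
const rotator = new ProxyRotator([
  "http://proxy-a.example:8080",
  "http://proxy-b.example:8080",
]);

// Each new browser session would be launched with a fresh set of arguments.
console.log(browserArgsFor(rotator.next()));
console.log(browserArgsFor(rotator.next()));
```

Because the proxy is fixed at launch time, this only changes the IP between sessions. Rotating per request is the case that needs a standalone proxy in front of the app, which is what tools like Scrapoxy provide.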