fabienvauchelles / scrapoxy

Scrapoxy is a super proxy aggregator, allowing you to manage all proxies in one place 🎯, rather than spreading it across multiple scrapers 🕸️. It also smartly handles traffic routing 🔀 to minimize bans and increase success rates 🚀.
http://scrapoxy.io
MIT License
1.89k stars 232 forks

Anyone else experience really high data out costs using AWS? #204

Open usurpertothethread opened 3 years ago

usurpertothethread commented 3 years ago

Hello,

I noticed high data-out costs using the t2.nano AMI. I made sure that everything was in the same availability zone and switched the code to use private IPs, but I am still seeing really high data-out costs. Before I dig into another solution (this one works really well for what I am doing), I wanted to see if anyone else has experienced the same thing and whether they found a solution.

To give more info: I am building a web scraper. On average I am processing about 30 requests a minute. I understand why the volume of data is large, but from my understanding you should only be charged for data-out. The bulk of that data-out goes from the proxy back to the main EC2 instance, and that traffic is sent over private IPs within the same availability zone.
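For a sense of scale, here is a rough back-of-the-envelope sketch of how much data that request rate could move in a month. Only the 30 requests per minute comes from this post; the average response size and the per-GB price are placeholder assumptions to be replaced with real numbers from the scraper's logs and the region's pricing page.

```python
# Rough estimate of monthly transfer volume for a scraper.
# Only REQUESTS_PER_MINUTE comes from the post; the other constants
# are assumptions for illustration and should be replaced.

REQUESTS_PER_MINUTE = 30   # from the post
AVG_RESPONSE_KB = 500      # ASSUMPTION: average size of one scraped response
PRICE_PER_GB_USD = 0.09    # ASSUMPTION: per-GB rate, check your region's pricing


def monthly_gb(requests_per_minute: float, avg_response_kb: float, days: int = 30) -> float:
    """Return the approximate data volume in GB moved over `days` days."""
    total_requests = requests_per_minute * 60 * 24 * days
    return total_requests * avg_response_kb / 1024 / 1024


def monthly_cost(gb: float, price_per_gb: float) -> float:
    """Return the cost in USD if every GB were billed at `price_per_gb`."""
    return gb * price_per_gb


if __name__ == "__main__":
    gb = monthly_gb(REQUESTS_PER_MINUTE, AVG_RESPONSE_KB)
    cost = monthly_cost(gb, PRICE_PER_GB_USD)
    print(f"~{gb:.0f} GB per month, ~${cost:.2f} if all of it were billed")
```

Comparing this estimate with the billed volume in the cost report should show whether the charge matches one hop of the traffic or more than that.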

Any insight or experience with this would be appreciated.

fabienvauchelles commented 7 months ago

An idea could be to use Spot Instances.