ScottMansfield / widow

Distributed, asynchronous web crawler
GNU Lesser General Public License v2.1

Implement rate-limiting on a per-host basis #10

Open ScottMansfield opened 9 years ago

ScottMansfield commented 9 years ago

The rate limiter should run out-of-process and survive restarts. Ideally it would be centralized somehow and off-box.

10 concurrent connections per box might be a bit too much, but we'll see at larger scale.

ScottMansfield commented 9 years ago

By host I mean the Host header, so really the DNS name.
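
A rough sketch of what keying the limit by DNS name could look like, in Python for illustration only. The names `host_key` and `PerHostRateLimiter`, the fixed minimum interval between requests, and the dict-like `store` are all assumptions, not anything that exists in widow. The in-memory dict is a stand-in: to be off-box and survive restarts, it would have to be swapped for a shared external store, and the read-then-write in `try_acquire` would need to be made atomic there.

```python
import time
from urllib.parse import urlsplit


def host_key(url: str) -> str:
    """Return the DNS name of a URL, lowercased and without the port."""
    host = urlsplit(url).hostname  # urlsplit lowercases the hostname
    if host is None:
        raise ValueError(f"URL has no host: {url!r}")
    return host.rstrip(".")  # treat "example.com." and "example.com" alike


class PerHostRateLimiter:
    """Allow at most one request per host every `min_interval` seconds.

    `store` is any dict-like mapping from host to the next allowed
    timestamp. A plain dict is used here (hypothetical placeholder for a
    centralized store shared by all crawler boxes).
    """

    def __init__(self, min_interval: float, store=None, clock=time.time):
        self.min_interval = min_interval
        self.store = {} if store is None else store
        # Wall-clock time, so timestamps stay comparable across processes
        # and restarts (a monotonic clock would not be).
        self.clock = clock

    def try_acquire(self, url: str) -> bool:
        """Return True and reserve the slot if the host may be fetched now."""
        key = host_key(url)
        now = self.clock()
        if now < self.store.get(key, 0.0):
            return False
        # Not atomic: fine for a single process, not for a shared store.
        self.store[key] = now + self.min_interval
        return True
```

With this shape, `http://Example.COM:8080/a` and `https://example.com/b` share one budget, while a different DNS name gets its own. A worker that receives `False` would requeue the URL rather than block.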