istresearch / scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
http://scrapy-cluster.readthedocs.io/
MIT License

Redis Monitor Locks for processing #43

Closed: madisonb closed this issue 8 years ago

madisonb commented 8 years ago

To scale the Redis Monitor horizontally, we need to ensure that only a single Redis Monitor process operates on a given key at a time. This involves adding a RedLock lock around the key being processed.

This eliminates the Redis Monitor as a single point of failure.
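
A minimal sketch of the per-key locking idea, not the project's actual implementation. It shows a simplified single-instance lock (acquire with set-if-absent plus a TTL, release only if the token still matches) rather than the full multi-node RedLock algorithm. `InMemoryStore`, `process_key`, and the `lock:` key prefix are hypothetical names; the store is an in-memory stand-in so the example runs without a Redis server.

```python
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for a Redis client (assumption, not a real API).

    It mimics only what the sketch needs: set-if-absent with a TTL and
    delete-if-value-matches.
    """

    def __init__(self):
        self._data = {}  # key -> (value, expiry as a time.monotonic() timestamp)

    def _live(self, key):
        # Return the entry for key, dropping it first if its TTL has passed.
        entry = self._data.get(key)
        if entry is not None and entry[1] <= time.monotonic():
            del self._data[key]
            entry = None
        return entry

    def set_if_absent(self, key, value, ttl_ms):
        # With redis-py this step would roughly correspond to
        # client.set(key, value, nx=True, px=ttl_ms).
        if self._live(key) is not None:
            return False
        self._data[key] = (value, time.monotonic() + ttl_ms / 1000.0)
        return True

    def delete_if_equal(self, key, value):
        # On a real Redis server the compare-and-delete would need to be
        # atomic (for example, done inside a server-side script).
        entry = self._live(key)
        if entry is not None and entry[0] == value:
            del self._data[key]
            return True
        return False


def process_key(store, key, handler, ttl_ms=5000):
    """Run handler(key) only if this monitor wins the lock for key.

    Returns True if the key was processed here, False if another
    monitor already holds the lock.
    """
    lock_key = "lock:" + key   # hypothetical naming scheme
    token = uuid.uuid4().hex   # unique per attempt, so we only release our own lock
    if not store.set_if_absent(lock_key, token, ttl_ms):
        return False
    try:
        handler(key)
    finally:
        store.delete_if_equal(lock_key, token)
    return True
```

The TTL is what keeps a crashed monitor from holding a key forever: if the process dies mid-job, the lock expires and another monitor can pick the key up.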

madisonb commented 8 years ago

Work on this should also allow the Redis Monitor to spawn non-blocking jobs to complete the required action; otherwise, larger jobs can block other requests within a particular instance. Use multiprocessing for this.

madisonb commented 8 years ago

Note that non-blocking jobs and the plugin architecture used don't really go hand in hand. Instead of using multiprocessing, I opted to allow scaling out independent Redis Monitor processes, which serves the same purpose.