istresearch / scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed on-demand scraping cluster.
http://scrapy-cluster.readthedocs.io/
MIT License

Reduce potential Redis key collisions #44

Closed by madisonb 8 years ago

madisonb commented 8 years ago

Scrapy Cluster may not be the only process operating within a Redis instance. We should add a unique identifier to the beginning of every key we use, so that a pattern such as *:*:queue does not also match keys that belong to other applications sharing that instance.

I propose using sc: as the identifier, so that every key Scrapy Cluster uses is easily distinguishable from other keys in use. The new query would be sc:*:*:queue.
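To illustrate the collision, here is a minimal sketch that approximates Redis glob matching with Python's `fnmatch` module. The key names are hypothetical and are not taken from the Scrapy Cluster code; only the two patterns come from this issue.

```python
from fnmatch import fnmatchcase

# Hypothetical key names, for illustration only.
keys = [
    "link:example.com:queue",     # an unprefixed Scrapy Cluster style key
    "sc:link:example.com:queue",  # the same key with the proposed sc: prefix
    "jobs:email:queue",           # a key owned by some other application
]


def matching(pattern, names):
    """Return the names that match a glob pattern (approximates Redis KEYS/SCAN)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# The current pattern also picks up the other application's key.
unprefixed = matching("*:*:queue", keys)

# The prefixed pattern only picks up keys that start with sc:.
prefixed = matching("sc:*:*:queue", keys)

print(unprefixed)
print(prefixed)
```

Because `*` matches any run of characters, including colons, the unprefixed pattern cannot tell the applications apart; the literal `sc:` prefix is what narrows the match.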

madisonb commented 8 years ago

Actually, a much easier way to fix this is to use a different Redis DB. Commits 434f80947e3a381696686530739c7f6f27f2f9dd and 47b24a197577660dbcf67b8ec17eff308dc9b7bf update everything to accept a REDIS_DB variable that sets the database to use within Redis.
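As a rough sketch of how a REDIS_DB setting could be threaded into a connection (this is not the code from the commits above): the helper `redis_connection_kwargs`, the `REDIS_HOST` and `REDIS_PORT` names, and the default values are assumptions made for illustration. Only REDIS_DB is named in this issue.

```python
def redis_connection_kwargs(settings):
    """Build keyword arguments for a Redis client from a settings mapping.

    Hypothetical helper: REDIS_HOST, REDIS_PORT and the defaults are assumed.
    """
    return {
        "host": settings.get("REDIS_HOST", "localhost"),
        "port": int(settings.get("REDIS_PORT", 6379)),
        # Selecting a separate numbered database keeps keys apart from
        # other applications that use the default database 0.
        "db": int(settings.get("REDIS_DB", 0)),
    }


settings = {"REDIS_HOST": "localhost", "REDIS_PORT": 6379, "REDIS_DB": 1}
kwargs = redis_connection_kwargs(settings)
print(kwargs)

# With the redis-py package installed, the client would be created as:
#   import redis
#   client = redis.Redis(**kwargs)
```

With a dedicated database, the existing *:*:queue pattern only sees keys in that database, so no key renaming is needed.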