jamesliu668 closed 6 years ago
Scrapy Cluster uses a customized scheduler, and therefore needs a customized deduplication filter due to its distributed nature. You are free to modify the RFPDupeFilter implementation to meet your needs, but keep in mind it may not give you the behavior you expect.
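To make the "modify it to meet your needs" part concrete, here is a minimal sketch of the general shape of a fingerprint-based duplicate filter. It is not the Scrapy Cluster implementation: the class name `SketchDupeFilter`, the URL-based `request_seen` signature, and the fragment-stripping rule are all assumptions made for illustration, and a plain Python set stands in for whatever shared store a distributed setup would use.

```python
import hashlib


class SketchDupeFilter:
    """Illustrative only: a dependency-free stand-in for a dupefilter."""

    def __init__(self, seen=None):
        # A plain set stands in for the shared store; passing the same set
        # to several filters imitates several workers sharing one backend.
        self.seen = set() if seen is None else seen

    def fingerprint(self, url, method="GET"):
        # Hypothetical customization point: ignore the URL fragment so that
        # /a and /a#top count as the same request.
        normalized = url.split("#", 1)[0]
        raw = f"{method.upper()} {normalized}".encode("utf-8")
        return hashlib.sha1(raw).hexdigest()

    def request_seen(self, url, method="GET"):
        # Return True if an equivalent request was already recorded,
        # otherwise record it and return False.
        fp = self.fingerprint(url, method)
        if fp in self.seen:
            return True
        self.seen.add(fp)
        return False
```

If you change how the fingerprint is computed, every worker sharing the store has to use the same rule, otherwise the same page can be crawled more than once. That is probably the kind of unexpected behavior to watch for.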
Hello,
In the Scrapy project, I found that I can change the DUPEFILTER_CLASS setting to override the duplicate filter class. However, this setting no longer works in scrapy-cluster.
Here is the setting in Scrapy: DUPEFILTER_CLASS