Closed Ehsan-U closed 2 weeks ago
Yes, the dupefilter only applies to scheduled requests, not ones downloaded via ExecutionEngine.download().
What's the ideal approach to handling duplicates in this case? Can I use the RFPDupeFilter instance in the files pipeline?
Please ask questions about your code on suitable platforms: https://scrapy.org/community/
The built-in RFPDupeFilter works correctly for normal requests, but it does not filter duplicate requests generated through the media pipeline. Is this the expected behavior?
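
On the follow-up question about handling duplicates for pipeline downloads: one possible workaround, since those requests never pass through the scheduler's dupefilter, is to keep a separate "seen" set inside the pipeline itself. The sketch below is only an illustration under that assumption. `SeenUrlFilter` and `filter_new` are hypothetical names, not part of Scrapy, and the SHA-1 of the raw URL is a simplification that does not reproduce Scrapy's own request fingerprinting.

```python
import hashlib


class SeenUrlFilter:
    """Hypothetical helper (not a Scrapy class): remembers URLs already processed."""

    def __init__(self):
        # Hex digests of every URL seen so far.
        self._seen = set()

    @staticmethod
    def _fingerprint(url):
        # Simplified fingerprint: SHA-1 of the stripped URL string.
        return hashlib.sha1(url.strip().encode("utf-8")).hexdigest()

    def filter_new(self, urls):
        """Return the URLs not seen before, in order, and record them as seen."""
        fresh = []
        for url in urls:
            fp = self._fingerprint(url)
            if fp not in self._seen:
                self._seen.add(fp)
                fresh.append(url)
        return fresh


url_filter = SeenUrlFilter()
first = url_filter.filter_new([
    "https://example.com/a.pdf",
    "https://example.com/b.pdf",
    "https://example.com/a.pdf",  # duplicate within the same batch
])
second = url_filter.filter_new([
    "https://example.com/b.pdf",  # already seen in the first batch
    "https://example.com/c.pdf",
])
```

In a `FilesPipeline` subclass, something like this could be called from an overridden `get_media_requests(self, item, info)`, yielding a `scrapy.Request` only for the URLs that `filter_new` returns from the item's `file_urls` field. Whether that fits depends on the project, for example on whether the seen set needs to persist between runs.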