santoshbs opened this issue 3 years ago (status: Open)
Is there a way to get the most recent version (a single version) of a full site crawl for a list of URLs?
This isn't supported by the current API, but adding an -l/--latest option seems like a good feature request. In the meantime, you might be interested in scrapy-wayback-machine. That project provides a Scrapy middleware that this project uses under the hood, and it offers a lot more flexibility for customizing behavior.
I'll leave this open as a feature request.
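Until something like a --latest option exists, one possible workaround is to crawl all versions and then keep only the newest snapshot of each URL. The following is a minimal sketch of that post-filtering step, not a feature of either project. It assumes each crawled snapshot is available as a (url, timestamp) pair, with the timestamp in the Wayback Machine's 14-digit YYYYMMDDhhmmss form (which sorts correctly as a plain string). The names `latest_snapshots` and `snapshot_url` and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def latest_snapshots(snapshots: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each URL to the timestamp of its most recent snapshot.

    Assumes timestamps are fixed-width YYYYMMDDhhmmss strings, so a
    plain string comparison orders them chronologically.
    """
    latest: Dict[str, str] = {}
    for url, timestamp in snapshots:
        if url not in latest or timestamp > latest[url]:
            latest[url] = timestamp
    return latest


def snapshot_url(url: str, timestamp: str) -> str:
    """Build the archived-page address for one snapshot."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


# Hypothetical crawl output: two versions of one page, one of another.
crawled = [
    ("http://example.com/", "20150101000000"),
    ("http://example.com/", "20190630120000"),
    ("http://example.com/about", "20170315080000"),
]

result = latest_snapshots(crawled)
for url, timestamp in sorted(result.items()):
    print(snapshot_url(url, timestamp))
```

This keeps one entry per URL regardless of how many versions the crawl returned, at the cost of downloading the older versions first. Filtering inside the middleware would avoid that, but it depends on how the crawl is configured.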