ArchiveTeam / grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

Add feature to avoid queuing any more URLs #130

Closed: ivan closed this issue 6 years ago

ivan commented 6 years ago

...while still processing existing URLs in the queue.

Motivated by crawls of Wikipedia, which quickly become very slow.
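
To illustrate the requested behavior, here is a minimal sketch in Python. It is not grab-site's code: the `Crawl` class, the `stop_queuing` method, the `fetch_links` callback, and the `stop_after` parameter are all hypothetical names made up for this example. The idea is only that a flag can turn off queuing of newly discovered URLs while the loop keeps draining what is already in the queue.

```python
from collections import deque


class Crawl:
    """Toy crawl loop (hypothetical, not grab-site's implementation)."""

    def __init__(self, seeds, fetch_links):
        self.queue = deque(seeds)        # URLs waiting to be processed
        self.seen = set(seeds)           # URLs that were ever queued
        self.fetch_links = fetch_links   # callback: url -> list of linked URLs
        self.queue_new_urls = True       # the switch this issue asks for
        self.processed = []

    def stop_queuing(self):
        """Stop adding newly discovered URLs; already queued URLs still run."""
        self.queue_new_urls = False

    def step(self):
        url = self.queue.popleft()
        self.processed.append(url)
        links = self.fetch_links(url)
        if not self.queue_new_urls:
            return  # the page is still fetched, but its links are not queued
        for link in links:
            if link not in self.seen:
                self.seen.add(link)
                self.queue.append(link)

    def run(self, stop_after=None):
        """Process the queue; optionally flip the switch after N URLs."""
        while self.queue:
            if stop_after is not None and len(self.processed) >= stop_after:
                self.stop_queuing()
            self.step()
        return self.processed


# A made-up link graph standing in for a real site.
LINKS = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": ["f"], "e": [], "f": []}


def fetch(url):
    return LINKS.get(url, [])


full = Crawl(["a"], fetch).run()
limited = Crawl(["a"], fetch).run(stop_after=1)
print(full)     # ['a', 'b', 'c', 'd', 'e', 'f']
print(limited)  # ['a', 'b', 'c']
```

In the limited run, `b` and `c` were already queued when the switch flipped, so they are still processed, but the links found on them (`d`, `e`) are never added.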

ivan commented 6 years ago

Implemented in cdd79287502ce9b46be53c7b96b5c2c6caa27539