machawk1 / wail

:whale2: Web Archiving Integration Layer: One-Click User Instigated Preservation
https://matkelly.com/wail
MIT License

Add ability to pause/resume crawls without destroying job #218

Open machawk1 opened 8 years ago

machawk1 commented 8 years ago

The API provides hooks for this. https://webarchive.jira.com/wiki/display/Heritrix/Heritrix+3.x+API+Guide#Heritrix3.xAPIGuide-PauseJob
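A minimal sketch of what those hooks could look like from Python, assuming the Heritrix 3.x REST behavior described in the API guide: a POST of `action=pause` or `action=unpause` to the job's URL, with HTTP Digest authentication over a self-signed HTTPS certificate. The engine URL `https://localhost:8443/engine`, the job name, the credentials, and the helper names `build_job_action_request` and `send_job_action` are illustrative assumptions, not part of WAIL.

```python
import ssl
import urllib.parse
import urllib.request

# Assumed default location of a local Heritrix 3.x engine; adjust as needed.
HERITRIX_ENGINE_URL = "https://localhost:8443/engine"


def build_job_action_request(job_name, action, engine_url=HERITRIX_ENGINE_URL):
    """Build (but do not send) a POST request that pauses or unpauses a job."""
    if action not in ("pause", "unpause"):
        raise ValueError("action must be 'pause' or 'unpause'")
    url = f"{engine_url}/job/{urllib.parse.quote(job_name, safe='')}"
    data = urllib.parse.urlencode({"action": action}).encode("ascii")
    # Supplying data makes urllib issue a POST.
    return urllib.request.Request(
        url, data=data, headers={"Accept": "application/xml"}
    )


def send_job_action(job_name, action, username, password,
                    engine_url=HERITRIX_ENGINE_URL):
    """Send the request using Digest auth; returns the HTTP status code."""
    request = build_job_action_request(job_name, action, engine_url)

    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, engine_url, username, password)

    # Heritrix ships with a self-signed certificate, so verification is
    # disabled here. Only reasonable for a local engine.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    opener = urllib.request.build_opener(
        urllib.request.HTTPSHandler(context=context),
        urllib.request.HTTPDigestAuthHandler(password_mgr),
    )
    with opener.open(request, timeout=10) as response:
        return response.status
```

Under these assumptions, `send_job_action("myjob", "pause", "admin", "admin")` followed later by `send_job_action("myjob", "unpause", "admin", "admin")` would suspend and resume the crawl without tearing down the job.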

machawk1 commented 8 years ago

https://github.com/WilliamMayor/hapy also has Python hooks to Heritrix, which would be more modular than calling the local Heritrix API directly.