stevenvachon / broken-link-checker

Find broken links, missing images, etc within your HTML.
MIT License

Pause/resume for CLI tool #221

Closed: zorael closed this issue 3 years ago

zorael commented 3 years ago

Is your feature request related to a problem? Please describe.

I'm trying to check a large-ish site for broken links, and after a while my requests get rate-limited and fail.

[...]

Getting links from: https://mypage.com/collections/things?page=1&sort_by=price-descending
Finished! 139 links found. 139 excluded. 0 broken

Getting links from: https://mypage.com/collections/stuff?sort_by=created-descending
Error: HTML could not be retrieved

The README mentions supporting pause/resume at any time, but it does not seem to be available to the CLI tool.

Describe the solution you'd like

Hitting Ctrl+C mid-check would pause it, and invoking the command on the same address would resume it. It would have to store the progress somewhere, perhaps in a temporary file in /tmp?
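To make the "store the progress somewhere" idea concrete, here is a minimal sketch of checkpoint-and-resume over a fixed list of URLs. It is not how broken-link-checker works (the tool discovers links while crawling), and everything in it is hypothetical: the `resumable_check` and `check_url` names, the one-URL-per-line list file, and the state file format. Progress is saved after every URL, so Ctrl+C needs no special handling and a rerun picks up where the last run stopped.

```shell
#!/bin/sh
# Hypothetical stand-in for the real per-page work.
check_url() {
    echo "checking $1"
}

# Usage: resumable_check URL_LIST_FILE STATE_FILE
# URL_LIST_FILE holds one URL per line; STATE_FILE holds the number of
# URLs already completed by an earlier, interrupted run.
resumable_check() {
    list=$1
    state=$2
    done_count=0

    # Resume from the saved count if a state file exists.
    if [ -f "$state" ]; then
        done_count=$(cat "$state")
    fi

    line_no=0
    while IFS= read -r url; do
        line_no=$((line_no + 1))
        # Skip URLs that were finished in an earlier run.
        [ "$line_no" -le "$done_count" ] && continue
        check_url "$url"
        # Record progress after each URL, so an interrupt loses at most
        # the URL that was in flight.
        echo "$line_no" > "$state"
    done < "$list"

    # Everything completed: clear the state so the next run starts fresh.
    rm -f "$state"
}
```

A state file path under /tmp, as suggested above, would be passed as the second argument. A real crawler would also have to persist its queue of discovered but unvisited links, which this sketch leaves out.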

Describe alternatives you've considered

A background script that calls kill -STOP $PID after a while, sleeps, and then calls kill -CONT $PID to resume, thereby avoiding the rate limit.
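The workaround described above could look roughly like the following sketch. The function name `throttle_pid` and its run/pause arguments are hypothetical, and the right intervals depend on the server's rate limit. It relies only on the standard `kill` signals STOP and CONT, plus `kill -0` to test whether the process is still alive.

```shell
#!/bin/sh
# Hypothetical helper: alternately suspend and resume a running process.
# Usage: throttle_pid PID RUN_SECONDS PAUSE_SECONDS
throttle_pid() {
    pid=$1
    run_for=$2
    pause_for=$3

    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        # Let the process work for a while.
        sleep "$run_for"
        # Suspend it; stop looping if it has already exited.
        kill -STOP "$pid" 2>/dev/null || break
        sleep "$pause_for"
        # Resume it.
        kill -CONT "$pid" 2>/dev/null || break
    done
}
```

One way to use it would be to start the checker in the background and then call `throttle_pid "$!" 60 30` from the same shell. Requests that are in flight when the process is suspended may time out, so this reduces the request rate but does not guarantee a clean run.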

Additional context

The tool works great, but it ends early and does not gracefully handle the "HTML could not be retrieved" error.

stevenvachon commented 3 years ago

The repository's readme is for the trunk version.