palewire / archiveis

A simple Python wrapper for the archive.is capturing service
https://archive.is
MIT License
188 stars, 16 forks

Screenshots and ZIP-files retrieval now available #11

Closed: adbar closed this 4 years ago

adbar commented 5 years ago

Sorry, I forgot to update the tests. I can do that later if necessary.

palewire commented 5 years ago

Thanks for the PR. I might need a summary of everything going on here.

adbar commented 5 years ago

The archive.is page renders (low-resolution) screenshots and includes links for downloading the archived files in ZIP format. I need to download the former in particular, so I decided to add options to the main function and the command-line script. I moved the HTTP requests into a function of their own to keep the code shorter. In addition, a sleep interval is needed to let archive.is render the page before requesting the screenshot. Now that I think of it, doing everything in one go may not be the best option: it is also feasible to get screenshots and ZIP files for a given archive ID at a later time. I hope the pull request is clearer now; please don't hesitate to ask if you have further questions.
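
For illustration, the waiting step described above could be sketched as a small helper that pauses before the first request and retries a few times. This is not the code from the pull request: `fetch_with_delay`, the `fetch` callable, and the delay values are all hypothetical, and the actual HTTP request is left to the caller.

```python
import time
from typing import Callable, Optional


def fetch_with_delay(
    fetch: Callable[[], Optional[bytes]],
    initial_delay: float = 5.0,   # assumed value, not taken from the PR
    retries: int = 3,
    retry_delay: float = 5.0,     # assumed value, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Wait, then call fetch() until it returns data or retries run out."""
    # Give the service time to render the page before the first request.
    sleep(initial_delay)
    for attempt in range(retries):
        data = fetch()
        if data:
            return data
        # Pause between attempts, but not after the last one.
        if attempt < retries - 1:
            sleep(retry_delay)
    return None
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in normal use the default `time.sleep` applies.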

palewire commented 4 years ago

Rather than adding these features to the capture routine, I'd prefer them to be separate retrieval functions that take an archive.is URL as input.
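
A minimal sketch of what such standalone helpers might look like, assuming the screenshot and ZIP file live at `/<id>/scr.png` and `/download/<id>.zip` relative to the archive host. Those URL patterns, the function names, and the example ID are assumptions for illustration, not part of the archiveis package or a documented archive.is API. The sketch only derives the URLs; downloading them is left to whatever HTTP client the caller uses.

```python
from urllib.parse import urlparse


def archive_id(archive_url: str) -> str:
    """Extract the short archive ID from a URL like https://archive.is/<id>."""
    path = urlparse(archive_url).path.strip("/")
    if not path or "/" in path:
        raise ValueError(f"Not a plain archive URL: {archive_url!r}")
    return path


def screenshot_url(archive_url: str) -> str:
    """Build the screenshot URL (assumed pattern: /<id>/scr.png)."""
    parts = urlparse(archive_url)
    return f"{parts.scheme}://{parts.netloc}/{archive_id(archive_url)}/scr.png"


def zip_url(archive_url: str) -> str:
    """Build the ZIP download URL (assumed pattern: /download/<id>.zip)."""
    parts = urlparse(archive_url)
    return f"{parts.scheme}://{parts.netloc}/download/{archive_id(archive_url)}.zip"


# Hypothetical archive ID, for illustration only.
print(screenshot_url("https://archive.is/AbCdE"))
print(zip_url("https://archive.is/AbCdE"))
```

Keeping these separate from the capture routine means they can be run at any later time for an existing archive URL, without triggering a new capture.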