bellingcat / auto-archiver

Automatically archive links to videos, images, and social media content from Google Sheets (and more).
https://pypi.org/project/auto-archiver/
MIT License
578 stars, 60 forks

archive facebook with archive.ph #26

Closed: msramalho closed this 2 years ago

msramalho commented 2 years ago

https://archive.ph/ does not have an API like the Internet Archive's Wayback Machine, but it can archive Facebook pages, which the Wayback Machine cannot. Could we use Selenium to submit links through the archive.ph UI and archive them that way?

loganwilliams commented 2 years ago

Only if we can get past the captcha, which is non-trivial. Also, the operator of archive.today (the same service as archive.ph) clearly does not want people submitting links in an automated way, which is arguably worth respecting. I wonder if there's a way to make this simpler for researchers, though: for example, a tool that submits a link to the spreadsheet and to archive.today at the same time, so the process is technically not automated but driven by manual input. (A person would then also be on hand to solve the captcha.)
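
A rough sketch of what such a manually driven helper might look like, assuming a local CSV file stands in for the spreadsheet and that archive.ph accepts the `?run=1&url=...` query form used by its bookmarklet (an assumption, not a documented API). The function names and the `links.csv` path are hypothetical. Nothing is submitted by the script itself: it only opens the submission page in the researcher's own browser, where they solve the captcha and confirm.

```python
import csv
import webbrowser
from datetime import datetime, timezone
from urllib.parse import quote

# Assumption: the bookmarklet-style submission URL; not a documented API.
ARCHIVE_SUBMIT_BASE = "https://archive.ph/"


def build_submit_url(url: str) -> str:
    """Return the archive.ph page that pre-fills `url` for manual submission."""
    return f"{ARCHIVE_SUBMIT_BASE}?run=1&url={quote(url, safe='')}"


def record_link(csv_path: str, url: str) -> str:
    """Append the link to a CSV (stand-in for the spreadsheet) and return the submit URL."""
    submit_url = build_submit_url(url)
    timestamp = datetime.now(timezone.utc).isoformat()
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow([timestamp, url, submit_url])
    return submit_url


def submit_manually(csv_path: str, url: str) -> None:
    """Record the link, then open the submission page in the user's browser.

    The person at the keyboard solves the captcha and confirms the archive.
    """
    webbrowser.open(record_link(csv_path, url))
```

A researcher would call something like `submit_manually("links.csv", "https://example.com/some-post")` once per link, so every submission still comes from a person rather than from a bot.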