VocaDB / vocadb

VocaDB is a Vocaloid Database with translated artists, albums, music videos and more.
https://vocadb.net

Archive external links when updating entries #673

Open Nefere256 opened 3 years ago

Nefere256 commented 3 years ago

Split from https://github.com/VocaDB/vocadb/issues/656.

VGMdb recently added a link archiving feature:

Submitted and edited links are automatically queued for archiviation on the Wayback Machine, preventing important data and sources from being lost forever. This also applies to links used in the comment field mentioned above.

New links with archived pages have a Wayback Machine icon that links to an archived version of a website. An example with a TouhouDB link.


The available API only documents ways to look up archived pages. An Internet Archive blog post suggests using the save form on the project's main page.
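For reference, a minimal sketch of the lookup side in Python, assuming the public Wayback Machine availability endpoint (`https://archive.org/wayback/available`) and its `archived_snapshots.closest` response shape. The function names are hypothetical and not part of VocaDB; the network call in `lookup` is shown but not exercised here.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine availability API (lookup only).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url: str) -> str:
    """Build the lookup URL for a given page (hypothetical helper)."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the closest snapshot URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Query the API over the network and return a snapshot URL if one exists."""
    with urlopen(availability_url(page_url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

A `None` result would mean no snapshot is known, which is the case where a link would need to be queued for saving by some other route.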

ycanardeau commented 2 years ago

mentioned

szc126 commented 2 years ago

The Internet Archive has an email address for "Save Page Now!", spn@archive.org. It accepts 300 URLs per message IIRC (someone test this).

It sends a log back, either with a URL to web.archive.org or an error message, for each URL. Sometimes it will automatically retry if there are errors and send another log. Sometimes it will ignore your message (maybe a daily limit from one address).
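A rough sketch of how links could be batched for the email route, in Python. The 300-URL figure is the unverified number quoted above, and the one-URL-per-line body format, the `build_messages` name, and the `sender` and `subject` parameters are all assumptions for illustration. Actually sending the messages (for example with `smtplib`) and parsing the returned log are left out.

```python
from email.message import EmailMessage
from typing import Iterable, Iterator, List

SPN_ADDRESS = "spn@archive.org"  # address quoted in the thread
MAX_URLS_PER_MESSAGE = 300       # unverified figure from the thread


def chunked(urls: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` URLs."""
    batch: List[str] = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_messages(
    urls: Iterable[str],
    sender: str,
    subject: str = "",
    batch_size: int = MAX_URLS_PER_MESSAGE,
) -> List[EmailMessage]:
    """Build one message per batch, one URL per line (assumed body format)."""
    messages = []
    for batch in chunked(urls, batch_size):
        message = EmailMessage()
        message["From"] = sender
        message["To"] = SPN_ADDRESS
        if subject:
            message["Subject"] = subject
        message.set_content("\n".join(batch))
        messages.append(message)
    return messages
```

With 650 links this would produce three messages (300, 300, and 50 URLs), which also keeps the number of messages per day low in case there is a per-address limit.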


There is also a Google Sheets-based service according to this article.


The "Save Page Now!" page on the IA website has a "Save outlinks" feature that is good, since important data is not necessarily on the given link (example: offvocal.zip is one more click away). I didn't think this could be done from the email address, but apparently it can: if you add “capture outlinks” to the subject line, those will be preserved as well.

"Save Page Now!" has a limit on how many jobs you can run at once (usually around 3 maximum, maybe because I always choose "save outlinks"?).
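If links were submitted programmatically, the job limit could be respected with a bounded queue. Below is a minimal Python sketch using `asyncio.Semaphore`. The limit of 3 is only the rough figure observed above, and `archive_all` and the `save` callable are hypothetical; `save` stands in for whatever actually submits one URL.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Tuple

MAX_CONCURRENT_JOBS = 3  # rough figure from the thread, not a documented limit


async def archive_all(
    urls: Iterable[str],
    save: Callable[[str], Awaitable[str]],
    limit: int = MAX_CONCURRENT_JOBS,
) -> Dict[str, str]:
    """Run `save` for every URL with at most `limit` jobs in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(url: str) -> Tuple[str, str]:
        # Wait for a free slot before starting the (hypothetical) save job.
        async with semaphore:
            return url, await save(url)

    pairs = await asyncio.gather(*(run_one(url) for url in urls))
    return dict(pairs)
```

Passing `save` in as a parameter keeps the throttling separate from the submission method, so the same queue would work whether jobs go through the web form, email, or something else.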


Sending media links as well, not only external links, would be good for checking video data at the least (e.g. the video description, as in VocaDB/vocadb#1386). YouTube videos are sometimes functional in the Wayback Machine. Other websites aren't (NND's old login wall, heavy reliance on JavaScript, etc.), but archive.org also accepts uploads separately from the Wayback Machine (example: https://archive.org/details/soundcloud-343968920).

szc126 commented 2 years ago

Finding out how VGMdb queues links could also be helpful.