harvard-lil / perma

Indelible links

make a plan about perma -> wacz #2981

Open rebeccacremona opened 2 years ago

matteocargnelutti commented 1 year ago

Discussed today - https://hlslil.slack.com/archives/C07URASMC/p1659730375523229

Summary:


Would running a small scale experiment locally on a batch of X archives first, with automated checks to identify edge cases, be a good first step?
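To make the "automated checks" idea concrete, here is a rough sketch of what a first pass over a local batch could look like. It only checks structure, assuming the usual WACZ layout (a ZIP with `datapackage.json`, WARCs under `archive/`, an index under `indexes/`, and `pages/pages.jsonl`). The names `check_wacz` and `check_batch` are made up for illustration and are not part of Perma or any existing tool.

```python
import json
import zipfile
from pathlib import Path


def check_wacz(path):
    """Return a list of problems found in one WACZ file (empty list means it passed)."""
    problems = []
    try:
        with zipfile.ZipFile(path) as zf:
            names = set(zf.namelist())

            # Assumed layout: datapackage.json at the root of the ZIP.
            if "datapackage.json" not in names:
                problems.append("missing datapackage.json")
            else:
                try:
                    json.loads(zf.read("datapackage.json"))
                except ValueError:
                    problems.append("datapackage.json is not valid JSON")

            # Assumed layout: at least one WARC under archive/.
            if not any(
                n.startswith("archive/") and n.endswith((".warc", ".warc.gz"))
                for n in names
            ):
                problems.append("no WARC file under archive/")

            # Assumed layout: at least one index file under indexes/.
            if not any(n.startswith("indexes/") and not n.endswith("/") for n in names):
                problems.append("no index under indexes/")

            if "pages/pages.jsonl" not in names:
                problems.append("missing pages/pages.jsonl")
    except zipfile.BadZipFile:
        problems.append("not a valid ZIP file")
    return problems


def check_batch(directory):
    """Check every *.wacz in a directory; return {filename: problems} for failures only."""
    report = {}
    for wacz_path in sorted(Path(directory).glob("*.wacz")):
        problems = check_wacz(wacz_path)
        if problems:
            report[wacz_path.name] = problems
    return report
```

Running `check_batch("some/local/dir")` on the converted batch would give a short list of files to inspect by hand. Content-level checks (for example, comparing the URLs in the original WARC against the converted index) would still be needed on top of this.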

matteocargnelutti commented 1 year ago

Related to #3038

rebeccacremona commented 1 year ago

Throwing a thought in this ticket: if we continue to send our stuff to Internet Archive after this migration, we will probably want to continue sending them WARCs instead of WACZs. I think they will prefer to derive their own CDX lines and store them in their own format like they do currently.