Support accessioning WACZ files.

sul-dlss / was-registrar-app

Rails app to organize downloaded web archiving data and trigger preassembly/accessioning when appropriate

Support accessioning WACZ files. #479

Closed: justinlittman closed this 2 years ago

justinlittman commented 2 years ago

refs #475

Why was this change made? 🤔

WACZs are web archives too ...

How was this change tested? 🤨

⚡ ⚠ If this change involves consuming from other services or writing to shared file systems, test that web archive seed and crawl accessioning (and maybe even SWAP system?) works properly in [stage|qa] environment, in addition to specs. ⚡

Unit tests.

justinlittman commented 2 years ago

Makes sense. The new step hasn't been added to the workflow yet; https://github.com/sul-dlss/workflow-server-rails/pull/575 adds it.

edsu commented 2 years ago

OK, the workflow step is now running, but it triggers an error, which you can see in Honeybadger here:

https://app.honeybadger.io/projects/51141/faults/86367138

It looks like it's expecting to find the WACZ zip outside of the druid tree, at:

/web-archiving-stacks/data/collections/gq319xk9269/20220627165300-wikipedia.wacz

instead of at:

/web-archiving-stacks/data/collections/gq319xk9269/nb/864/zh/2090/20220627165300-wikipedia.wacz
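To make the difference between the two paths concrete, here is a minimal Python sketch of how a druid tree path could be built. It is not the app's actual code: the function name `druid_tree_path`, the regular expression, and the item druid `nb864zh2090` are assumptions inferred only from the two paths above, where the druid appears split into `nb/864/zh/2090` under the collection directory.

```python
import re
from pathlib import PurePosixPath

# Assumed druid shape: two letters, three digits, two letters, four digits,
# with an optional "druid:" prefix (inferred from the paths in this thread).
DRUID_PATTERN = re.compile(r"^(?:druid:)?([a-z]{2})(\d{3})([a-z]{2})(\d{4})$")


def druid_tree_path(collection_dir, item_druid, filename):
    """Hypothetical helper: place filename inside the item's druid tree."""
    match = DRUID_PATTERN.match(item_druid)
    if match is None:
        raise ValueError(f"not a druid: {item_druid!r}")
    # Each captured segment becomes one directory level, e.g. nb/864/zh/2090.
    return PurePosixPath(collection_dir, *match.groups(), filename)


collection_dir = "/web-archiving-stacks/data/collections/gq319xk9269"
filename = "20220627165300-wikipedia.wacz"

# Path inside the druid tree (where the file actually is).
print(druid_tree_path(collection_dir, "nb864zh2090", filename))

# Path directly under the collection directory (where the step looked).
print(PurePosixPath(collection_dir, filename))
```

The first printed path matches the in-tree location above and the second matches the location from the error, so the failing step appears to be skipping the druid tree segments.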

edsu commented 2 years ago

Success accessioning a WACZ: https://argo-qa.stanford.edu/view/druid:yf654rv4725