sul-dlss / was-registrar-app

Rails app to organize downloaded web archiving data and trigger preassembly/accessioning when appropriate

Support manual crawl accessioning via WRA #462

Closed: mjgiarlo closed this issue 2 years ago

mjgiarlo commented 2 years ago

Preconditions:

  1. User has created a collection in Argo (unless one already exists).
  2. User has copied WARCs to was_unaccessioned_data/jobs.

Steps:

  1. User will complete a form providing the information needed for a one-time crawl accessioning. This includes: collection, title/label, APO, filepath, source id.
  2. WRA will register an item and initiate the wasCrawlPreassembly workflow.
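
As a rough illustration of step 1, the form fields could be checked for completeness before anything is registered. This is only a sketch in plain Ruby: `OneTimeCrawlForm`, its field names, and the sample values are hypothetical and are not taken from the WRA codebase.

```ruby
# Hypothetical form object; the fields mirror the list in step 1,
# not the application's actual schema.
OneTimeCrawlForm = Struct.new(:collection, :title, :apo, :filepath, :source_id,
                              keyword_init: true) do
  # Names of the fields that are nil or contain only whitespace.
  def missing_fields
    members.select { |name| self[name].to_s.strip.empty? }
  end

  # True when every field has a value.
  def complete?
    missing_fields.empty?
  end
end

# Placeholder values, for illustration only.
form = OneTimeCrawlForm.new(
  collection: 'example-collection',
  title: 'example-crawl',
  apo: 'example-apo',
  filepath: 'example-crawl'
)

puts form.complete?              # source_id was not supplied
puts form.missing_fields.inspect
```

Registration and the wasCrawlPreassembly workflow (step 2) would only start once a check like `complete?` passes.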

In addition: User will be able to view previous one-time crawl accessionings. This should include the druid of each registered item.

edsu commented 2 years ago

Is it possible for WRA to present a list of file locations to choose from instead of asking for them to be entered by hand?

justinlittman commented 2 years ago

There is nothing that dictates the directory hierarchy in the scratch space, so I'm reluctant to implement a file browser for the initial implementation -- perhaps that can be a separate enhancement?

However, I will be validating the entered filepath.
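
For illustration, a filepath check along these lines might look like the sketch below. It is not the validation that was actually implemented: the helper name `valid_crawl_directory?`, the base directory argument, and the rule that the directory must contain at least one `.warc` or `.warc.gz` file are all assumptions.

```ruby
require 'pathname'

# Hypothetical validator. The base directory and the WARC file-name rule
# are assumptions made for this sketch.
def valid_crawl_directory?(entered_path, base_dir)
  base = Pathname.new(base_dir).expand_path
  candidate = Pathname.new(entered_path).expand_path(base)

  # Reject paths that escape the base directory (for example via "..").
  return false unless candidate.to_s.start_with?("#{base}/")
  # Reject paths that do not exist or are not directories.
  return false unless candidate.directory?

  # Require at least one WARC file directly inside the directory.
  candidate.children.any? do |child|
    child.file? && child.basename.to_s.match?(/\.warc(\.gz)?\z/)
  end
end
```

Called as `valid_crawl_directory?('some-crawl', 'was_unaccessioned_data/jobs')`, it returns `false` for a missing directory, a directory without WARCs, or a path outside the base, so the form can report the problem before registration.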

edsu commented 2 years ago

As long as it is being validated prior to registration I think that will suffice.

peterchanws commented 2 years ago

I submitted a job with 2 WARC files under the one-time WARC registration. The job appeared in the list with its status and the DRUID that was created. Clicking on the DRUID brought me to the page showing details for that DRUID. Everything seems fine.

lwrubel commented 2 years ago

Thanks for testing, @peterchanws. Wanted to confirm you saw that the title of the object will be taken from the name of the directory within the was_unaccessioned_data/jobs directory. That is in line with how Archive-It collections are put on the filesystem and their crawl objects named.
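
The naming rule described above amounts to taking the last path segment of the job directory. A minimal sketch, assuming a hypothetical helper `title_from_job_directory` and an illustrative directory name:

```ruby
# Hypothetical helper: the title is the name of the directory that sits
# inside was_unaccessioned_data/jobs.
def title_from_job_directory(job_directory)
  File.basename(job_directory)
end

# "example-crawl" is an illustrative directory name.
puts title_from_job_directory('was_unaccessioned_data/jobs/example-crawl')
# prints "example-crawl"
```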

peterchanws commented 2 years ago

Yes. I saw that. Thanks.