sul-dlss / argo

The administrative discovery interface for Stanford's Digital Object Registry

Determine efficient way to bulk unrelease/open/darken/close #4134

Closed mjgiarlo closed 1 year ago

mjgiarlo commented 1 year ago

Part of the QA/stage reset process involves yanking SDR items from Access systems, and our mechanism for this involves unreleasing items (from SearchWorks) and assigning them dark access rights (effectively removing them from stacks).

There are bulk actions to support this in Argo, but we wonder if there may be a more efficient/automatable way to do this (possibly directly in DSA). Figuring that out, or deciding to punt on it and use Argo, is the point of this issue.

justinlittman commented 1 year ago

The DeleteService has methods for deleting files from stacks and purl: https://github.com/sul-dlss/dor-services-app/blob/main/app/services/delete_service.rb#L38-L46

And the UnpublishService: https://github.com/sul-dlss/dor-services-app/blob/main/app/services/unpublish_service.rb

Is that good enough?

mjgiarlo commented 1 year ago

@andrewjbtw @sul-dlss/infrastructure-team ☝🏻

justinlittman commented 1 year ago

Decision from standup on 8/30: @andrewjbtw to write up instructions on how to do this using bulk actions.

mjgiarlo commented 1 year ago

@andrewjbtw Given ☝🏻, is it OK if I assign this to you (and toss at In Progress)?

andrewjbtw commented 1 year ago

Putting the instructions in comments. The goal is to make a good faith effort to pull back SDR content from Access systems. The instructions are based on the takedown instructions in Consul, but applied to all of the items in stage/QA.

Note that we are not concerned with Exhibits in sdr-stage/QA.

First step: unrelease SDR objects from SearchWorks

  1. Click on the facet for all items currently released to SW

  2. Click on the "Bulk Actions" button just above the search results

  3. From the Bulk actions page, click on "New Bulk Action"

  4. "Manage release" should already be selected as the bulk action

  5. Choose the radio button for "Do not release this object"

  6. Select "SearchWorks" as the release-to option (it should already be selected)

  7. Click on "Populate with previous search"

  8. Wait for the druid box to be populated. It could take a while for a long list.

  9. Submit the bulk action

This will update all of the Purls with a tag that says they should be unreleased, which should remove them from FOLIO records (if they have MARC) and from SW. The complete process can take an hour or so. It may not be 100% successful in stage because of data issues, but at least it's a good-faith effort to clean up.
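For anyone who wants to prepare the druid list outside of "Populate with previous search", here is a minimal sketch of cleaning a pasted list and describing the intended unrelease per druid. The druid pattern is a loose assumption (two letters, three digits, two letters, four digits), and the record fields (`druid`, `to`, `release`) are hypothetical names for illustration, not the actual tag format used by Argo or DSA.

```python
import re

# Assumed druid shape: optional "druid:" prefix, then 2 letters, 3 digits,
# 2 letters, 4 digits. This is looser than the real druid alphabet.
DRUID_PATTERN = re.compile(r"^(?:druid:)?([a-z]{2}\d{3}[a-z]{2}\d{4})$")


def normalize_druids(raw_lines):
    """Return unique, prefixed druids in first-seen order.

    Blank lines and lines that do not look like druids are skipped.
    """
    seen = set()
    result = []
    for line in raw_lines:
        match = DRUID_PATTERN.match(line.strip().lower())
        if match is None:
            continue
        druid = "druid:" + match.group(1)
        if druid not in seen:
            seen.add(druid)
            result.append(druid)
    return result


def unrelease_requests(druids, target="SearchWorks"):
    """Build hypothetical 'do not release' records, one per druid."""
    return [{"druid": d, "to": target, "release": False} for d in druids]
```

The output of `normalize_druids` is one druid per line when joined with newlines, which is the shape the druid box in the bulk action form expects in the steps above.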

andrewjbtw commented 1 year ago

Next step: make all the druids dark

When druids are made dark:

A couple of notes specific to the stage/QA reset:

Steps to open the items:

  1. Select all items from the "Object type" facet
  2. Choose "Bulk actions" from the button that's just above the search results
  3. Choose "New Bulk Action"
  4. Choose "Open new object versions"
  5. Describe the version as "change rights to dark" (the description doesn't really matter since we'll be deleting)
  6. Click "Populate with previous search"
  7. Wait for the list to populate
  8. Submit the bulk action

This will take quite a while for a large number of druids.

Steps to open the collections: do the same as above, but select collections instead of items in the "Object type" facet.

Once the druids have been opened, you can change the rights.

  1. Select all the objects that have a processing status of "Opened"
  2. Select "Bulk actions"
  3. Select "New Bulk Action"
  4. Select "Set object rights"
  5. Set "view" to "dark"; this will also set download to "none"
  6. Click "Populate with previous search"
  7. Wait for the druids to populate
  8. Submit the bulk action

When the bulk action completes, close the objects.

  1. Select all the objects that have a processing status of "Opened"
  2. Select "Bulk actions"
  3. Select "New Bulk Action"
  4. Select "Close objects"
  5. Click "Populate with previous search"
  6. Wait for the druids to populate
  7. Submit the bulk action

Accessioning will run on all of the druids. The accessionWF "shelve" step will delete the files from Stacks and "publish" will delete the Purls.
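The ordering of the three bulk actions (open, set rights, close) matters, so here is a toy model of it. This is only a sketch of the sequence described above: the `SdrObject` class, its status strings, and the function names are all hypothetical and do not correspond to any Argo or DSA API.

```python
from dataclasses import dataclass


@dataclass
class SdrObject:
    """Toy stand-in for an SDR object; field names and values are hypothetical."""
    druid: str
    status: str = "Accessioned"
    view: str = "world"
    download: str = "world"
    version_description: str = ""


def open_version(obj, description):
    """Bulk action 1: open a new version so the object can be edited."""
    if obj.status == "Opened":
        raise ValueError(f"{obj.druid} is already open")
    obj.status = "Opened"
    obj.version_description = description


def set_dark(obj):
    """Bulk action 2: rights can only change while the object is open."""
    if obj.status != "Opened":
        raise ValueError(f"{obj.druid} must be opened before changing rights")
    obj.view = "dark"
    obj.download = "none"  # setting view to dark also sets download to none


def close_version(obj):
    """Bulk action 3: closing hands the object back to accessioning."""
    if obj.status != "Opened":
        raise ValueError(f"{obj.druid} is not open")
    obj.status = "Accessioning"


def darken_all(objects):
    """Finish each phase for the whole batch before starting the next one."""
    for obj in objects:
        open_version(obj, "change rights to dark")
    for obj in objects:
        set_dark(obj)
    for obj in objects:
        close_version(obj)
    return objects
```

Running each phase over the whole batch before the next mirrors waiting for one bulk action to complete before submitting the following one.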

andrewjbtw commented 1 year ago

There may be a more efficient way to do all of the above programmatically but those are the steps I would follow to remove content from access systems using Argo.

lwrubel commented 1 year ago

With the new process for unreleasing using FOLIO APIs via a single-threaded queue, I think we should expect that step to take longer. An earlier test had 100 druids taking about 9 mins. So for 7,273 druids that could take ~11 hours.
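The arithmetic behind that estimate, as a small sketch. It assumes the time scales linearly with the number of druids, which is a simplification for a single-threaded queue; the function name is made up for illustration.

```python
def estimate_hours(druid_count, sample_size=100, sample_minutes=9.0):
    """Extrapolate linearly from a timed sample to a larger batch."""
    minutes = druid_count * sample_minutes / sample_size
    return minutes / 60


# 7,273 druids at 9 minutes per 100 druids
print(f"{estimate_hours(7273):.1f} hours")  # prints "10.9 hours"
```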

andrewjbtw commented 1 year ago

I ran unrelease on the 7273 druids on stage last week to see if anyone would notice. So far no one has. It took about 1 hour.

I did notice that about 1500 druids in the list aren't actually released and only show that way because of problems in the indexing logic.
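For comparison with the earlier test of 100 druids in about 9 minutes, here is a rough sketch of the throughput numbers. The run time of 60 minutes and the count of 1500 are the approximate figures from this comment, so the results are ballpark values only.

```python
def druids_per_minute(count, minutes):
    """Average throughput over a run."""
    return count / minutes


projected_rate = druids_per_minute(100, 9)    # rate from the earlier 100-druid test
observed_rate = druids_per_minute(7273, 60)   # rate from this run on stage
speedup = observed_rate / projected_rate      # how much faster than projected

# Druids in the list that were actually released, given the ~1500 that were not
actually_released = 7273 - 1500

print(f"projected {projected_rate:.1f}/min, observed {observed_rate:.1f}/min")
print(f"roughly {speedup:.1f}x faster; about {actually_released} druids actually released")
```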

lwrubel commented 1 year ago

That's good to hear! I'm wondering if unrelease is generally faster because of the checks involved? Thanks for testing.