sul-dlss / dor-services-app

A Rails application exposing Digital Object Registry functions as a RESTful HTTP API
https://sul-dlss.github.io/dor-services-app/

update accessioning "cleanup" rake tasks to work with new versioning model #5145

Closed andrewjbtw closed 1 month ago

andrewjbtw commented 1 month ago

Background

There are two rake tasks in DSA that "stop" an in-progress accession and carry out associated cleanup steps. The goal of these rake tasks is to make it easier to fix accessioning jobs that have run into errors that cannot be fixed while any workflows are in progress. These errors are almost always related to problems with content files: unprocessable filename characters, corrupt files, missing files, checksum mismatch failures, and so on.

The process for fixing these errors involves:

These rake tasks were developed back when the workflows determined the accessioning status. Removing the latest accessioning workflows was sufficient to allow the item to be opened or accessioned again, even if they didn't perfectly reset the item to the state it had before the accessioning began.

Issue after moving to the new versioning model

The cleanup task does not handle the versioning and the accessioning status anymore. The result is that after the cleanup task runs, an item still can't be opened or accessioned again because there's a mismatch between what the current version is in DSA and what preservation expects.

See this item as an example: https://argo.stanford.edu/view/druid:bf366gz2618

I ran the cleanup rake task on that item but it is still version 2 in DSA. I think that's why Argo shows the status as "v2 Unknown Status": the workflows no longer show any data for v2 but DSA/Argo still have data that says it should be v2.

Attempting to open that item in this state makes DSA try to open version 3, which fails because of a mismatch with preservation (which expects version 2). Back when the workflows governed versioning and accessioning status, a quirk in how that status was determined made it possible to open version 2 again. In the new model, we probably have to set the version back to version 1 to make it possible to create a new version 2.
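To make the mismatch concrete, here is a minimal sketch in plain Ruby. It is not DSA code: the `Item` struct, its fields, and the helper methods are all hypothetical, and `preserved_version: 1` is an assumption inferred from preservation expecting version 2 next.

```ruby
# Hypothetical, simplified model of the state described above.
# None of these names come from DSA or preservation.
Item = Struct.new(:druid, :dsa_version, :preserved_version, keyword_init: true)

# Opening always targets the version after the one DSA currently records.
def next_version_to_open(item)
  item.dsa_version + 1
end

# Preservation accepts only the version directly after the last one it holds.
def version_preservation_expects(item)
  item.preserved_version + 1
end

def openable?(item)
  next_version_to_open(item) == version_preservation_expects(item)
end

# After cleanup, DSA still says v2 while preservation (assumed) holds v1.
stuck = Item.new(druid: 'druid:bf366gz2618', dsa_version: 2, preserved_version: 1)
puts next_version_to_open(stuck)         # 3
puts version_preservation_expects(stuck) # 2
puts openable?(stuck)                    # false
```

In this model the item only becomes openable again once `dsa_version` is set back to 1.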

Proposal

Minimum option

Add a step to the rake cleanup tasks that decrements the version number. That is, the tasks will continue to do all the same things, plus set the version number back to the previous one, which was the version of the last successful accession.

In this option, the Cocina would be left alone and would still reflect whatever was changed in order to make the failed version 2.
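As a rough illustration of the minimum option, the sketch below steps a version number back by one and leaves the Cocina untouched. It uses an in-memory stand-in rather than DSA's real models: `VersionRecord` and `roll_back_version` are hypothetical names, and the guard against rolling back below version 1 is an assumption, not something the existing tasks are known to do.

```ruby
# Hypothetical in-memory stand-in for an item's version state.
# This is not DSA's data model.
VersionRecord = Struct.new(:druid, :version, :cocina, keyword_init: true)

# Minimum option: set the version back to the previous one and
# deliberately leave the Cocina as it was for the failed version.
def roll_back_version(record)
  if record.version <= 1
    raise ArgumentError, "#{record.druid} is at version #{record.version}; nothing to roll back"
  end

  record.version -= 1
  record
end

record = VersionRecord.new(
  druid: 'druid:bf366gz2618',
  version: 2,
  cocina: { label: 'edited for failed v2' }
)
roll_back_version(record)
puts record.version        # 1
puts record.cocina[:label] # edited for failed v2
```

The unchanged `cocina` value is the trade-off of this option: whatever was edited for the failed version is still in place after the rollback.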

More extensive option

Do everything in the minimum option and also restore the Cocina to what it was immediately before the failed versioning attempt. This should correspond to the Cocina for the last accessioned version. This would be roughly equivalent to discarding a draft of changes to the item.

The more extensive option is the cleaner reset. But I'm not sure we have the data to support it yet, at least with legacy objects. The minimum option would get to a point where someone like me could try to fix the item, but we'd have to be very careful to make sure the Cocina for the failed version didn't cause the same failure to recur.
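The difference between the two options, including the missing-data concern for legacy objects, can be sketched as follows. This assumes a hypothetical store of Cocina snapshots keyed by accessioned version. Whether DSA actually has such data is exactly the open question above, so `restore_cocina` and the `snapshots` hash are illustrative only.

```ruby
# Hypothetical: snapshots maps an accessioned version number to the Cocina
# saved for it. Returns [cocina, restored?].
def restore_cocina(current_cocina, snapshots, last_accessioned_version)
  snapshot = snapshots[last_accessioned_version]

  # Legacy objects may have no snapshot. In that case fall back to the
  # minimum option and leave the current Cocina in place.
  return [current_cocina, false] if snapshot.nil?

  [snapshot.dup, true]
end

failed_v2 = { label: 'edited for failed v2' }
snapshots = { 1 => { label: 'as accessioned in v1' } }

cocina, restored = restore_cocina(failed_v2, snapshots, 1)
puts restored       # true
puts cocina[:label] # as accessioned in v1

# A legacy object with no stored snapshot keeps its current Cocina.
cocina, restored = restore_cocina(failed_v2, {}, 1)
puts restored       # false
puts cocina[:label] # edited for failed v2
```

Returning a flag alongside the Cocina lets the caller report which items got the full reset and which still need a manual check of the failed version's changes.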

Impact

Without some way to reset items back to a state where they can be accessioned again, I will end up filing tickets for the FR or another developer to reset individual items on an as-needed basis. There are currently 3 items where I need this assistance.