Closed by tpendragon 6 months ago
Need to chat with the finding aids POs & the Figgy PO about this one.
The question is: if an ARK is associated with a Figgy record, is it okay to always change its destination when the record is marked complete?
POs have requested a report of all the resources which have MMS-ID source metadata identifiers and an ARK which points at findingaids, to be able to tell which ARKs would end up getting changed.
We can tell if the ARK points to findingaids by resolving the ARK. You don't have to follow the redirect - just see if it's a redirect and look at the LOCATION header.
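A minimal sketch of that check, in case it helps: resolve the ARK without following the redirect and inspect the Location header. The helper names (redirect_location, points_at_findingaids?) and the "findingaids" host marker are assumptions for illustration, not code from Figgy, and it assumes the resolver answers HEAD requests.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: returns the Location header if the ARK URL answers
# with a redirect, or nil otherwise. Net::HTTP does not follow redirects
# on its own, so this only looks at the first response.
def redirect_location(ark_url)
  uri = URI.parse(ark_url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.head(uri.request_uri) # assumption: the resolver supports HEAD
  end
  response.is_a?(Net::HTTPRedirection) ? response["location"] : nil
end

# Hypothetical helper: true when the redirect target's host contains the
# marker string. The default marker "findingaids" is an assumption.
def points_at_findingaids?(location, marker: "findingaids")
  return false if location.nil?
  host = URI.parse(location).host
  !host.nil? && host.include?(marker)
rescue URI::InvalidURIError
  false
end
```

Usage would look something like points_at_findingaids?(redirect_location(ark_url)) for each ARK in the report.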
The report is here: https://github.com/pulibrary/figgy/files/15180044/ark_mismatch_report.csv (thanks @hackartisan !)
Messaged @faithc and @ccleeton to see if they can confirm that it's fine for all of those ARKs to point at their catalog counterparts.
The reports appear to be cumulative, so I believe we only need to run the metadata refresh on the last report. I thought all the resources were likely in the same collection, which would mean we can just use the bulk update UI for the refresh, but I did a little analysis and they are not all in the same collection.
I fetched all the resource IDs using the MMS IDs from the spreadsheet. Some of them weren't found in Figgy, leaving 451 resources. I tried updating one with change_set.validate(refresh_remote_metadata: "1") and saving it, but that didn't update the ARK. I tried that same one using the check box in the UI, and that did update the ARK. I fed them all into CatalogUpdateJob, but the ARKs didn't get updated. Finally, I pulled out just the ones with state == ['complete'] and ran those through IdentifierService.mint_or_update. I spot-checked a few and this worked.
I think what tripped up the process was that the change set persister only updates identifiers when the state, title, or source metadata identifier has changed. I do not know how those values read as changed when you submit the form, but they definitely don't when you try to re-set them to their original value in the terminal.
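To illustrate the behavior described above, here is a toy sketch of value-based change detection. It is not Figgy's change set persister; the TrackedFields class and its methods are hypothetical and only show why re-setting a field to its original value would not count as a change.

```ruby
# Hypothetical illustration only: tracks original vs. current values and
# reports an identifier update as needed when a watched field differs.
class TrackedFields
  WATCHED = %i[state title source_metadata_identifier].freeze

  def initialize(values)
    @original = values.dup
    @current = values.dup
  end

  def set(field, value)
    @current[field] = value
  end

  # A field counts as changed only if its value differs from the original.
  def changed?(field)
    @original[field] != @current[field]
  end

  def identifier_update_needed?
    WATCHED.any? { |field| changed?(field) }
  end
end
```

Under this model, calling set(:state, "complete") on a record that is already complete leaves identifier_update_needed? false, which matches what I saw in the terminal.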
21 resources had a state other than "complete" and, as expected, none of those had an ARK yet in its identifier field. Once those are complete, they should update the ARK.
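The split between complete and not-yet-complete resources can be sketched roughly like this. The Resource struct and partition_by_completion are hypothetical stand-ins, not Figgy classes; the only detail taken from the thread is that state is compared against ['complete'].

```ruby
# Hypothetical stand-in for a Figgy resource, for illustration only.
Resource = Struct.new(:id, :state, :identifier, keyword_init: true)

# Returns [complete, not_complete]. Array() lets state be either a
# single string or an array such as ["complete"].
def partition_by_completion(resources)
  resources.partition { |resource| Array(resource.state) == ["complete"] }
end

# In a Figgy console, the complete set would then be handed to
# IdentifierService.mint_or_update; its exact arguments are not shown in
# this thread, so that call is left out here.
```

The not-complete set needs no manual action, since those resources should get their ARK updated when they are marked complete.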
Closing as fixed.
Message from Hilary Murusmith:
I did a little early searching around. This is because we explicitly don't update ARKs if the ARKs point at finding aids:
https://github.com/pulibrary/figgy/blob/b64b039b6690d5505a4c1b37417871456b5a528b/app/services/identifier_service.rb#L41
Which seemed odd, but we added it because we wanted to make sure we didn't accidentally get rid of a findingaids pointer in an old version of Figgy:
https://github.com/pulibrary/figgy/issues/1727
I'm not sure if this is still a problem. If it isn't, then we should just get rid of that restriction. If it is, then we should add a way to bypass that requirement.
Steps
Success Criteria
All of the resources in #6232 have an ARK that points to the catalog. For example, https://arks.princeton.edu/ark:/88435/3j333601q
Sudden Priority Justification
If the ARKs for these point to the wrong place, citations for this content will go to the wrong place. The sudden priority process is the way to get these kinds of issues looked at.