sul-dlss / FOLIO-Project-Stanford

Task management for Stanford’s analysis of FOLIO.

Explore using Symphony legacy identifier as HRID for data migration #113

Closed ahafele closed 2 years ago

ahafele commented 2 years ago

EBSCO tooling supports this, and it would hopefully solve our HRID collision issues. It would also persist our SW URLs and previous searches. Would we want to prefix with something, like Lane has?

dlrueda commented 2 years ago

Excellent news!

Agree that persisting SW URLs and previous searches would be good, and I believe Hoover’s ArchivesSpace also relies on the catkey (not sure if it’s a link to SW or to Symphony itself that they use it for).

Add to that, the Data Migration subgroup STRONGLY suggested having the HRID be the legacy system ID.

I will consult with Ryan S. at Lane about prefixing with S (but if we don’t need to, that would be preferable, I think).

shelleydoljack commented 2 years ago

Maybe just prefix with "a". The current SW export has the "a" prefix even though the SW indexing and display code removes it.

dlrueda commented 2 years ago

Talked to Ryan. He much prefers that we prefix with the “a”.
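For reference, a minimal sketch of what the "a" prefix convention looks like in code. The function names are hypothetical (they are not from the migration code), and the sketch assumes catkeys are purely numeric strings:

```python
def catkey_to_hrid(catkey: str, prefix: str = "a") -> str:
    """Build an HRID by prefixing a Symphony catkey (hypothetical helper)."""
    catkey = catkey.strip()
    # Assumption: catkeys are purely numeric.
    if not catkey.isdigit():
        raise ValueError(f"expected a numeric catkey, got {catkey!r}")
    return f"{prefix}{catkey}"


def hrid_to_catkey(hrid: str, prefix: str = "a") -> str:
    """Strip the prefix again, as the SW indexing code is described as doing."""
    return hrid[len(prefix):] if hrid.startswith(prefix) else hrid
```

Keeping the prefix as a parameter would make it easy to switch if a different prefix were chosen later.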

dlrueda commented 2 years ago

OK, so this means @jermnelson needs to figure out how to use the aCATKEY to generate the HRID. Maybe ask Theodor if he can’t find it in the code/settings?

jermnelson commented 2 years ago

I assigned myself to this issue and will start testing today. I think the preserve001 option for the HridHandling configuration will do what we want, but I will test to confirm.
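A rough sketch of the change being tested, assuming the transformer tasks are configured as JSON-style entries with an `hridHandling` key. The key names and the sample task below are assumptions for illustration, not copied from our DAG or settings:

```python
import json


def with_preserved_001(task: dict) -> dict:
    """Return a copy of a task config with hridHandling set to preserve001."""
    return {**task, "hridHandling": "preserve001"}


# Hypothetical task entry; only the hridHandling key is the point here.
bibs_task = {
    "name": "bibs",
    "migrationTaskType": "BibsTransformer",
    "hridHandling": "default",
}

print(json.dumps(with_preserved_001(bibs_task), indent=2))
```

The helper returns a copy rather than mutating the task, so the original settings stay available for comparison while testing.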

jermnelson commented 2 years ago

I adjusted the DAG to use the preserve001 configuration value for the BibsTransformers, HoldingsTransformers, and ItemsTransformer. This results in the CATKEY being set as the Instance HRID, but the holdings and item transformers ignore the CATKEY and still use the existing HRID settings (which still causes Holdings and Items HRID collisions).

While we can manually assign the Holdings HRID and the Item HRID to use the CATKEY, we would still need to account for the case of multiple Holdings and Items records associated with an Instance record. @ahafele do you have an idea of how IndexData is creating Holdings and Items HRIDs for Lane?
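To make the multiple-records problem concrete, here is one possible scheme, sketched under assumptions: each Holdings or Item HRID is the CATKEY plus a per-instance sequence number. The `ah`/`ai` prefixes, the `_` separator, and the `HridAssigner` class are all hypothetical, and this is not necessarily how the eventual implementation works:

```python
from collections import defaultdict
from itertools import count


class HridAssigner:
    """Hand out sequential HRIDs per (prefix, catkey) pair (hypothetical scheme)."""

    def __init__(self) -> None:
        # Each (prefix, catkey) pair gets its own counter starting at 1.
        self._counters = defaultdict(lambda: count(1))

    def next_hrid(self, prefix: str, catkey: str) -> str:
        seq = next(self._counters[(prefix, catkey)])
        return f"{prefix}{catkey}_{seq}"


assigner = HridAssigner()
first_holdings = assigner.next_hrid("ah", "12345")
second_holdings = assigner.next_hrid("ah", "12345")
first_item = assigner.next_hrid("ai", "12345")
```

Because the counter is keyed on both prefix and CATKEY, two Holdings records on the same Instance get distinct HRIDs, and Items are numbered independently of Holdings.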

ahafele commented 2 years ago

@jermnelson I believe they are using the legacy ID from their MARC Holdings records (they use MARC Holdings differently from us). I'm unsure of how they generate the item ID. Prefixes for each are:

Instance - L
Holdings - LH
Items - LI

Example - https://folio.dev.sul.stanford.edu/inventory/view/335411e1-6e67-5ad3-858d-887274c0dde3?qindex=hrid&query=L%2A&sort=title

Instance - L327044
Holdings - LH383026 (found here https://docs.google.com/spreadsheets/d/1PKFvcFH9fJABU1E6qgPfbPxAdZc-2Ztn/edit#gid=1570596587)
Items - LI437764, LI437764

jermnelson commented 2 years ago

Closing this issue as PR 85 that implements HRIDs for Instances, Holdings, and Items has been merged.