sul-dlss / FOLIO-Project-Stanford

Task management for Stanford’s analysis of FOLIO.

Script to export boundwith data from Symphony #234

Closed dlrueda closed 1 year ago

dlrueda commented 2 years ago

This script finds the related child bound-withs for a barcode that is a bound-with parent:

/s/SUL/Bin/Sal3> cat find_bwtfs.pl

In Symphony, BW-PARENT or BW-CHILD will be in item cat1 of the item record, and the 590 of the CHILD bib has the ckey of the parent bib.

This is not just for SAL3 (it’s just that the script that does bw stuff is only run now for SAL3, so that’s where it is)

Part of this task is to figure out what data we need from Symphony to make the link. Part of this task is to figure out if we should have a line in the item.tsv file in order to create a holdings or item in FOLIO, or if we do this separately.

We will need to update the initial selection of items to extract from Symphony so that it includes item cat1 = BW-CHILD (which it currently excludes), but we don’t know if we should keep this item in the *.tsv items file afterwards.

shelleydoljack commented 2 years ago

The two models for creating bound-withs have different endpoints. To create a bound-with relationship between the item and holding record, you use the POST /inventory-storage/bound-with-parts endpoint. Per the API docs, this endpoint "Records the relationship between a part of a bound-with (a holdings-record) and the bound-with as a whole (the circulatable item)". The data required to create this relationship is:

"holdingsRecordId": {
      "type": "string",
      "description": "the ID of the holdings record representing a part of a bound-with; a UUID"
    },
"itemId": {
      "type": "string",
      "description": "the ID of the item representing the bind; a UUID"
    }

I tried a POST on folio-dev with two random records: I used the holdingsID for record HRID a349709 (child) and the itemId for record HRID a67207 (parent). The bound-with icon shows on the item for HRID a67207 (parent) but no bound-with relationship shows in the holdings for HRID a349709 (child). Per @ahafele the relationship in the child’s holdings comes in a later version.
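
For reference, a minimal sketch of building the request body for that POST. The function name and the placeholder UUIDs are made up for illustration; only the two field names come from the schema quoted above, and nothing here actually sends a request.

```python
import json
import uuid


def build_bound_with_part(holdings_record_id: str, item_id: str) -> dict:
    """Return a JSON-ready body for POST /inventory-storage/bound-with-parts."""
    # uuid.UUID raises ValueError if either value is not a well-formed UUID.
    uuid.UUID(holdings_record_id)
    uuid.UUID(item_id)
    return {"holdingsRecordId": holdings_record_id, "itemId": item_id}


# Placeholder UUIDs (assumptions), not real folio-dev records.
child_holdings_id = "11111111-1111-4111-8111-111111111111"
parent_item_id = "22222222-2222-4222-8222-222222222222"

body = build_bound_with_part(child_holdings_id, parent_item_id)
print(json.dumps(body, indent=2))
```

Validating the UUIDs up front is just a convenience so a bad lookup fails before the request goes out.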

To create a bound-with relationship between the instances, we need to use the endpoint PUT /inventory/instances/{instanceId} and pass this data:

"parentInstances": {
      "description": "Array of parent instances",
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": {
            "description": "Id of the parent instance",
            "type": "string"
          },
          "superInstanceId": {
            "description": "Id of the super instance",
            "type": "string"
          },
          "instanceRelationshipTypeId": {
            "description": "Id of the relationship type",
            "type": "string"
          }
        },
        "additionalProperties": false,
        "required": [
          "superInstanceId",
          "instanceRelationshipTypeId"
        ]
      }
    },
    "childInstances": {
      "description": "Child instances",
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": {
            "type": "string"
          },
          "subInstanceId": {
            "description": "Id of sub Instance",
            "type": "string"
          },
          "instanceRelationshipTypeId": {
            "description": "Id of the relationship type",
            "type": "string"
          }
        },
        "additionalProperties": false,
        "required": [
          "id",
          "subInstanceId",
          "instanceRelationshipTypeId"
        ]
      }
    }

In addition, the PUT needs to include the required fields: source, title, instanceTypeId (rdacontent: resource type term).
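
A rough sketch of assembling that PUT body, assuming we start from the child's existing instance record as a dict. The helper name and the sample values are hypothetical; the field names are the ones in the schema above.

```python
import copy

# Fields the PUT has to carry in addition to the relationship data.
REQUIRED_INSTANCE_FIELDS = ("source", "title", "instanceTypeId")


def add_parent_instance(instance: dict, super_instance_id: str,
                        relationship_type_id: str) -> dict:
    """Return a copy of the instance with one more parentInstances entry."""
    missing = [f for f in REQUIRED_INSTANCE_FIELDS if not instance.get(f)]
    if missing:
        raise ValueError(f"instance is missing required fields: {missing}")
    updated = copy.deepcopy(instance)
    # The schema only requires these two properties on a parentInstances entry.
    updated.setdefault("parentInstances", []).append({
        "superInstanceId": super_instance_id,
        "instanceRelationshipTypeId": relationship_type_id,
    })
    return updated


# Hypothetical child instance and IDs, for illustration only.
child = {"source": "MARC", "title": "Example child title",
         "instanceTypeId": "type-uuid-placeholder"}
put_body = add_parent_instance(child, "parent-instance-uuid-placeholder",
                               "bound-with-relationship-type-uuid-placeholder")
print(put_body["parentInstances"])
```

The copy is deliberate, so the record fetched from FOLIO stays untouched if the PUT is abandoned.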

shelleydoljack commented 2 years ago

The lotus demo site shows a bound-with icon next to the instance title for all of the instances where the holdings:item bound-with relationships were made (not at the instance level), yet the example I did in folio-dev does not show this. https://folio-lotus.dev.folio.org/inventory/view/85010f04-b914-4ac7-ba30-be2b52f79708?qindex=hrid&query=bw%2A&segment=instances&sort=title

Doing an inspect to see the response from the demo site's okapi shows this data:

{
  "totalRecords": 3,
  "instances": [{
    "id": "cd3288a4-898c-4347-a003-2d810ef70f03",
    "title": "Elpannan och dess ekonomiska förutsättningar / av Hakon Wærn",
    "contributors": [{
      "name": "Wærn, Hakon",
      "contributorNameTypeId": "2b94c631-fca9-4892-a730-03ee529ffe2a",
      "primary": true
    }],
    "publication": [{
      "publisher": "Svenska Vattenkraftföreningen",
      "dateOfPublication": "1942"
    }],
    "staffSuppress": false,
    "discoverySuppress": false,
    "isBoundWith": true
  }, {
    "id": "85010f04-b914-4ac7-ba30-be2b52f79708",
    "title": "Metod att beräkna en index för landets vattenkrafttillgång / av Åke Rusck och Gösta Nilsson",
    "contributors": [{
      "name": "Rusck, Åke",
      "contributorNameTypeId": "2b94c631-fca9-4892-a730-03ee529ffe2a",
      "primary": true
    }, {
      "name": "Nilsson, Gösta",
      "contributorNameTypeId": "2b94c631-fca9-4892-a730-03ee529ffe2a",
      "primary": false
    }],
    "publication": [{
      "publisher": "Svenska Vattenkraftföreningen",
      "dateOfPublication": "1942"
    }],
    "staffSuppress": false,
    "discoverySuppress": false,
    "isBoundWith": true
  }, {
    "id": "ce9dd893-c812-49d5-8973-d55d018894c4",
    "title": "Rapport från inspektionsresa till svenska betongdammar i augusti 1939, med särskild hänsyn till sprickbildningsfrågan och användandet av specialcement / av S. Giertz-Hedström",
    "contributors": [{
      "name": "Giertz-Hedström, S.",
      "contributorNameTypeId": "2b94c631-fca9-4892-a730-03ee529ffe2a",
      "primary": true
    }],
    "publication": [{
      "publisher": "Svenska Vattenkraftföreningen",
      "dateOfPublication": "1942"
    }],
    "staffSuppress": false,
    "discoverySuppress": false,
    "isBoundWith": true,
    "items": [{
      "effectiveCallNumberComponents": {
        "callNumber": "1958 A 8050",
        "prefix": "A"
      },
      "effectiveShelvingOrder": "41958 A 48050"
    }, {
      "effectiveCallNumberComponents": {
        "callNumber": "DE3"
      },
      "effectiveShelvingOrder": "DE 13"
    }]
  }]
}

They all have the read-only instance property "isBoundWith": true. Doing the same for HRID a67207 in folio-dev shows "isBoundWith": true as well. So, not sure why our icon doesn't appear.
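
If we want to check this outside the browser, something like the following could pull the bound-with instance IDs out of a saved response. It is only a sketch: the sample is trimmed down from the response above, and the third record is a made-up non-bound-with instance added for contrast.

```python
import json


def bound_with_ids(response_text: str) -> list:
    """Return the ids of instances whose isBoundWith flag is true."""
    payload = json.loads(response_text)
    return [inst["id"] for inst in payload.get("instances", [])
            if inst.get("isBoundWith") is True]


sample = json.dumps({
    "totalRecords": 3,
    "instances": [
        {"id": "cd3288a4-898c-4347-a003-2d810ef70f03", "isBoundWith": True},
        {"id": "85010f04-b914-4ac7-ba30-be2b52f79708", "isBoundWith": True},
        # Hypothetical record, not in the demo site response.
        {"id": "00000000-0000-4000-8000-000000000000", "isBoundWith": False},
    ],
})
print(bound_with_ids(sample))
```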

shelleydoljack commented 2 years ago

As far as I can tell, it looks like based on this code https://github.com/folio-org/ui-inventory/blob/v9.0.12/src/components/InstancesList/InstancesList.js we should be seeing a bound-with icon next to the instance title in a search results list if "isBoundWith": true. The bound-with icon appears next to the instance title of the child record now. Best guess is that the indexing was behind.

shelleydoljack commented 2 years ago

@jermnelson for creating bound-with relationships, we need to get the holdings record UUID of the bound-with child and the item UUID of the bound-with parent from FOLIO. Looking up item UUID seems easy (use the barcode of the bound-with parent) but for the holdings record UUID, what data elements are needed? I am looking at https://github.com/sul-dlss/libsys-airflow/blob/main/plugins/folio/helpers/folio_ids.py but it is confusing. :) In Symphony, a BW-CHILD will have an item record with an "auto-generated" barcode that we could use to lookup a holdings record UUID in folio if we decide to create item records in folio for the bound-with parts (children). Otherwise, if we don't create item records in folio for the bound-with parts, then would we want call number, library, and home location to look up the folio holdings record UUID?
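
To make the question concrete, here is a sketch of the two lookups as CQL queries against the storage endpoints. This is an assumption about how we might do it, not what folio_ids.py does: the helper names are hypothetical, and the holdings query only uses instanceId and callNumber (matching on library and home location would presumably need the FOLIO location UUID instead).

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def cql_quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def item_by_barcode_path(barcode: str) -> str:
    """Path and query string to look up the bound-with parent item."""
    cql = f"barcode=={cql_quote(barcode)}"
    return "/item-storage/items?" + urlencode({"query": cql})


def holdings_by_call_number_path(instance_id: str, call_number: str) -> str:
    """Path and query string to look up the bound-with child's holdings."""
    cql = (f"instanceId=={cql_quote(instance_id)}"
           f" and callNumber=={cql_quote(call_number)}")
    return "/holdings-storage/holdings?" + urlencode({"query": cql})


# Placeholder barcode and instance id (assumptions); call number from the thread's data.
item_path = item_by_barcode_path("36105000000000")
holdings_path = holdings_by_call_number_path(
    "11111111-1111-4111-8111-111111111111", "C 13.46:553")

# Decode the query strings again to show the CQL that would be sent.
print(parse_qs(urlsplit(item_path).query)["query"][0])
print(parse_qs(urlsplit(holdings_path).query)["query"][0])
```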

shelleydoljack commented 2 years ago

Per @ahafele we are not going to create item records in folio for bound-with children. I think there are two options we can do for the export from Symphony:

  1. we still create a file with the same info that would be in the .tsv file but instead name it something like .tsv.boundwiths
  2. add a column for ITEMCAT1 to the .tsv file.

If we add a column for ITEMCAT1, it makes me wonder if we should also add one for ITEMCAT2? @dlrueda what are your thoughts? Here is what that data looks like:

category value description
ICT1 PC-FQTR Fall Qtr reserves PCs
ICT1 PC-WQTR Winter Qtr reserves PCs
ICT1 PC-SPQTR Spring Qtr reserves PCs
ICT1 PC-SUQTR Summer Qtr reserves PCs
ICT1 PC-FALLSEM Fall semester reserves PCs
ICT1 PC-SPSEM Spring semester reserves PCs
ICT1 PC-FWSQTRS
ICT1 PC-F-SPSEM Fall + Spring semester reserves PCs -- law
ICT1 PC-TOSS use only for weeding of unlinked item records
ICT1 WESTGOL4 WEST Gold print archive year 4 item
ICT1 W1SAL3 Unvalidated post-2005 items from WEST year 1 list to SAL3
ICT1 WESTBRO2 WEST Bronze print archive year 2 item
ICT1 WESTGOL2 WEST Gold print archive year 2 item
ICT1 WESTSIL2 WEST Silver print archive year 2 item
ICT1 MARCIVE MARCIVE loaded record
ICT1 RECYCLE marks records to be sent to vendors for copy matching
ICT1 W2SAL3 Unvalidated post-2005 items from WEST year 2 list to SAL3
ICT1 WESTSIL4 WEST Silver print archive year 4 item
ICT1 LEVEL3-CAT Used by SUL Cat.Dept. to identify Minimal Level 3 cataloging
ICT1 CATEVAL used for titles that cat. dept. decided not to cat as serial
ICT1 TEAMS Shadowed records created in Unicorn to support TEAMS
ICT1 BUSCORPRPT Business Library Corporate Reports at Iron Mt.
ICT1 WESTGOL3 WEST Gold print archive year 3 item
ICT1 WESTSIL3 WEST Silver print archive year 3 item
ICT1 WESTGOL6 WEST Gold print archive year 6 item
ICT1 SALPROB1 SAL barcoding problem - no SAL copy or record in Unicorn
ICT1 SALPROB2 SAL barcoding problem - may have same copy under two libs
ICT1 BW-PARENT For records that are bound-with parent records
ICT1 BW-CHILD For records that are bound-with child records
ICT1 M-MARCADIA Mark Music records that should go to Marcadia
ICT1 LEVEL3OCLC Level 3 record based on OCLC's level 3 standard
ICT1 PC-PERM Permanent Course Reserve items donated by Instructors
ICT1 E-THESIS Stanford electronic theses and dissertations
ICT1 EEM Everyday electronic material
ICT1 WESTBRO1 WEST Bronze print archive year 1 item
ICT1 WESTSIL1 WEST Silver print archive year 1 item
ICT1 WESTGOL1 WEST Gold print archive year 1 item
ICT1 WESTSIL5 WEST Silver print archive year 5 item
ICT1 WESTGOL5 WEST Gold print archive year 5 item
ICT1 WESTSIL6 WEST Silver print archive year 6 item
ICT1 WESTGOL7 WEST Gold print archive year 7 item
ICT1 WESTSIL7 WEST Silver print archive year 7 item
ICT1 WESTGOL8 WEST Gold print archive year 8 item
ICT1 WESTSIL8 WEST Silver print archive year 8 item
ICT1 WESTSIL9 WEST Silver print archive year 9 item
ICT1 WESTGOL9 WEST Gold print archive year 9 item
ICT2 TAX=8.25 taxable Santa Clara County
ICT2 NONTAXABLE non taxable
ICT2 DIGI-SENT item has been sent to digital scanning process
ICT2 UNKNOWN
ICT2 JUVENILE juvenile
ICT2 DIGI-SCAN item has been digitally scanned
ICT2 PHYSTOR Physics item designated for non-pageable storage
ICT2 PDIGI-SENT Physics DIGI-SENT item designated for non-pageable storage
ICT2 FED-WEED Code for running federal document weeding reports
ICT2 DIGI-SDR Item's digital scan has been accessioned to SDR

dlrueda commented 2 years ago

Yes, we haven’t thought about it yet, but we will need to somehow map at least some of the info in item cat2 into FOLIO, so we might as well add it to the extract now

ahafele commented 2 years ago

I made a ticket on the data migration epic for itemcat2 stuff

shelleydoljack commented 2 years ago

@ahafele the bound-with child records have call numbers with volume information and our current mapping for volume information is to put it in the item record. Since we are not creating items in FOLIO for bw-children, should we map the volume info to the holdings record?

ahafele commented 2 years ago

Hmm, and I assume there could be extended info notes as well. Let me check with Vitus.

shelleydoljack commented 2 years ago

Let's try to put the BW-CHILD item info in a separate tsv file, named like ckey_range.tsv.bw-child.tsv. We want to have these columns:

CATKEY FORMAT CALL_NUMBER_TYPE BASE_CALL_NUMBER VOLUME_INFO BARCODE LIBRARY HOMELOCATION CURRENTLOCATION ITEM_TYPE

where BARCODE = the barcode of the BW-PARENT.
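
A sketch of how a downstream script might read that file, assuming it is tab-delimited with no header row (both assumptions on my part). The sample row is entirely made up.

```python
import csv
import io

# Column order as proposed above.
BW_CHILD_COLUMNS = [
    "CATKEY", "FORMAT", "CALL_NUMBER_TYPE", "BASE_CALL_NUMBER", "VOLUME_INFO",
    "BARCODE", "LIBRARY", "HOMELOCATION", "CURRENTLOCATION", "ITEM_TYPE",
]


def read_bw_child_rows(handle) -> list:
    """Read bw-child rows into dicts keyed by column name."""
    reader = csv.DictReader(handle, fieldnames=BW_CHILD_COLUMNS,
                            delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


# Hypothetical row; BARCODE holds the BW-PARENT barcode, not the child's.
sample = ("123\tMARC\tSUDOC\tC 13.46:553\t\t36105000000000\t"
          "GREEN\tSTACKS\tSTACKS\tSTKS\n")
rows = read_bw_child_rows(io.StringIO(sample))
print(rows[0]["CATKEY"], rows[0]["BARCODE"])
```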

ahafele commented 2 years ago

@shelleydoljack Vitus says put the bw child’s volume info in the bw child’s holdings record. But he did bring up the question of how this might impact SearchWorks. Not sure if we need to worry about that since SW will have to handle the new data model anyway.

shelleydoljack commented 2 years ago

@ahafele for the item extended info notes on the BW-CHILD, we could map to the different holdings notes, but what about multiple children with multiple item extended info notes? It doesn't seem like FOLIO prevents us from creating item records for the bound-with child. If we create items in FOLIO for the bound-with children, then we could preserve the item extended info notes with the correct item/copy.

ahafele commented 2 years ago

@shelleydoljack I'm chatting with Vitus on slack about this so copying his response here. tldr: bw child will only be one item. We shouldn't have scenarios in which multiple children and multiple item extended info notes would need to be merged into the same holdings.

"I am not quite able to envision Shelley's scenario. A bw child is by definition one item. I suppose you could bind two same children into the same physical volume, but that seems like an extremely rare situation. It is possible that the same volume has multiple copies, and one of them is a bw. In that case, by putting the volume info in the bw holdings record, we've created a unique holdings record. The other copies would have real item records with barcodes, etc. So, they won't be all linked to the same bw holdings record. So, how would multiple items merge into one bw holdings? In that model, can one bw holdings record be linked to multiple parent item records, or is it a one-to-one kind of thing?"

shelleydoljack commented 2 years ago

@dlrueda we had talked about the scenario where there are multiple BW-CHILD items on a single bib, and what the 590 looks like in that case. I did selitem -eBW-CHILD -oCB | cut -f1 -d"|" | sort | uniq -d and there are 3,260 BW-CHILD item records with the same ckey. For example: ckey 976304 has 4 items that are BW-CHILD that are bound to different parent records. This scenario will make figuring out the parent record's barcode difficult. The 590 data for ckey 976304 is:

C 13.46:553 bound with C 13.46:548. 13824553(parent record's ckey).
C 13.46:747 bound with C 13.46:744. 13826882(parent record's ckey).
C 13.46:881 bound with C 13.46:872. 13834484(parent record's ckey).
C 13.46:947 bound with C 13.46:939. 480816(parent record's ckey).

We will need to create 4 folio holdings records based on the uniqueness of the BW-CHILD item's call number + library + location. We will also have to use the two call numbers in the 590 note to find the BW-PARENT barcode. yuck. Eww, and these BW-PARENT ckeys 13824553 and 480816 have their own, other BW-CHILD items!
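
To get a feel for how much parsing this means, here is a sketch that pulls the two call numbers and the parent ckey out of a 590 note shaped like the four above. The regex is an assumption based only on these examples ("<child call number> bound with <parent call number>. <parent ckey>"); real 590 notes will surely vary more than this.

```python
import re

# Child call number, the literal " bound with ", parent call number, then
# ". " and the parent ckey as digits. Anything after the ckey is ignored.
BW_590 = re.compile(
    r"^(?P<child_call_number>.+?) bound with "
    r"(?P<parent_call_number>.+?)\. (?P<parent_ckey>\d+)"
)


def parse_bw_590(note: str):
    """Return a dict of the three parts, or None if the note does not match."""
    match = BW_590.match(note.strip())
    return match.groupdict() if match else None


notes = [
    "C 13.46:553 bound with C 13.46:548. 13824553",
    "C 13.46:947 bound with C 13.46:939. 480816",
]
for note in notes:
    print(parse_bw_590(note))
```

Notes that return None would need to go to a separate file for staff review.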

dlrueda commented 2 years ago

OMG

ahafele commented 1 year ago

I asked Vitus his opinion on what the Child's holdings locations should be and he agrees that we should follow the Lane model, but instead use a location like - Bound with child, see parent. We would need to set these up for each library though. @dlrueda does that sound ok to you? I guess we can wait and see what Implementers SIG says also.

shelleydoljack commented 1 year ago

Some data on the BW-CHILD bib records:

Per post-standup on 11/10/22 we want to write a script that will put the BW-PARENT barcode in the 590 $b, the way it is for about 2,000 590 notes in BW-CHILD records. Then in the generate marc and item tsv script for folio, another script would be used to update the barcode field in the items tsv of the BW-CHILD to what is in the 590 $b.

shelleydoljack commented 1 year ago

Just updating here with the latest. There is a script in /s/SUL/OneTime/BoundWiths/update_bwchild_590.pl on bodoni that creates pipe-delim files of:

  1. child bib has 1 or more 590 notes and parent bib has only 1 BW-PARENT item
  2. child bib has 1 or more 590 notes and parent bib has more than 1 callnum with 1 BW-PARENT item
  3. child bib has 1 or more 590 notes and parent bib has more than 1 BW-PARENT item

For the first file, the "simple case", I will write more code to create editmarc files to use with edmarc_on_dir.pl (copied from https://github.com/sul-dlss/Eloader/blob/main/edmarc_on_dir.pl) to update the 590 fields of the child records with the parent barcode. For the other 2 categories, we could have staff identify the right bw-parent in Symphony.
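
The three-way split can be sketched like this. This is not the Perl in update_bwchild_590.pl, just an illustration of the decision, and it assumes the "simple case" means exactly one callnum and one BW-PARENT item on the parent bib. The function and label names are made up.

```python
def classify_parent(bw_parent_item_count: int, call_number_count: int):
    """Return which of the three files a child/parent pair would land in."""
    if bw_parent_item_count > 1:
        return "multiple_bw_parent_items"   # category 3: needs staff review
    if bw_parent_item_count == 1 and call_number_count > 1:
        return "multiple_callnums"          # category 2: needs staff review
    if bw_parent_item_count == 1:
        return "simple"                     # category 1: can be scripted
    return None                             # no BW-PARENT item on the parent bib


for counts in [(1, 1), (1, 3), (2, 2), (0, 1)]:
    print(counts, classify_parent(*counts))
```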

I should also mention that the script also creates pipe-delim files where the bwparent ckey is missing but the flexkey reference is still around and the bwparent ckey is missing but the item barcode reference is still around. These files will also need to be looked at by staff to either withdraw the bwchild record or fix the 590 note.

shelleydoljack commented 1 year ago

Log output for the 12/9/22 run on Bodoni:

shelleydoljack commented 1 year ago

The script generate_marc_items_tsv.ksh now calls find_bwparents.pl, which gets the parent barcode and library and puts it in the bwchild tsv. The script items_to_50K_ranges.pl was updated to split the bwchild tsv into the corresponding ckey ranges. The bwchild tsv filenames look like ".tsv.bwchild.tsv".
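
For anyone reading the bwchild files downstream, the range split works roughly like the sketch below. The exact range boundaries used by items_to_50K_ranges.pl are an assumption here (blocks of 50,000 ckeys starting at 0), and the sample rows are abbreviated to just a ckey and a call number.

```python
from collections import defaultdict

RANGE_SIZE = 50_000  # assumed from the script name


def ckey_range(ckey: int, size: int = RANGE_SIZE) -> tuple:
    """Return the (first, last) ckey of the block containing this ckey."""
    start = (ckey // size) * size
    return start, start + size - 1


def split_by_range(lines) -> dict:
    """Group tab-delimited rows by the ckey range of their first column."""
    groups = defaultdict(list)
    for line in lines:
        ckey = int(line.split("\t", 1)[0])
        groups[ckey_range(ckey)].append(line)
    return dict(groups)


# Abbreviated rows: ckey and call number only.
rows = ["976304\tC 13.46:553", "976304\tC 13.46:747", "13824553\tC 13.46:548"]
for block, members in sorted(split_by_range(rows).items()):
    print(block, len(members))
```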