Open jcateswellcome opened 1 month ago
Some notes from reviewing the conversation so far:
Relevant MARC fields:
Questions:
561 is missing here: https://wellcomecollection.org/works/yw9umag7 https://wellcomecollection.org/works/dwg8xwrc Why?
561 does appear here: https://wellcomecollection.org/works/u4hc2hwe Why?
561 is mentioned in the catalogue pipeline: https://github.com/search?q=repo%3Awellcomecollection%2Fcatalogue-pipeline+561+language%3AScala&type=code&l=Scala How does this result in 561 appearing on works?
Some terms of interest:
To answer:
561 is mentioned in the catalogue pipeline: https://github.com/search?q=repo%3Awellcomecollection%2Fcatalogue-pipeline+561+language%3AScala&type=code&l=Scala How does this result in 561 appearing on works?
Of the 3 relevant MARC fields mentioned by Collection Information, we only appear to extract information from MARC 561, and then only in specific circumstances. The logic for this is contained in the MarcNotes class, here.
There is a helpful comment indicating:
/** Conditionally create an Ownership note, depending on the privacy of the
* original note.
*
* A note is only created if it is explicitly marked as "Not Private" by
* setting first indicator to 1
*
* https://www.loc.gov/marc/bibliographic/bd561.html
*/
Notes are added as a top-level attribute of a work and can be one of a constrained set of types listed here, e.g. "binding-detail", "exhibitions-note" and "ownership-note".
wellcomecollection.org will render these with labels provided by the catalogue API; see this example.
wellcomecollection.org: https://wellcomecollection.org/works/u4hc2hwe
API: https://api.wellcomecollection.org/catalogue/v2/works/u4hc2hwe?include=notes
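To check which notes a work carries without opening the site, the API response can be filtered by note type. The sketch below is illustrative only: it assumes each entry in `notes` has a `noteType.id` and a list of `contents`, and the sample payload is hand-written to show that assumed shape. Both should be checked against the live response at the URL above. The helper names are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://api.wellcomecollection.org/catalogue/v2/works"


def notes_url(work_id: str) -> str:
    """Build the works URL with notes included, as in the link above."""
    return f"{API_ROOT}/{work_id}?include=notes"


def notes_of_type(work: dict, note_type_id: str) -> list[str]:
    """Collect the contents of every note whose type id matches."""
    found = []
    for note in work.get("notes", []):
        # Assumed shape: {"noteType": {"id": ...}, "contents": [...]}
        if note.get("noteType", {}).get("id") == note_type_id:
            found.extend(note.get("contents", []))
    return found


def fetch_work(work_id: str) -> dict:
    """Fetch a work from the catalogue API (requires network access)."""
    with urllib.request.urlopen(notes_url(work_id)) as response:
        return json.load(response)


# Hypothetical payload, hand-written to show the assumed shape.
sample_work = {
    "id": "example",
    "notes": [
        {"noteType": {"id": "ownership-note"}, "contents": ["Example ownership text"]},
        {"noteType": {"id": "binding-detail"}, "contents": ["Example binding text"]},
    ],
}

print(notes_url("u4hc2hwe"))
print(notes_of_type(sample_work, "ownership-note"))
```

Against the live API, `notes_of_type(fetch_work("u4hc2hwe"), "ownership-note")` would be the equivalent call, if the response shape matches the assumption.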
Now we can look to answer:
561 is missing here: https://wellcomecollection.org/works/yw9umag7 https://wellcomecollection.org/works/dwg8xwrc Why?
yw9umag7 is sourced from b33389263, which is missing a value in indicator 1, so it is treated as private and not transformed onto a work.
dwg8xwrc is sourced from b3347817xz, [which contains a "1" value in indicator 1](https://reporting.wellcomecollection.org/s/sierra/app/r/s/s8Rmz) and the values "MS. annotations in Latin in two different hands., UkLW", so the ownership note "MS. annotations in Latin in two different hands" appears.
I believe subfield 5 (where the UkLW value is contained) is suppressed when transforming notes.
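The behaviour described above can be sketched as follows. This is a minimal Python illustration, not the pipeline's Scala code. The `VarField` shape, the `ownership_note` helper, and the placement of the note text in subfield `a` are assumptions made for the example. Only the indicator rule and the apparent suppression of subfield 5 come from the notes above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VarField:
    """Hypothetical, simplified stand-in for a Sierra varField."""
    marc_tag: str
    ind1: Optional[str] = None
    # (subfield tag, content) pairs, in record order
    subfields: list[tuple[str, str]] = field(default_factory=list)


# Assumption: subfield 5 (institution code, e.g. "UkLW") is not shown.
SUPPRESSED_SUBFIELDS = {"5"}


def ownership_note(var_field: VarField) -> Optional[str]:
    """Return ownership note text for a 561 field, or None if it is skipped."""
    if var_field.marc_tag != "561":
        return None
    # Only an explicit first indicator of "1" ("Not Private") yields a note;
    # a missing indicator is treated the same as private.
    if var_field.ind1 != "1":
        return None
    contents = [
        content
        for tag, content in var_field.subfields
        if tag not in SUPPRESSED_SUBFIELDS
    ]
    text = " ".join(contents).strip()
    return text or None


# Mirrors the b33389263 case: no value in indicator 1, so no note.
no_indicator = VarField("561", None, [("a", "placeholder text")])
# Mirrors the second case: indicator 1 is "1" and subfield 5 holds "UkLW".
not_private = VarField(
    "561",
    "1",
    [("a", "MS. annotations in Latin in two different hands."), ("5", "UkLW")],
)

print(ownership_note(no_indicator))  # None
print(ownership_note(not_private))
```

Treating a blank indicator the same as "0" follows the comment quoted earlier, which only creates a note when the field is explicitly marked "Not Private".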
Some further questions to answer using the Sierra data available:
An addendum to this ^: I haven't found anywhere that we pay attention to any of these fields for item level records although there are item records with data in the 561 fields. Querying #collection-info to see if this is expected behaviour.
How many bib records have 561 fields with contents?
varField.marcTag : 561 AND parent.recordType : "bibs" [7892 hits]
Of the bib records with 561 fields, how often is the first indicator set to 1, i.e. explicitly marked "not private"?
varField.marcTag : 561 AND parent.recordType : "bibs" and varField.ind1 : "1" [5667 hits]
Of these "not private" records:
https://reporting.wellcomecollection.org/s/sierra/app/r/s/W5o0e
Questions relating to the 541 field:
How many bib records have 541 fields with contents?
varField.marcTag : 541 AND parent.recordType : "bibs" [23718]
These break down like this:
https://reporting.wellcomecollection.org/s/sierra/app/r/s/TKTzT
Of the bib records with 541 fields, how often is the first indicator set to 1, i.e. explicitly marked "not private"?
varField.marcTag : 541 AND parent.recordType : "bibs" and varField.ind1 : "1" [586]
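For comparison, the hit counts above can be turned into proportions. This small sketch uses only the 561 and 541 figures quoted in these notes. The counts are hits in the reporting cluster, so treating each hit as a distinct bib record is an assumption.

```python
# Hit counts copied from the queries above: (all hits, hits with ind1 = "1").
counts = {
    "561": (7892, 5667),
    "541": (23718, 586),
}


def not_private_share(total: int, not_private: int) -> float:
    """Percentage of hits explicitly marked "not private"."""
    return 100 * not_private / total


for tag, (total, not_private) in counts.items():
    share = not_private_share(total, not_private)
    print(f"{tag}: {not_private} of {total} ({share:.1f}%)")
```

On these figures roughly 72% of 561 hits, but only about 2.5% of 541 hits, are explicitly marked "not private".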
Questions relating to the 591 field:
How many bib records have 591 fields with contents (and are there any patterns in the contents)?
varField.marcTag : 591 AND parent.recordType : "bibs" [149114]
These break down like this (the sample covers only 5000 records; the full dataset is much larger):
Can we tell if the data in 591 fields should be private?
I can't! But #collections-info may be able to help work this out ... none of these fields have a value for indicator 1, but as this is a locally defined field perhaps there is a Wellcome Collection standard for understanding privacy here? Otherwise we can inspect the 591 data for patterns that we think match private data?
To summarise for now:
This is about as far as I can get using the reporting cluster data.
Slack thread with more context: https://wellcome.slack.com/archives/CGXDT2GSH/p1723802773006609
Background
Currently the data element 'ownership and custodial history' is not displaying on the works page for records from Sierra.
This is MARC element 561
Given the nature of the collection, there are a range of
Sample record: https://wellcomecollection.org/works/yw9umag7 does not contain much information, whereas the 561 contains a great deal of information relating to the history of the item, as can be seen in the reporting cluster.
Consider how this transformation could be accommodated in the work model. For example, can we map it as with archival collections, which show an 'ownership note'?
This should be approached collaboratively, i.e. with collections information. This will help us better understand the user need and rationale for accessing this.
Part of: #5757