uwlib-cams / rml

Using the RML Mapper (https://github.com/RMLio/rmlmapper-java) to convert RDA data to BIBFRAME.
Creative Commons Zero v1.0 Universal

RDA/RDF Manifestation data on classification not in BIBFRAME Instance data #31

Closed gerontakos closed 3 years ago

gerontakos commented 3 years ago

Our RDA/RDF Manifestation data sometimes includes classification data as follows:

<> ns2:hasLcClassificationPartA "PZ7.D664"@eng ;
     ns2:hasLcClassificationPartB "Han 1929"@eng .

This does not get transformed into BIBFRAME Instance data, probably because it is marked "not mapped" in the "M" tab of our rda2bf_rdaExtension sheet.

Noteworthy: the same data exists in RDA/RDF Item data and it does get transformed into BIBFRAME Item data.

briesenberg07 commented 3 years ago

Added mappings to rda2bf_rdaExtension > M tab.

Note that the "need-to-place-multiple-values-within-the-same-blank-node-in-converted-BIBFRAME" issue arises here. For example, an rda:Manifestation may have:

<> <https://doi.org/10.6069/uwlib.55.d.4#hasLcClassificationPartA> "BMR01IDK" ;
     <https://doi.org/10.6069/uwlib.55.d.4#hasLcClassificationPartB> "IDKINAC" .

And output bf:Instance should have:

<> bf:classification [ a bf:ClassificationLcc ;
     bf:classificationPortion "BMR01IDK" ;
     bf:itemPortion "IDKINAC" ] .

Same applies to:

Thought this worth mentioning to @mcm104 because I believe that a mechanism exists to combine such values within a given bnode (but I don't know exactly where it is in the code).

mcm104 commented 3 years ago

I'm having an issue with these properties in manifestations...

It's easy enough to have Part A and Part B in the same blank node like in Ben's example, IF there is only one classification number for a given resource. I didn't notice any problems with this for items, but I was just testing out some manifestations and I immediately ran into an example that had multiple values for hasLcClassificationPartA/B.

From 0010d9b0-ec29-4ab0-b25a-9a1a87108dd2.xml:

<rdf:Description rdf:about="https://api.sinopia.io/resource/0010d9b0-ec29-4ab0-b25a-9a1a87108dd2">
    <rdax:hasLcClassificationPartA xml:lang="zxx">PG1419.22.A34</rdax:hasLcClassificationPartA>
    <rdax:hasLcClassificationPartB xml:lang="zxx">M33 2003</rdax:hasLcClassificationPartB>
    <rdax:hasLcClassificationPartA xml:lang="zxx">PZ70.S42</rdax:hasLcClassificationPartA>
    <rdax:hasLcClassificationPartB xml:lang="zxx">L356 2003</rdax:hasLcClassificationPartB>
</rdf:Description>

When we have multiple part As and Bs, our combined blank nodes end up looking like this:

<rdf:Description rdf:nodeID="N2b9cd179d5e84915a1b69c7b14729350">
    <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/ClassificationLcc"/>
    <bf:classificationPortion>PG1419.22.A34</bf:classificationPortion>
    <bf:itemPortion>M33 2003</bf:itemPortion>
    <bf:classificationPortion>PZ70.S42</bf:classificationPortion>
    <bf:itemPortion>L356 2003</bf:itemPortion>
</rdf:Description>

This seems problematic to me because you can't tell which A goes with which B -- but then again, you can't tell that from the original RDA, either!
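To make the ambiguity concrete, here is a minimal Python sketch (not part of the repo's RML). It treats each property's values as an unordered set, as RDF does, and lists every way the part As could be paired with the part Bs. The function name `candidate_pairings` is hypothetical, and the sketch assumes there are equally many part A and part B values.

```python
from itertools import permutations

# Values from the example record above. RDF statements carry no order,
# so each property's values are modelled as a set.
part_a = {"PG1419.22.A34", "PZ70.S42"}
part_b = {"M33 2003", "L356 2003"}


def candidate_pairings(a_values, b_values):
    """Return every one-to-one pairing of part A values with part B values.

    Assumes len(a_values) == len(b_values); hypothetical helper.
    """
    a_sorted = sorted(a_values)
    return [list(zip(a_sorted, perm)) for perm in permutations(sorted(b_values))]


pairings = candidate_pairings(part_a, part_b)
for pairing in pairings:
    print(pairing)
```

With two values on each side there are two candidate pairings, and nothing in the data says which one the cataloger meant.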

The only way to separate these would be to separate all of them, i.e.:

<rdf:Description rdf:nodeID="bnode1">
    <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/ClassificationLcc"/>
    <bf:classificationPortion>PG1419.22.A34</bf:classificationPortion>
</rdf:Description>
<rdf:Description rdf:nodeID="bnode2">
    <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/ClassificationLcc"/>
    <bf:itemPortion>M33 2003</bf:itemPortion>
</rdf:Description>
<rdf:Description rdf:nodeID="bnode3">
    <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/ClassificationLcc"/>
    <bf:classificationPortion>PZ70.S42</bf:classificationPortion>
</rdf:Description>
<rdf:Description rdf:nodeID="bnode4">
    <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/ClassificationLcc"/>
    <bf:itemPortion>L356 2003</bf:itemPortion>
</rdf:Description>

which doesn't seem any better.

Is this just a mistake in the original RDA, or is it something that needs to be accounted for? @gerontakos @briesenberg07 @CECSpecialistI @JianPLee

gerontakos commented 3 years ago

My opinion: put this "on the back burner" but it's something we'll have to deal with at some point. I would say just output it all into a single bnode and cry yourself to sleep for now. It's a problem with our extension. One thing we can do: change the range of the properties so that the expected value is a node; then create (in our extension) the required class to type the node, and keep class numbers together as a distinct resource. Does anyone have a better idea?
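As a rough illustration of what keeping each class number together as a distinct resource could buy us, here is a Python sketch that emits one bf:classification bnode per call number. The `CallNumber` class and `classification_bnodes` function are hypothetical, the A/B pairing shown is assumed purely for illustration (the source data does not establish it), and literal escaping is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallNumber:
    """Hypothetical container that keeps one part A with its part B."""
    part_a: str
    part_b: str


def classification_bnodes(call_numbers):
    """Render one bf:classification blank node per call number as Turtle.

    Sketch only: values are not escaped and prefixes are not declared.
    """
    blocks = []
    for cn in call_numbers:
        blocks.append(
            "<> bf:classification [ a bf:ClassificationLcc ;\n"
            f'     bf:classificationPortion "{cn.part_a}" ;\n'
            f'     bf:itemPortion "{cn.part_b}" ] .'
        )
    return "\n\n".join(blocks)


# Assumed pairing, for illustration only.
turtle = classification_bnodes([
    CallNumber("PG1419.22.A34", "M33 2003"),
    CallNumber("PZ70.S42", "L356 2003"),
])
print(turtle)
```

Once the pairs are kept together in the source data, the conversion no longer has to guess which A goes with which B.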

briesenberg07 commented 3 years ago

This is interesting! I would think our first step here would be to find out why an rda:Manifestation might have two classification part As and/or classification part Bs.

I would hold off on adding any additional terms to the UW RDA Extension vocab to accommodate this data until we understand the data better. I suspect that there is a (MARC?) cataloging practice behind these multiple classification values that we (or at least I) don't quite understand?

mcm104 commented 3 years ago

I think my concerns may have been premature -- it looks like the example I posted (which just happened to be the first one I checked) is our only manifestation to have this issue. I would assume that means it's an error, but either way, it seems ignorable.
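For anyone who wants to repeat the check, here is a minimal Python sketch of one way to find resources with more than one hasLcClassificationPartA in RDF/XML. It is not necessarily how the check was actually done. It assumes the rdax prefix is bound to https://doi.org/10.6069/uwlib.55.d.4# and that resources are serialized as rdf:Description elements. The function name and the second resource in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: rdax is bound to the UW RDA extension namespace.
RDAX_NS = "https://doi.org/10.6069/uwlib.55.d.4#"


def subjects_with_repeated_part_a(rdf_xml):
    """Return subjects that carry more than one hasLcClassificationPartA."""
    root = ET.fromstring(rdf_xml)
    counts = Counter()
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        if subject is None:
            continue  # skip blank nodes
        counts[subject] += len(desc.findall(f"{{{RDAX_NS}}}hasLcClassificationPartA"))
    return sorted(s for s, n in counts.items() if n > 1)


sample = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:rdax="{RDAX_NS}">
  <rdf:Description rdf:about="https://api.sinopia.io/resource/0010d9b0-ec29-4ab0-b25a-9a1a87108dd2">
    <rdax:hasLcClassificationPartA xml:lang="zxx">PG1419.22.A34</rdax:hasLcClassificationPartA>
    <rdax:hasLcClassificationPartB xml:lang="zxx">M33 2003</rdax:hasLcClassificationPartB>
    <rdax:hasLcClassificationPartA xml:lang="zxx">PZ70.S42</rdax:hasLcClassificationPartA>
    <rdax:hasLcClassificationPartB xml:lang="zxx">L356 2003</rdax:hasLcClassificationPartB>
  </rdf:Description>
  <!-- Hypothetical second resource with a single call number -->
  <rdf:Description rdf:about="https://example.org/resource/single-call-number">
    <rdax:hasLcClassificationPartA xml:lang="zxx">PZ7.D664</rdax:hasLcClassificationPartA>
    <rdax:hasLcClassificationPartB xml:lang="zxx">Han 1929</rdax:hasLcClassificationPartB>
  </rdf:Description>
</rdf:RDF>"""

repeated = subjects_with_repeated_part_a(sample)
print(repeated)
```

Running the same function over each manifestation file would list every resource affected.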

mcm104 commented 3 years ago

Classification properties now included in RML for manifestations.

JianPLee commented 3 years ago

It is not unusual for a bib record to have two call numbers. Sometimes if a cataloger does not agree with the call number that's already there, the cataloger can add another one.