NatLibFi / bib-rdf-pipeline

Scripts and configuration for converting MARC bibliographic records into RDF
Creative Commons Zero v1.0 Universal

Non-authorized persons not merged with authorized ones #77

Closed by osma 6 years ago

osma commented 6 years ago

For example, "Abckiria" (W00097789600) shows two authors called "Agricola, Mikael, noin 1510-1557". One of them is authorized, the other is not. This is a consequence of 3717fe701e3cfee66931e88974a2d9313973d103, which fixed a crash but did so in a way that prevents merging of authorized and non-authorized persons.

osma commented 6 years ago

Actually this seems to be a missing feature in marc2bibframe2: it doesn't extract the person ID from 600 $0 when the 600 field represents a work ($t used). So although the authorized person ID is in the data for this record, it doesn't end up in the BIBFRAME data:

006128684 60014 L $$aAgricola, Mikael,$$dnoin 1510-1557.$$tAbckiria.$$0(FIN11)000103346

result:

    <bf:subject>
      <bf:Work rdf:about="http://urn.fi/URN:NBN:fi:bib:me:006128684#Work600-27">
        <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#NameTitle"/>
        <madsrdf:authoritativeLabel>Agricola, Mikael, noin 1510-1557. Abckiria.</madsrdf:authoritativeLabel>
        <bf:contribution>
          <bf:Contribution>
            <bf:agent>
              <bf:Agent rdf:about="http://urn.fi/URN:NBN:fi:bib:me:006128684#Agent600-27">
                <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/Person"/>
                <bflc:name00MatchKey>Agricola, Mikael, noin 1510-1557.</bflc:name00MatchKey>
                <bflc:name00MarcKey>60014$aAgricola, Mikael,$dnoin 1510-1557.$tAbckiria.$0(FIN11)000103346</bflc:name00MarcKey>
                <rdfs:label>Agricola, Mikael, noin 1510-1557.</rdfs:label>
              </bf:Agent>
            </bf:agent>
            <bf:role>
              <bf:Role rdf:about="http://id.loc.gov/vocabulary/relators/ctb"/>
            </bf:role>
          </bf:Contribution>
        </bf:contribution>
        <rdfs:label>Abckiria.</rdfs:label>
        <bf:title>
          <bf:Title>
            <bflc:title00MatchKey>Abckiria.</bflc:title00MatchKey>
            <bflc:title00MarcKey>60014$aAgricola, Mikael,$dnoin 1510-1557.$tAbckiria.$0(FIN11)000103346</bflc:title00MarcKey>
            <rdfs:label>Abckiria.</rdfs:label>
            <bflc:titleSortKey>Abckiria.</bflc:titleSortKey>
            <bf:mainTitle>Abckiria</bf:mainTitle>
          </bf:Title>
        </bf:title>
        <bf:identifiedBy>
          <bf:Identifier>
            <rdf:value>000103346</rdf:value>
            <bf:source>
              <bf:Source>
                <rdfs:label>FIN11</rdfs:label>
              </bf:Source>
            </bf:source>
          </bf:Identifier>
        </bf:identifiedBy>
      </bf:Work>
    </bf:subject>

Note there is no identifier for the Agent.

Merging non-authorized persons with authorized persons would fix this, and is probably worthwhile, but it would be better to address the underlying issue in marc2bibframe2.

osma commented 6 years ago

Actually the identifier is there in the BIBFRAME output, but it is represented as the identifier of the subject Work, not the Agent.
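
The observation above can be checked mechanically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `SAMPLE` string is a trimmed copy of the output quoted earlier, with an `rdf:RDF` wrapper and namespace declarations added as an assumption so that it parses on its own. The helper name `direct_identifiers` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bf": "http://id.loc.gov/ontologies/bibframe/",
}

# Trimmed copy of the converter output; the rdf:RDF wrapper and xmlns
# declarations are added here (assumption) so the fragment parses standalone.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:bf="http://id.loc.gov/ontologies/bibframe/">
  <bf:Work rdf:about="http://urn.fi/URN:NBN:fi:bib:me:006128684#Work600-27">
    <bf:contribution>
      <bf:Contribution>
        <bf:agent>
          <bf:Agent rdf:about="http://urn.fi/URN:NBN:fi:bib:me:006128684#Agent600-27">
            <rdfs:label>Agricola, Mikael, noin 1510-1557.</rdfs:label>
          </bf:Agent>
        </bf:agent>
      </bf:Contribution>
    </bf:contribution>
    <bf:identifiedBy>
      <bf:Identifier>
        <rdf:value>000103346</rdf:value>
        <bf:source>
          <bf:Source>
            <rdfs:label>FIN11</rdfs:label>
          </bf:Source>
        </bf:source>
      </bf:Identifier>
    </bf:identifiedBy>
  </bf:Work>
</rdf:RDF>
"""


def direct_identifiers(node):
    """Return the rdf:value of identifiers attached directly to this node."""
    path = "bf:identifiedBy/bf:Identifier/rdf:value"
    return [value.text for value in node.findall(path, NS)]


root = ET.fromstring(SAMPLE)
work = root.find("bf:Work", NS)
agent = work.find("bf:contribution/bf:Contribution/bf:agent/bf:Agent", NS)

work_ids = direct_identifiers(work)
agent_ids = direct_identifiers(agent)

print("Work identifiers: ", work_ids)
print("Agent identifiers:", agent_ids)
```

The Work carries the FIN11 identifier from $0, while the nested Agent has none, which is why the person cannot be matched to its authorized form from this output alone.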

osma commented 6 years ago

Asked on the BIBFRAME mailing list. Meanwhile, I will try to merge non-authorized and authorized persons.
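
To illustrate what such a merge could look like, here is a rough sketch in Python. It is not the pipeline's actual implementation, which this thread does not show. It assumes persons are matched on a normalized label and that a non-authorized person is only merged when exactly one authorized person shares that label. The function names, the tuple layout, and the `#Agent100-11` URI are hypothetical; the `#Agent600-27` URI and the ID `000103346` come from the record above.

```python
from collections import defaultdict


def normalize(label):
    """Normalize a person label: trim, drop trailing punctuation, case-fold."""
    return label.strip().rstrip(".,").casefold()


def merge_persons(persons):
    """Map each person URI to a canonical URI.

    persons: iterable of (uri, label, authority_id_or_None) tuples.
    A group of same-label persons is merged only if it contains exactly
    one authorized person; otherwise every URI maps to itself.
    """
    groups = defaultdict(list)
    for uri, label, authority_id in persons:
        groups[normalize(label)].append((uri, authority_id))

    mapping = {}
    for members in groups.values():
        authorized = [uri for uri, authority_id in members if authority_id]
        if len(authorized) == 1:
            # Unambiguous: fold everyone into the single authorized person.
            for uri, _ in members:
                mapping[uri] = authorized[0]
        else:
            # Zero or several authorized candidates: leave untouched.
            for uri, _ in members:
                mapping[uri] = uri
    return mapping


BASE = "http://urn.fi/URN:NBN:fi:bib:me:006128684"
persons = [
    # Hypothetical authorized person, e.g. from a field where $0 was extracted.
    (BASE + "#Agent100-11", "Agricola, Mikael, noin 1510-1557", "000103346"),
    # Non-authorized person from the 600 field shown earlier in this issue.
    (BASE + "#Agent600-27", "Agricola, Mikael, noin 1510-1557.", None),
]

mapping = merge_persons(persons)
for source, target in mapping.items():
    print(source, "->", target)
```

Refusing to merge when several authorized persons share a label is a deliberately cautious choice, since identical labels alone do not prove that two authority records describe the same person.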