lcnetdev / marc2bibframe2

Convert MARC records to BIBFRAME2 RDF
http://www.loc.gov/bibframe/
Creative Commons Zero v1.0 Universal

Incorrect URI for bf:Hub #225

Open RichardWallis opened 2 years ago

RichardWallis commented 2 years ago

Scope note: This issue has been identified in V1.7.0 and V1.7.1. It has not been checked in V2.0.0 because I was unable to run that version (see issue #224).

See the following snippet of RDF/XML:

    <bf:relatedTo>
      <bf:Hub rdf:about="http://id.loc.gov/authorities/names/no2004094953">
        <bf:contribution>
          <bf:Contribution>
            <bf:agent>
              <bf:Agent rdf:about="http://example.org/204092656#Agent700-28">
                <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/Person"/>
                <bflc:name00MatchKey>Lee, Dick, 1956-</bflc:name00MatchKey>
                <bflc:name00MarcKey>7001 $aLee, Dick,$d1956-$tOur Singapore.$0http://id.loc.gov/authorities/names/no2004094953$1http://viaf.org/viaf/267475219</bflc:name00MarcKey>
                <rdfs:label>Lee, Dick, 1956-</rdfs:label>
              </bf:Agent>
            </bf:agent>
            <bf:role>
              <bf:Role rdf:about="http://id.loc.gov/vocabulary/relators/ctb"/>
            </bf:role>
          </bf:Contribution>
        </bf:contribution>
        <bf:title>
          <bf:Title>
            <bflc:title00MatchKey>Our Singapore</bflc:title00MatchKey>
            <bflc:title00MarcKey>7001 $aLee, Dick,$d1956-$tOur Singapore.$0http://id.loc.gov/authorities/names/no2004094953$1http://viaf.org/viaf/267475219</bflc:title00MarcKey>
            <bf:mainTitle>Our Singapore</bf:mainTitle>
          </bf:Title>
        </bf:title>
      </bf:Hub>
    </bf:relatedTo>

For some reason the Hub entity has been given the URI from the 700 $0 value, which is the authoritative URI of a Person. I could understand the Agent entity getting that URI, but a Hub cannot also be a Person.

For reference this is the source 700:

        <marc:datafield tag="700" ind1="1" ind2=" ">
            <marc:subfield code="a"><![CDATA[Lee, Dick,]]></marc:subfield>
            <marc:subfield code="d"><![CDATA[1956-]]></marc:subfield>
            <marc:subfield code="t"><![CDATA[Our Singapore.]]></marc:subfield>
            <marc:subfield code="0"><![CDATA[http://id.loc.gov/authorities/names/no2004094953]]></marc:subfield>
            <marc:subfield code="1"><![CDATA[http://viaf.org/viaf/267475219]]></marc:subfield>
        </marc:datafield>

RichardWallis commented 2 years ago

Still diagnosing, but I seem to be getting a similar problem with madsrdf:ComplexSubject entities being given the same URI as the $0 of a Person or Organisation when that Person or Organisation is one of the components of the ComplexSubject.

This all becomes very apparent when you load the resultant RDF into a triplestore and discover entities of type Person or Organization that are also of type Hub and/or ComplexSubject - types that should be disjoint.
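The same check can be approximated without a triplestore. The following is a minimal sketch, assuming the converter output is RDF/XML using typed node elements and `rdf:type` children as in the snippet above. The `SAMPLE` document, the function names, and the two type sets are illustrative assumptions, not part of marc2bibframe2, and the scan is purely syntactic rather than a full RDF parse.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BF = "http://id.loc.gov/ontologies/bibframe/"
MADS = "http://www.loc.gov/mads/rdf/v1#"

# Assumption: these two groups are treated as mutually exclusive.
AGENT_TYPES = {BF + "Person", BF + "Organization"}
NON_AGENT_TYPES = {BF + "Hub", MADS + "ComplexSubject"}

# Illustrative data only: a Hub and a Person sharing one URI.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bf="http://id.loc.gov/ontologies/bibframe/">
  <bf:Hub rdf:about="http://id.loc.gov/authorities/names/no2004094953"/>
  <bf:Agent rdf:about="http://id.loc.gov/authorities/names/no2004094953">
    <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/Person"/>
  </bf:Agent>
  <bf:Hub rdf:about="http://example.org/hub/1"/>
</rdf:RDF>"""


def types_by_uri(rdfxml):
    """Map each rdf:about URI to the set of type URIs asserted for it."""
    types = defaultdict(set)
    for elem in ET.fromstring(rdfxml).iter():
        uri = elem.get("{%s}about" % RDF)
        if uri is None:
            continue
        # A typed node element such as <bf:Hub> asserts its own tag as a type.
        if elem.tag != "{%s}Description" % RDF:
            types[uri].add(elem.tag[1:].replace("}", "", 1))
        # Explicit <rdf:type rdf:resource="..."/> children add further types.
        for type_elem in elem.findall("{%s}type" % RDF):
            resource = type_elem.get("{%s}resource" % RDF)
            if resource:
                types[uri].add(resource)
    return types


def conflicting_uris(rdfxml):
    """Return URIs typed both as an agent and as a Hub or ComplexSubject."""
    return sorted(
        uri
        for uri, found in types_by_uri(rdfxml).items()
        if found & AGENT_TYPES and found & NON_AGENT_TYPES
    )


print(conflicting_uris(SAMPLE))
```

Run over the merged output of several records, this lists the URIs that would show up with conflicting types after loading.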

ntra00 commented 2 years ago

Part of the problem is the $t making it seem like an analytic, not a name, but the conversion should not create a hub unless the ind2 is correct. We will look.
On the issue of $0 in 600 $a$x$y or similar: do you have multiple $0s or just one? If you have multiple, are you pairing them so that each can be associated with the correct node, i.e., 600 $a $0 $x $0 $y $0 etc.?
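
To make the pairing question concrete, here is a minimal sketch of positional pairing: each $0 (or $1) is attached to the nearest preceding content subfield. The function name, the `IDENTIFIER_CODES` set, and the 600 values with `example.org` URIs are hypothetical and only for illustration. Whether the converter pairs identifiers this way is not established in this thread.

```python
# Assumption: $0 and $1 carry identifiers; every other code is content.
IDENTIFIER_CODES = {"0", "1"}


def pair_identifiers(subfields):
    """Attach each identifier subfield to the content subfield before it.

    `subfields` is an ordered list of (code, value) tuples from one field.
    """
    nodes = []
    for code, value in subfields:
        if code in IDENTIFIER_CODES:
            # An identifier with no preceding content subfield is dropped.
            if nodes:
                nodes[-1]["ids"].append((code, value))
        else:
            nodes.append({"code": code, "value": value, "ids": []})
    return nodes


# Hypothetical 600 with one $0 per component (values invented).
HYPOTHETICAL_600 = [
    ("a", "Lee, Dick,"),
    ("0", "http://example.org/auth/name-1"),
    ("x", "Criticism and interpretation."),
    ("0", "http://example.org/auth/topic-1"),
]

# The 700 from this issue, in its original subfield order.
FIELD_700 = [
    ("a", "Lee, Dick,"),
    ("d", "1956-"),
    ("t", "Our Singapore."),
    ("0", "http://id.loc.gov/authorities/names/no2004094953"),
    ("1", "http://viaf.org/viaf/267475219"),
]

for node in pair_identifiers(HYPOTHETICAL_600) + pair_identifiers(FIELD_700):
    print(node["code"], node["value"], node["ids"])
```

Under this purely positional rule, the single $0 in the 700 above pairs with $t rather than $a, which shows why the order and number of $0 subfields matter.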

RichardWallis commented 1 year ago

> Part of the problem is the $t making it seem like an analytic, not a name, but the conversion should not create a hub unless the ind2 is correct. We will look.

Great - look forward to the results of the look.

kefo commented 8 months ago

If this is still an issue, can we see some sample records please?