scriptotek / mc2skos

Command line script for converting Marc21 Classification and Authority records to SKOS/RDF
The Unlicense

Missing broader relations from GND #62

Open nichtich opened 6 years ago

nichtich commented 6 years ago

The GND record https://d-nb.info/040034232/about/marcxml contains a field 550 in which $0 and $4 are repeated, and the repetitions are interpreted differently:

    <datafield tag="550" ind1=" " ind2=" ">
      <subfield code="0">(DE-101)965844773</subfield>
      <subfield code="0">(DE-588)4711780-1</subfield>
      <subfield code="0">http://d-nb.info/gnd/4711780-1</subfield>
      <subfield code="a">Moderne Physik</subfield>
      <subfield code="4">obal</subfield>
      <subfield code="4">http://d-nb.info/standards/elementset/gnd#broaderTermGeneral</subfield>
      <subfield code="w">r</subfield>
      <subfield code="i">Oberbegriff allgemein</subfield>
    </datafield>

In mc2skos 0.11.0 this field is ignored, so the broader relation is missing. The documentation says "$4 must precede $0 (since both subfields can be repeated)", but this does not hold here: in the record above, $0 precedes $4. I found a related bug in the code, but fixing it only helps if URIs are preferred when subfield $0 (and/or $4) is repeated. The following works (subfields to be ignored are commented out):

    <datafield tag="550" ind1=" " ind2=" ">
      <!--subfield code="0">(DE-101)965844773</subfield-->
      <!--subfield code="0">(DE-588)4711780-1</subfield-->
      <subfield code="0">http://d-nb.info/gnd/4711780-1</subfield>
      <subfield code="a">Moderne Physik</subfield>
      <!--subfield code="4">obal</subfield-->
      <subfield code="4">http://d-nb.info/standards/elementset/gnd#broaderTermGeneral</subfield>
      <subfield code="w">r</subfield>
      <subfield code="i">Oberbegriff allgemein</subfield>
    </datafield>
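
The "prefer URIs" rule could be sketched roughly as follows. This is only an illustration, not the mc2skos code: the helper `prefer_uri` is hypothetical, and the field is parsed without the MARC21 slim namespace that the real MARCXML record declares.

```python
import xml.etree.ElementTree as ET

# Field 550 from the GND record, copied without the MARCXML namespace for brevity.
FIELD_550 = """
<datafield tag="550" ind1=" " ind2=" ">
  <subfield code="0">(DE-101)965844773</subfield>
  <subfield code="0">(DE-588)4711780-1</subfield>
  <subfield code="0">http://d-nb.info/gnd/4711780-1</subfield>
  <subfield code="a">Moderne Physik</subfield>
  <subfield code="4">obal</subfield>
  <subfield code="4">http://d-nb.info/standards/elementset/gnd#broaderTermGeneral</subfield>
  <subfield code="w">r</subfield>
  <subfield code="i">Oberbegriff allgemein</subfield>
</datafield>
"""


def prefer_uri(field, code):
    """Hypothetical helper: pick one value from a repeated subfield.

    Returns the first value that looks like an HTTP(S) URI; if there is none,
    falls back to the first value, or None when the subfield is absent.
    """
    values = [
        (sf.text or "").strip()
        for sf in field.findall("subfield")
        if sf.get("code") == code
    ]
    for value in values:
        if value.startswith(("http://", "https://")):
            return value
    return values[0] if values else None


field = ET.fromstring(FIELD_550)
target = prefer_uri(field, "0")    # URI of the related concept
relation = prefer_uri(field, "4")  # URI of the relation type
print(target)
print(relation)
```

With this selection the order of $0 and $4 no longer matters, since each subfield code is resolved independently of its position in the field.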