scriptotek / mc2skos

Command line script for converting Marc21 Classification and Authority records to SKOS/RDF
The Unlicense

Handling of 072 #30

Open danmichaelo opened 7 years ago

danmichaelo commented 7 years ago

The NAL thesaurus uses field 072 (Subject Category Code), which does not seem to be mapped by the marcauth-2-madsrdf tool. In NAL's own conversion, the 072 values are converted to concept schemes:

<skos:ConceptScheme rdf:about="http://lod.nal.usda.gov/nalt//P">
    <rdfs:label xml:lang="en">Natural Resources, Earth and Environmental Sciences</rdfs:label>
    <rdfs:label xml:lang="es">Tierra, Ambiente y Recursos Naturales</rdfs:label>
    <skos:hasTopConcept rdf:resource="http://lod.nal.usda.gov/nalt//1556"/>
         ...
</skos:ConceptScheme>

but I wonder if skos:Collection is more appropriate.
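For comparison, here is a minimal sketch of how the same category might be serialized as a `skos:Collection`. It only illustrates the alternative modelling and is not what NAL or mc2skos produces. Reusing the concept scheme URI for the collection, and treating concept `1556` as a `skos:member`, are assumptions made for the example; `build_collection` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Use readable prefixes instead of the auto-generated ns0, ns1, ...
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("skos", SKOS)):
    ET.register_namespace(prefix, uri)


def build_collection(uri, labels, member_uris):
    """Hypothetical helper: build a skos:Collection element.

    labels maps a language tag to the label text;
    member_uris lists the concepts that belong to the collection.
    """
    coll = ET.Element(f"{{{SKOS}}}Collection", {f"{{{RDF}}}about": uri})
    for lang, text in labels.items():
        label = ET.SubElement(coll, f"{{{RDFS}}}label", {f"{{{XML_NS}}}lang": lang})
        label.text = text
    for member in member_uris:
        ET.SubElement(coll, f"{{{SKOS}}}member", {f"{{{RDF}}}resource": member})
    return coll


# Assumption: the URIs are copied from the concept scheme example above.
collection = build_collection(
    "http://lod.nal.usda.gov/nalt//P",
    {
        "en": "Natural Resources, Earth and Environmental Sciences",
        "es": "Tierra, Ambiente y Recursos Naturales",
    },
    ["http://lod.nal.usda.gov/nalt//1556"],
)
print(ET.tostring(collection, encoding="unicode"))
```

The main difference from the concept scheme version is that membership is stated on the collection with `skos:member`, rather than with `skos:hasTopConcept` on the scheme.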

In the MARCXML, a member of the collection "P Natural Resources, Earth and Environmental Sciences" looks like this:

 <marc:record>
      <marc:leader>00664nz  a2200205n  4500</marc:leader>
      <marc:controlfield tag="003">DNAL</marc:controlfield>
      <marc:controlfield tag="005">20161208094706.0</marc:controlfield>
      <marc:controlfield tag="008">161208 neazdnnbabn           a ana      </marc:controlfield>
      <marc:datafield tag="016" ind1="7" ind2=" ">
         <marc:subfield code="a">nalt00276029</marc:subfield>
         <marc:subfield code="2">DNAL</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="035" ind1=" " ind2=" ">
         <marc:subfield code="a">(DNAL) nalt00276029</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="040" ind1=" " ind2=" ">
         <marc:subfield code="a">DNAL</marc:subfield>
         <marc:subfield code="c">DNAL</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="072" ind1=" " ind2=" ">
         <marc:subfield code="a">P Natural Resources, Earth and Environmental Sciences</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="150" ind1=" " ind2=" ">
         <marc:subfield code="a">necromass</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="550" ind1=" " ind2=" ">
         <marc:subfield code="a">biological resources</marc:subfield>
         <marc:subfield code="w">g</marc:subfield>
      </marc:datafield>
      ...
      <marc:datafield tag="750" ind1=" " ind2="7">
         <marc:subfield code="a">necromasa</marc:subfield>
         <marc:subfield code="0">tesa00276029</marc:subfield>
         <marc:subfield code="2">TESA</marc:subfield>
      </marc:datafield>
   </marc:record>

and the collection itself:

 <marc:record>
      <marc:leader>01004nz  a2200325n  4500</marc:leader>
      <marc:controlfield tag="001">54870</marc:controlfield>
      <marc:controlfield tag="003">DNAL</marc:controlfield>
      <marc:controlfield tag="005">20161208094706.0</marc:controlfield>
      <marc:controlfield tag="008">161208 neazdnnbabn           a ana      </marc:controlfield>
      <marc:datafield tag="016" ind1="7" ind2=" ">
         <marc:subfield code="a">nalt00127305</marc:subfield>
         <marc:subfield code="2">DNAL</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="035" ind1=" " ind2=" ">
         <marc:subfield code="a">(DNAL) nalt00127305</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="040" ind1=" " ind2=" ">
         <marc:subfield code="a">DNAL</marc:subfield>
         <marc:subfield code="c">DNAL</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="072" ind1=" " ind2=" ">
         <marc:subfield code="a">P Natural Resources, Earth and Environmental Sciences</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="150" ind1=" " ind2=" ">
         <marc:subfield code="a">Natural Resources, Earth and Environmental Sciences</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="550" ind1=" " ind2=" ">
         <marc:subfield code="a">atmospheric sciences</marc:subfield>
         <marc:subfield code="w">h</marc:subfield>
      </marc:datafield>
      ...
      <marc:datafield tag="750" ind1=" " ind2="7">
         <marc:subfield code="a">Tierra, Ambiente y Recursos Naturales</marc:subfield>
         <marc:subfield code="0">tesa00127305</marc:subfield>
         <marc:subfield code="2">TESA</marc:subfield>
      </marc:datafield>
   </marc:record>

But how do we identify the latter record as the collection itself rather than as one of its members? Both records carry the same 072 value.
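One possible heuristic, sketched below: treat a record as the category record when its 150 $a heading equals the label part of its own 072 $a (the text after the leading code). This holds for the two records shown above, but whether it holds across the whole NAL dataset is an untested assumption. The helper names are hypothetical, and the sketch assumes the `marc:` prefix is bound to the standard MARCXML namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: the marc: prefix is bound to the standard MARCXML namespace.
MARC = {"marc": "http://www.loc.gov/MARC21/slim"}


def subfield_a(record, tag):
    """Return the stripped text of the first $a subfield of a datafield, or None."""
    node = record.find(f"marc:datafield[@tag='{tag}']/marc:subfield[@code='a']", MARC)
    return node.text.strip() if node is not None and node.text else None


def split_category(value):
    """Split a 072 $a value such as 'P Some Label' into (code, label)."""
    code, _, label = value.partition(" ")
    return code, label


def is_category_record(record):
    """Heuristic: the 150 heading matches the label part of the record's own 072."""
    category = subfield_a(record, "072")
    heading = subfield_a(record, "150")
    if category is None or heading is None:
        return False
    _, label = split_category(category)
    return heading == label


# Trimmed copies of the two records above, keeping only fields 072 and 150.
SAMPLE = """<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:record>
    <marc:datafield tag="072" ind1=" " ind2=" ">
      <marc:subfield code="a">P Natural Resources, Earth and Environmental Sciences</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="150" ind1=" " ind2=" ">
      <marc:subfield code="a">necromass</marc:subfield>
    </marc:datafield>
  </marc:record>
  <marc:record>
    <marc:datafield tag="072" ind1=" " ind2=" ">
      <marc:subfield code="a">P Natural Resources, Earth and Environmental Sciences</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="150" ind1=" " ind2=" ">
      <marc:subfield code="a">Natural Resources, Earth and Environmental Sciences</marc:subfield>
    </marc:datafield>
  </marc:record>
</marc:collection>"""

records = ET.fromstring(SAMPLE).findall("marc:record", MARC)
flags = [is_category_record(record) for record in records]
print(flags)
```

A string comparison like this is fragile: it breaks if the 072 label and the 150 heading ever differ in spelling or punctuation, so an explicit list of category codes might be a safer alternative.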

danmichaelo commented 6 years ago

Some of the Finto vocabularies also use 072: https://github.com/NatLibFi/Finto-data/blob/5789359f871f9edb269963e4c1f809e7b7f46bf7/tools/oai-pmh-to-skos/oai-pmh-to-skos.py#L202-L207