scriptotek / mc2skos

Command line script for converting Marc21 Classification and Authority records to SKOS/RDF
The Unlicense

Map 084 to concept scheme and namespace #5

Closed by danmichaelo 8 years ago

danmichaelo commented 8 years ago

Field 084 gives the classification scheme and edition:

  <mx:datafield tag="084" ind2=" " ind1="0">
    <mx:subfield code="a">ddc</mx:subfield>
    <mx:subfield code="c">23no</mx:subfield>
    <mx:subfield code="e">nob</mx:subfield>
  </mx:datafield>
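
A minimal sketch of reading those subfields with Python's standard library. It assumes the `mx` prefix is bound to the MARC21 slim namespace (the snippet above does not show the declaration) and wraps the field in a hypothetical `mx:record` element. The function name `read_084` is illustrative and not part of mc2skos.

```python
import xml.etree.ElementTree as ET

# Assumption: "mx" is bound to the MARC21 slim namespace.
NS = {"mx": "http://www.loc.gov/MARC21/slim"}

# The 084 field from the issue, wrapped in a record element for parsing.
RECORD = """
<mx:record xmlns:mx="http://www.loc.gov/MARC21/slim">
  <mx:datafield tag="084" ind2=" " ind1="0">
    <mx:subfield code="a">ddc</mx:subfield>
    <mx:subfield code="c">23no</mx:subfield>
    <mx:subfield code="e">nob</mx:subfield>
  </mx:datafield>
</mx:record>
""".strip()


def read_084(record_xml):
    """Return the subfields of the first 084 field as a dict keyed by code."""
    root = ET.fromstring(record_xml)
    field = root.find("mx:datafield[@tag='084']", NS)
    if field is None:
        return {}
    return {sf.get("code"): sf.text for sf in field.findall("mx:subfield", NS)}


print(read_084(RECORD))  # {'a': 'ddc', 'c': '23no', 'e': 'nob'}
```

Here `$a` carries the scheme code and `$c` the edition, which are the two values a scheme lookup would need.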

We could perhaps have a config file that contains a map from the 084 values to namespace, scheme, etc.:

{
    "classification_schemes":
    {
        "ddc": {
            "23no": {
                "uri": "http://data.ub.uio.no/ddc/{class_no}",
                "scheme": "http://data.ub.uio.no/ddc/",
                "sameas": ["http://dewey.info/class/{class_no}/e23/"]
            }
        }
    }
}

while still allowing command line arguments to override the config file values.
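
One way the lookup and override could work, as a rough sketch only. The function `resolve_scheme`, the `overrides` argument, the class number `641.5` and the `example.org` URL are all hypothetical; the real option handling may look quite different.

```python
import json

# The proposed config, loaded from a string here to keep the sketch self-contained.
CONFIG = json.loads("""
{
    "classification_schemes": {
        "ddc": {
            "23no": {
                "uri": "http://data.ub.uio.no/ddc/{class_no}",
                "scheme": "http://data.ub.uio.no/ddc/",
                "sameas": ["http://dewey.info/class/{class_no}/e23/"]
            }
        }
    }
}
""")


def resolve_scheme(config, scheme_code, edition, overrides=None):
    """Look up settings for an 084 $a/$c pair; non-None overrides win."""
    schemes = config["classification_schemes"]
    # Copy so that overrides never mutate the loaded config.
    settings = dict(schemes.get(scheme_code, {}).get(edition, {}))
    for key, value in (overrides or {}).items():
        if value is not None:
            settings[key] = value
    return settings


# Hypothetical override, as if passed on the command line.
settings = resolve_scheme(
    CONFIG, "ddc", "23no", overrides={"scheme": "http://example.org/scheme/"}
)
print(settings["uri"].format(class_no="641.5"))  # http://data.ub.uio.no/ddc/641.5
print(settings["scheme"])                        # http://example.org/scheme/
```

Skipping `None` values means unset command line options can be passed through as-is without clobbering the config file.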

nichtich commented 8 years ago

The given example should be hard-coded to map skos_scheme to http://dewey.info/scheme/edition/e23/ unless overridden with the option skos_scheme. The other examples (ddc21en-*.xml and ddc23de-*) map to http://dewey.info/scheme/edition/e21/ and http://dewey.info/scheme/e23/ as well. This only affects skos:inScheme and skos:topConceptOf.

By the way, could you add a sample record of Norwegian DDC to the examples directory?

danmichaelo commented 8 years ago

Sample record added in 514f0fd211db8e51a4ba5fc6ef4c0d11b9dbd952

It would be nice to also have an example of a classification scheme other than Dewey.

Anyway, my configuration template is too simple to support table schemes (skos:inScheme <http://dewey.info/table/2/>), as mentioned in #2.

danmichaelo commented 8 years ago

I've begun refactoring the code to make it somewhat easier to maintain. Since there are many editions of DDC, it seems easier to have one template for all editions. Using the terminology from the Mitchell and Panzer article, I now have:

default_uri_templates = {
    "ddc": {
        "uri": "http://dewey.info/{collection}/{object}/e{edition}/"
    }
}

that will be used for all records with 084 $a = "ddc"
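
For illustration, such a template can be expanded with `str.format`. This is a sketch only: the helper `build_uri` and the sample values are hypothetical, and how an edition value like `23no` from 084 `$c` would be reduced to `23` is not covered here.

```python
default_uri_templates = {
    "ddc": {
        "uri": "http://dewey.info/{collection}/{object}/e{edition}/"
    }
}


def build_uri(scheme_code, collection, obj, edition):
    """Fill the URI template registered for the given 084 $a scheme code."""
    template = default_uri_templates[scheme_code]["uri"]
    return template.format(collection=collection, object=obj, edition=edition)


# Hypothetical values: class number 641.5 in edition 23.
print(build_uri("ddc", "class", "641.5", "23"))  # http://dewey.info/class/641.5/e23/
# The table scheme mentioned earlier fits the same pattern.
print(build_uri("ddc", "table", "2", "23"))      # http://dewey.info/table/2/e23/
```

A single template per scheme keeps the edition as a plain parameter, so new editions need no new configuration entries.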

danmichaelo commented 8 years ago

Added in https://github.com/scriptotek/mc2skos/commit/84513153e7961effc183464441f7ef67bccff8aa