scriptotek / mc2skos

Command line script for converting Marc21 Classification and Authority records to SKOS/RDF
The Unlicense

Add consistency check and inference with skosify #44

Open nichtich opened 6 years ago

nichtich commented 6 years ago

I am about to refactor Skosify to support use as a module. We could integrate some of its functionality into mc2skos, for instance to make sure that links have counterparts (related in both directions, broader/narrower, ...).
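To illustrate the kind of check meant here, a minimal sketch follows. It is not Skosify's implementation: plain `(subject, predicate, object)` tuples stand in for an RDF graph, and the `missing_counterparts` helper and the `ex:` identifiers are hypothetical names chosen for the example.

```python
# Sketch only: tuples stand in for RDF triples; this is not Skosify code.
# Each SKOS link type is mapped to the property expected in the opposite direction.
INVERSE = {
    "skos:broader": "skos:narrower",
    "skos:narrower": "skos:broader",
    "skos:related": "skos:related",  # symmetric: should exist in both directions
}


def missing_counterparts(triples):
    """Return the inverse links implied by `triples` but not present (hypothetical helper)."""
    present = set(triples)
    missing = set()
    for subject, predicate, obj in present:
        inverse = INVERSE.get(predicate)
        if inverse is not None and (obj, inverse, subject) not in present:
            missing.add((obj, inverse, subject))
    return missing


# Made-up sample data: "related" is already stated both ways, "broader" is one-sided.
triples = [
    ("ex:cats", "skos:broader", "ex:mammals"),
    ("ex:cats", "skos:related", "ex:dogs"),
    ("ex:dogs", "skos:related", "ex:cats"),
]

for triple in sorted(missing_counterparts(triples)):
    print(triple)
```

The returned set can either be reported as a consistency problem or added to the data as inferred links; which of the two a tool does is a design choice this sketch leaves open.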

danmichaelo commented 6 years ago

Cool, I'm already using it as a package in one of my scripts: https://github.com/scriptotek/data_ub_tasks/blob/master/data_ub_tasks/data_ub_tasks.py#L211-L215

Of course one could just pipe data from mc2skos to skosify, but it takes quite a bit of time to serialize and deserialize large RDF files, so I'm open to adding e.g. a skosify consistency check within mc2skos.

nichtich commented 6 years ago

Skosify has many options that aren't needed, so I would not support all of them. The following seem most useful in my opinion (see also https://seco.cs.aalto.fi/publications/2014/suominen-mader-skosquality.pdf):

I'd like to enable these most common checks with one or two options (e.g. --expand and --quality) instead of requiring a config file, so the choice must be opinionated. An additional option (--skosify configfile) could give access to all of Skosify's features.
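A rough sketch of how these three options could be declared with `argparse`. This is not the actual mc2skos code: the `build_parser` helper, the program name, and the help texts are assumptions. Only the flag names come from the proposal above.

```python
import argparse


def build_parser():
    """Hypothetical parser; only the three flag names are taken from the proposal."""
    parser = argparse.ArgumentParser(prog="mc2skos-sketch")
    # Assumed meaning: switch on an opinionated set of inferences.
    parser.add_argument("--expand", action="store_true",
                        help="apply a fixed set of inferences (assumed meaning)")
    # Assumed meaning: switch on an opinionated set of quality checks.
    parser.add_argument("--quality", action="store_true",
                        help="run a fixed set of quality checks (assumed meaning)")
    # Escape hatch: pass a config file for everything the two switches do not cover.
    parser.add_argument("--skosify", metavar="CONFIGFILE", default=None,
                        help="Skosify configuration file")
    return parser


# Example invocation with a made-up config file name.
args = build_parser().parse_args(["--expand", "--skosify", "skosify.cfg"])
print(args.expand, args.quality, args.skosify)
```

Keeping --expand and --quality as plain switches means the common case needs no config file, while --skosify stays available for anything beyond the opinionated defaults.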

danmichaelo commented 6 years ago

👍 for the two options!

When it comes to supporting a config file, it would be good if the same format could also be used by skosify directly.

nichtich commented 6 years ago

First part implemented in #45. Packaging Python code and its dependencies drives me nuts, but I managed to do it.

P.S.: I also added the --skosify option.

nichtich commented 6 years ago

I'm not going to implement the --quality option soon because it requires https://github.com/NatLibFi/Skosify/issues/52 and can also be done with option --skosify to some degree. You can close this issue after merge.