Open nichtich opened 7 years ago
Not that I'm aware of. Neither 100 in authority nor 153 in classification are repeatable. There was a discussion paper in 2001 on Multilingual Authority Records recommending separate records for each language. Interestingly it mentions something called "context markers" which could perhaps also be used to indicate language in a single-record approach, but I'm not sure what happened to that idea. There is mention of a follow-up paper to be prepared "for the midwinter 2002 meeting", but I haven't been able to find that (should have been here I guess).
I've seen model A in use as well. I think GND includes English terms in 4XX fields, but without any language marker, which is not ideal. We had to prepare a similar file to get our English terms searchable in Primo though. Not sure what the equivalent of 4XX would be in MARC 21 classification.
Merging could be a feature. Not sure if it needs to be part of mc2skos though, or if we can rely on some other RDF tool like riot? If the URIs are based on the classification number or some other common identifier, it should be easy enough to merge the RDF files afterwards, shouldn't it?
Thanks for the background and history. So to create multilingual KOS from MARC, multiple MARC files have to be converted and merged. Merging is easy in RDF but making sure that all input files align could cause problems. It may be more reliable to have one master file and additional translation files. The latter should only be used for string properties (skos:prefLabel, skos:altLabel, skos:scopeNote, skos:editorialNote, skos:historyNote). My use case is to help get English translations into the RVK classification.
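A minimal sketch of what restricting a translation file to string properties could mean, assuming the converted data is already available as plain (subject, predicate, object, language) tuples. The tuple layout, the example URIs and the helper name `strings_only` are made up for illustration and are not part of mc2skos; real data would be parsed with an RDF library.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# The string properties listed above, as full SKOS URIs.
STRING_PROPERTIES = {
    SKOS + name
    for name in ("prefLabel", "altLabel", "scopeNote", "editorialNote", "historyNote")
}


def strings_only(triples):
    """Keep only triples whose predicate is one of the string properties."""
    return [t for t in triples if t[1] in STRING_PROPERTIES]


# Hypothetical translation data: (subject, predicate, object, language tag or None).
translation = [
    ("http://example.org/class/A", SKOS + "prefLabel", "General works", "en"),
    ("http://example.org/class/A", SKOS + "broader", "http://example.org/class/root", None),
    ("http://example.org/class/A", SKOS + "scopeNote", "Includes encyclopedias", "en"),
]

filtered = strings_only(translation)
for subject, predicate, value, lang in filtered:
    print(subject, predicate.replace(SKOS, "skos:"), repr(value), lang)
```

Structural statements such as skos:broader are dropped here, so the hierarchy would come from the master file only.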
I think a good solution would be an option to only include string properties and a tool/guideline to merge KOS files.
$ mc2skos master.xml master.ttl
$ mc2skos --stringsOnly translation.xml translation.ttl
$ merge master.ttl translation.ttl > multilingual.ttl
Here merge can be replaced by cat for RDF/Turtle ([nd]json needs another mechanism), but some additional checking would be better to make sure that the translation does not add any concepts not included in the master. In any case, this checking would be better placed in another tool, e.g. skosify.
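The check described above could look roughly like this. It is a sketch under assumptions, not existing skosify or mc2skos functionality: triples are plain tuples, the URIs and labels are invented, and `merge_translation` is a hypothetical helper. It accepts only translation triples whose subject already occurs in the master and reports the rest.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"


def merge_translation(master, translation):
    """Merge translation triples into master, rejecting unknown subjects.

    Returns (merged, rejected): merged is a set of triples, rejected is the
    sorted list of translation subjects that do not occur in the master.
    """
    known = {subject for subject, _, _ in master}
    merged = set(master)
    rejected = set()
    for triple in translation:
        if triple[0] in known:
            merged.add(triple)
        else:
            rejected.add(triple[0])
    return merged, sorted(rejected)


# Invented example data: (subject, predicate, object), labels as (text, lang).
master = [
    ("http://example.org/class/A", SKOS + "prefLabel", ("Allgemeines", "de")),
    ("http://example.org/class/B", SKOS + "prefLabel", ("Philosophie", "de")),
]
translation = [
    ("http://example.org/class/A", SKOS + "prefLabel", ("General works", "en")),
    ("http://example.org/class/X", SKOS + "prefLabel", ("Unknown class", "en")),
]

merged, rejected = merge_translation(master, translation)
print(len(merged), "triples after merging")
print("not in master:", rejected)
```

Because the merge is a set union keyed on shared URIs, it only works if master and translation derive their URIs from the same identifier, such as the classification number.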
Is there a way to put labels in multiple languages into one MARC record, e.g. repeat field 153? If not, should mc2skos provide a method to compare and merge multiple MARC files of the same classification in different languages?