lcnetdev / marc2bibframe2

Convert MARC records to BIBFRAME2 RDF
http://www.loc.gov/bibframe/
Creative Commons Zero v1.0 Universal

Use subproperties of bf:language in 041 conversion #37

Open wafschneider opened 7 years ago

wafschneider commented 7 years ago

A suggestion from @osma in the discussion of #7:

Actually, I'd prefer a solution where the type of language would be specified using a set of properties, e.g. bf:originalLanguage, bf:summaryLanguage, bf:mainLanguage etc. corresponding to the available MARC 041 subfields - there are 11 of these currently defined. These properties could all be subproperties of a generic bf:language property. Then the values themselves could be language entities from the LC languages vocabulary, as they are already. There would be no need to assert any additional properties for the language entities, so no need for any sort of indirection.
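To make the proposal concrete, here is a minimal sketch in Python that uses plain tuples instead of an RDF library. The three property names are the examples given above and are not existing BIBFRAME terms. The subfield mapping, the `ex:work1` identifier, and the helper names `convert_041` and `objects_of` are assumptions made for illustration only, and are not taken from the converter.

```python
# Hypothetical mapping from MARC 041 subfield codes to the proposed
# properties. Only three subfields are shown; the property names are the
# examples from the comment above, not existing BIBFRAME terms.
SUBFIELD_TO_PROPERTY = {
    "a": "bf:mainLanguage",      # language of the text
    "b": "bf:summaryLanguage",   # language of the summary or abstract
    "h": "bf:originalLanguage",  # language of the original
}

# Every proposed property is declared a subproperty of bf:language.
SUBPROPERTY_OF = {p: "bf:language" for p in SUBFIELD_TO_PROPERTY.values()}

LANG_BASE = "http://id.loc.gov/vocabulary/languages/"


def convert_041(work, subfields):
    """Turn (code, value) pairs from one 041 field into triples."""
    triples = []
    for code, value in subfields:
        prop = SUBFIELD_TO_PROPERTY.get(code)
        if prop is not None:  # subfields outside the sketch are skipped
            triples.append((work, prop, LANG_BASE + value))
    return triples


def objects_of(triples, subject, prop, subproperty_of):
    """Values of prop for subject, including its direct subproperties."""
    matching = {prop} | {p for p, parent in subproperty_of.items() if parent == prop}
    return [o for s, p, o in triples if s == subject and p in matching]


triples = convert_041("ex:work1", [("a", "eng"), ("h", "fre")])
```

With these definitions, `objects_of(triples, "ex:work1", "bf:language", SUBPROPERTY_OF)` returns both language URIs, while asking for `"bf:originalLanguage"` returns only the French one. This mirrors what an RDFS-aware store would do with rdfs:subPropertyOf: the generic property stays usable, and no intermediate node is needed.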

As a variation of this, the original language could (also?) be asserted as the main language for the work entity that represents the original work.

Here are some further arguments for the subproperty pattern I proposed above:

  1. It's simpler and reduces the number of triples and entities on the instance level, without losing any information.
  2. It's easier to query using SPARQL. For example, getting the summary language of a work is simply { ?work bf:summaryLanguage ?sl }, whereas with indirection it would be something like { ?work bf:language ?l . ?l bf:part "summary" ; bf:identifiedBy ?sl }. The simpler query would most likely be faster too.
  3. It's more semantic - the distinction between different types of languages is encoded using RDF properties instead of literals.
  4. It's language-agnostic and thus more suitable for internationalization. Labels can be defined for the new properties (like any properties in BIBFRAME) in any language using rdfs:label assertions, whereas the English-language strings "original", "summary" etc. cannot easily be internationalized.
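Points 1 and 2 can be illustrated with a small Python sketch that models triples as plain tuples. The identifiers `ex:work1` and `_:l1` are made up for the example, and the indirect shape simply follows the query quoted in point 2. It is not meant as an exact copy of the converter's output.

```python
LANG = "http://id.loc.gov/vocabulary/languages/"

# Pattern A: one proposed subproperty per language role.
direct = [
    ("ex:work1", "bf:summaryLanguage", LANG + "eng"),
]

# Pattern B: indirection through an intermediate node, following the
# query shape quoted in point 2.
indirect = [
    ("ex:work1", "bf:language", "_:l1"),
    ("_:l1", "bf:part", "summary"),
    ("_:l1", "bf:identifiedBy", LANG + "eng"),
]


def summary_language_direct(triples, work):
    # Equivalent of { ?work bf:summaryLanguage ?sl }: a single pattern.
    return [o for s, p, o in triples if s == work and p == "bf:summaryLanguage"]


def summary_language_indirect(triples, work):
    # Equivalent of { ?work bf:language ?l . ?l bf:part "summary" ;
    # bf:identifiedBy ?sl }: three patterns joined on ?l.
    nodes = [o for s, p, o in triples if s == work and p == "bf:language"]
    results = []
    for node in nodes:
        if (node, "bf:part", "summary") in triples:
            results.extend(
                o for s, p, o in triples if s == node and p == "bf:identifiedBy"
            )
    return results
```

Both functions return the same language URI for `ex:work1`, but the direct pattern stores one triple where the indirect pattern stores three, and its lookup needs no join on the intermediate node.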

It occurs to me that these subproperties could be defined either in the BIBFRAME ontology proper, or perhaps more appropriately in bflc.