lcnetdev / marc2bibframe2

Convert MARC records to BIBFRAME2 RDF
http://www.loc.gov/bibframe/
Creative Commons Zero v1.0 Universal

bf:VariantTitle domain of bf:variantType (not bf:Title) #11

Closed dazza-codes closed 7 years ago

dazza-codes commented 7 years ago

It seems odd that some triples for a bf:Instance contain:

<bf:title>
     <bf:Title>
       <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/VariantTitle"/>
       <bf:variantType>portion</bf:variantType>
       <rdfs:label>Bleupsychiatriqueerotique</rdfs:label>
       <bf:mainTitle>Bleupsychiatriqueerotique</bf:mainTitle>
     </bf:Title>
</bf:title>

(This data might be from an instance with this LCCN:2007357865. It also has OCoLC-M:138280062 and OCoLC-I:275986835)

This is what I assumed it would look like:

<bf:title>
    <bf:VariantTitle>
       <bf:variantType>portion</bf:variantType>
       <bf:mainTitle>Bleupsychiatriqueerotique</bf:mainTitle>
       <rdfs:label>Bleupsychiatriqueerotique</rdfs:label>
    </bf:VariantTitle>
</bf:title>

The bf:variantType datatype property has a domain of bf:VariantTitle. As I understand things, I think this means that this datatype property cannot be asserted on any bf:Title, but only a bf:VariantTitle. (Or any subclass of variant title. As I understand it, a domain restriction cannot propagate up the class hierarchy, that would make a domain restriction useless, but it can propagate down the hierarchy.)

If this is correct, then it is not correct/valid to assert anything like this:

<bf:title>
    <bf:Title>
       <bf:variantType>portion</bf:variantType>
    </bf:Title>
</bf:title>

It must be:

<bf:title>
    <bf:VariantTitle>
       <bf:variantType>portion</bf:variantType>
    </bf:VariantTitle>
</bf:title>

rjyounes commented 7 years ago

Darren, this is legitimate:

<bf:title>
    <bf:Title>
       <bf:variantType>portion</bf:variantType>
    </bf:Title>
</bf:title>

but the domain specification then allows the inference that the Title in question is a VariantTitle. Domain and range restrictions, and ontology axioms generally, do not prevent assertions; they only provide inferences based on those assertions. In any case, note that in the RDF you show at the top the bf:Title is also typed as a bf:VariantTitle, so this is not even an issue in that particular case.

dazza-codes commented 7 years ago

OK, thanks for clarification on the implications for inferences. So I wanted to concretely explore this behavior using a BF2 conversion for some MARC data with a variant title. In case anyone wants to try it, the data has:

    <bf:identifiedBy>
      <bf:Isbn>
        <rdf:value>9788890229145</rdf:value>
        <bf:acquisitionTerms>16.00 EUR</bf:acquisitionTerms>
      </bf:Isbn>
    </bf:identifiedBy>

So, the BF2 conversion contains:

    <bf:title>
      <bf:Title>
        <rdf:type rdf:resource="http://id.loc.gov/ontologies/bibframe/VariantTitle"/>
        <bf:variantType>cover</bf:variantType>
        <rdfs:label>Giorgione con Leonardo e le cpse d'amore</rdfs:label>
        <bflc:titleSortKey>Giorgione con Leonardo e le cpse d'amore</bflc:titleSortKey>
        <bf:mainTitle>Giorgione con Leonardo e le cpse d'amore</bf:mainTitle>
      </bf:Title>
    </bf:title>

So I edited that section of the RDF, to remove the rdf:type, as follows:

    <bf:title>
      <bf:Title>
        <bf:variantType>cover</bf:variantType>
        <rdfs:label>Giorgione con Leonardo e le cpse d'amore</rdfs:label>
        <bflc:titleSortKey>Giorgione con Leonardo e le cpse d'amore</bflc:titleSortKey>
        <bf:mainTitle>Giorgione con Leonardo e le cpse d'amore</bf:mainTitle>
      </bf:Title>
    </bf:title>

I then loaded the data into two triple store graphs, one with inference features enabled and one without. On each graph I ran some SPARQL intended to return the main title data without any variant title data, i.e.

prefix bf: <http://id.loc.gov/ontologies/bibframe/>
prefix mads: <http://www.loc.gov/mads/rdf/v1#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?i ?t ?p ?o
WHERE {
  ?i a bf:Instance ;
       bf:title ?t .
  ?t ?p ?o .
  FILTER NOT EXISTS { ?t a bf:VariantTitle }
}
ORDER BY ?t

For the graph without any inference enabled, the query did not exclude the variant title (because the explicit type declaration was deleted and nothing inferred, from the domain of bf:variantType, that the title is a bf:VariantTitle). For the graph with inference enabled, the query did exclude the variant title data from the result set. When loading the data into these graphs, neither triple store complained about any data/model inconsistencies.
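
The behaviour above can be reproduced without a triple store. What follows is only a minimal sketch in plain Python, not how any particular store implements inference: triples are modelled as tuples, the node names (`_:instance`, `_:main`, `_:variant`) and the title strings are made up for illustration, and the helper functions `apply_domain_rule` and `non_variant_titles` are hypothetical. The only ontology fact it relies on is the one discussed in this issue, that bf:variantType has a domain of bf:VariantTitle; the rule applied is the standard RDFS domain entailment pattern (rdfs2).

```python
# Toy model of the experiment: triples are (subject, predicate, object) tuples.
BF = "http://id.loc.gov/ontologies/bibframe/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"

# Ontology axiom from the discussion: bf:variantType has domain bf:VariantTitle.
ontology = {(BF + "variantType", RDFS_DOMAIN, BF + "VariantTitle")}

# Data after the explicit rdf:type bf:VariantTitle was removed.
# Node names and title strings are hypothetical placeholders.
data = {
    ("_:instance", RDF_TYPE, BF + "Instance"),
    ("_:instance", BF + "title", "_:main"),
    ("_:main", RDF_TYPE, BF + "Title"),
    ("_:main", BF + "mainTitle", "A main title"),
    ("_:instance", BF + "title", "_:variant"),
    ("_:variant", RDF_TYPE, BF + "Title"),
    ("_:variant", BF + "variantType", "cover"),
    ("_:variant", BF + "mainTitle", "A cover title"),
}


def apply_domain_rule(triples, axioms):
    """Return the triples plus every type entailed by rdfs:domain (rule rdfs2).

    If (p rdfs:domain C) and (s p o) both hold, then (s rdf:type C) is added.
    """
    inferred = {
        (s, RDF_TYPE, cls)
        for (s, p, _) in triples
        for (prop, rel, cls) in axioms
        if rel == RDFS_DOMAIN and prop == p
    }
    return set(triples) | inferred


def non_variant_titles(triples):
    """Mimic the query: titles of instances that are not typed bf:VariantTitle."""
    instances = {s for (s, p, o) in triples if p == RDF_TYPE and o == BF + "Instance"}
    titles = {o for (s, p, o) in triples if p == BF + "title" and s in instances}
    variants = {s for (s, p, o) in triples if p == RDF_TYPE and o == BF + "VariantTitle"}
    return titles - variants


without_inference = non_variant_titles(data)
with_inference = non_variant_titles(apply_domain_rule(data, ontology))
```

In this sketch `without_inference` still contains the variant title node, while `with_inference` contains only the main title node. In neither case is the untyped bf:Title rejected as inconsistent, which matches the point that a domain axiom adds inferences rather than blocking assertions.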

I guess I'm satisfied that this is not a conversion error, it's just my mistake in understanding the implications of inference (and/or RDF validation).

dazza-codes commented 7 years ago

For anyone landing here because they are working on filtering BF2 titles, there is additional commentary on a related issue at https://github.com/sul-dlss/sparql_to_sw_solr/issues/16