bbcarchdev / spindle

RES Linked Open Data aggregation engine
https://bbcarchdev.github.io/spindle/
Apache License 2.0

skos:exactMatch is not used to correlate resources #101

Closed townxelliot closed 7 years ago

townxelliot commented 7 years ago

(This is a needs-info type issue rather than a bug.)

rulebase.ttl (https://github.com/bbcarchdev/spindle/blob/develop/twine/rulebase.ttl) has a section which shows the predicates used to create co-references between resources:

owl:sameAs spindle:coref spindle:resourceMatch .
<http://www.bbc.co.uk/ontologies/coreconcepts/sameAs> spindle:coref spindle:resourceMatch .
skos:exactMatch spindle:coref spindle:resourceMatch .
skos08:exactMatch spindle:coref spindle:resourceMatch .

This suggests that resources which have a skos:exactMatch relationship to another resource should be considered for correlation. I read this as meaning that if A owl:sameAs C and B skos:exactMatch C, then A owl:sameAs B. However, this doesn't happen on the current develop branch. (I am aware that the semantics of owl:sameAs and skos:exactMatch differ, but as I understand it, RES should treat them as equivalent.)

Is this the intended reading of the rules in the rulebase? Or will correlation only happen between resources which are linked indirectly via owl:sameAs, e.g. A owl:sameAs C, B owl:sameAs C implies A owl:sameAs B? (This does happen on the current develop branch.)

(https://github.com/bbcarchdev/spindle/blob/develop/twine/correlate/README.md suggests that skos:exactMatch should be used to correlate resources.)
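
To make the two readings concrete, here is a minimal Python sketch of correlation as clustering over a configurable set of matching predicates. It is an illustration only, not Spindle's actual code (Spindle is written in C); the function name correlate and the tuple representation of triples are assumptions made for this example.

```python
# Hypothetical illustration of co-reference clustering; not Spindle's implementation.
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_EXACTMATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Assumed: every predicate declared with spindle:coref in the rulebase lands here.
MATCH_PREDICATES = {OWL_SAMEAS, SKOS_EXACTMATCH}


def correlate(triples, match_predicates=MATCH_PREDICATES):
    """Group resources linked, directly or indirectly, by a matching predicate."""
    parent = {}

    def find(node):
        # Return the representative of node's cluster (with path halving).
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        # Merge the clusters containing a and b.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for subject, predicate, obj in triples:
        if predicate in match_predicates:
            union(subject, obj)

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


triples = [
    ("A", OWL_SAMEAS, "C"),
    ("B", SKOS_EXACTMATCH, "C"),
]
both = correlate(triples)                      # A, B and C share one cluster
sameas_only = correlate(triples, {OWL_SAMEAS})  # B is left out
```

Under the first reading (both predicates match), A, B and C end up in a single cluster. Restricting the matching set to owl:sameAs reproduces the behaviour described above, where B is never correlated.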

townxelliot commented 7 years ago

After some investigation, the fix is to add a rule to the rulebase so that skos:exactMatch isn't stripped. This then allows it to be used for correlation.

nevali commented 7 years ago

Hmm, this looks like it might be a bug in spindle-strip? As skos:exactMatch is referenced explicitly as a matching predicate in the rulebase already, spindle-strip should have the smarts to not strip out those triples before they get as far as being used.

townxelliot commented 7 years ago

Do you want me to revert the fix I made which preserved the skos:exactMatch statements? I assumed that the matching rules were deliberately isolated from the spindle property mappings.

nevali commented 7 years ago

They are isolated, but shouldn't be as far as spindle-strip is concerned — leave the workaround in place, but we'll keep this issue open so that a proper fix can be applied.

rjpwork commented 7 years ago

Given that this seems to be a fairly hefty show-stopper kind of bug (if spindle-strip is not parsing or interpreting its own rulebase correctly), should we now consider the entire database of content suspect?

nevali commented 7 years ago

Ah, I see what's going on.

spindle-strip doesn't handle this properly because it doesn't actually have matching types, so those rules are purposefully not handled when it loads the rulebase (because the matching callback functions don't exist in the module - only in spindle-correlate).

However, their presence in the rulebase should still trigger cacheability, and that's the issue here.
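
The interaction described here can be sketched in a few lines of Python. This is a hedged illustration, not the spindle-strip source: the names predicates_to_keep and strip are hypothetical, and it assumes that "cacheable" simply means a predicate's triples survive stripping.

```python
# Hypothetical sketch of the stripping step; names and data shapes are assumptions.
SKOS_EXACTMATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX_UNMAPPED = "http://example.com/unmapped"  # a predicate no rule mentions


def predicates_to_keep(property_predicates, coref_predicates):
    """Predicates whose triples must survive stripping.

    Matching predicates are included even though the stripping module
    has no matching callbacks of its own.
    """
    return set(property_predicates) | set(coref_predicates)


def strip(triples, keep):
    """Drop every triple whose predicate is not in the keep set."""
    return [t for t in triples if t[1] in keep]


triples = [
    ("B", SKOS_EXACTMATCH, "C"),
    ("B", RDFS_LABEL, "Some label"),
    ("B", EX_UNMAPPED, "ignored"),
]

# Behaviour reported in this issue: only property mappings count, so the
# exactMatch triple is gone before correlation ever sees it.
buggy = strip(triples, predicates_to_keep({RDFS_LABEL}, set()))

# Intended behaviour: coref predicates in the rulebase are kept as well.
fixed = strip(triples, predicates_to_keep({RDFS_LABEL}, {SKOS_EXACTMATCH}))
```

In the buggy case only the label triple remains; in the fixed case the skos:exactMatch triple is preserved for the correlation step, while the unmapped predicate is still stripped.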