SISS / SISSVoc

SISSVoc is a Linked Data API for accessing published vocabularies.
http://www.sissvoc.info/
GNU Lesser General Public License v3.0

concept?labelcontains and concept?anylabel endpoints don't limit to skos:Concepts #14

Open rwalkerands opened 5 years ago

rwalkerands commented 5 years ago

The concept?labelcontains endpoint doesn't limit its results to resources that are skos:Concepts.

The results include resources of any type, if they have an rdfs:label.

Commit df181c6b592adc9d99100cb15bb1cd67181997ae removed a lot of vocabulary-specific configs that have a selector something like this:

; api:selector [
    api:where " ?item a skos:Concept . ?item ?label ?l . FILTER ( ?label = skos:prefLabel || ?label = skos:altLabel ) FILTER regex( str(?l) , ?text , 'i' ) "
]

But the remaining generic config SISSvoc3-ELDAConfig-template.ttl defines the concept?labelcontains endpoint with this selector:

; api:selector [
    api:where
      """ { ?item skos:prefLabel ?l }
        UNION
        { ?item skos:altLabel ?l }
        UNION
        { ?item rdfs:label ?l }
           FILTER regex( str(?l) , ?text , 'i' ) """
]

How come the selector doesn't include ?item a skos:Concept?

The same question applies also to the concept?anylabel endpoint.

jyucsiro commented 5 years ago

Thanks for the question - it's an interesting point. The selector at the moment doesn't constrain to skos:Concept but given that the endpoint is concept?labelcontains or concept?anylabel, maybe it should.

@dr-shorthair - what do you think about changing this so that the API call is constrained to skos:Concept?

dr-shorthair commented 5 years ago

  1. skos:altLabel, skos:prefLabel and skos:hiddenLabel are all sub-properties of rdfs:label, so a suitably configured SPARQL endpoint (i.e. with RDFS entailment enabled) would find them anyway.

  2. Since the global domain of skos:*Label is not specified, they can be used on resources of any type. Most of SKOS is like this - convenience properties with no potentially annoying entailments. (For example, I use skos:scopeNote in lots of places that have nothing else to do with SKOS.)

Thus, while skos:Concept is the most common resource type in a SISSVoc service, it is not strictly required (though obviously all the /Concept and /Collection queries would only work with SKOS).

Since this API call is concept?labelcontains it would make sense to add the additional selector. However, we might want to also add a more general case resource?labelcontains etc without that selector?
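To illustrate point 1 above: where RDFS entailment is not enabled, the sub-property relationship can be followed explicitly in the query. The following is only a sketch, and it assumes that the SKOS schema triples (e.g. skos:prefLabel rdfs:subPropertyOf rdfs:label) are loaded in the queried dataset, that the SPARQL endpoint supports SPARQL 1.1 property paths, and that the pattern passes through api:where unchanged. None of this is taken from the SISSVoc template.

```turtle
# Sketch only: finds labels via any sub-property of rdfs:label
# (zero-or-more path, so rdfs:label itself also matches).
# Assumes the SKOS schema triples are present in the store.
; api:selector [
    api:where
      """ ?p rdfs:subPropertyOf* rdfs:label .
          ?item ?p ?l .
          FILTER regex( str(?l) , ?text , 'i' ) """
]
```

Like the current generic selector, this does not restrict ?item to any particular type.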

rwalkerands commented 5 years ago

We currently encounter this issue in practice, due to the way PoolParty adds metadata for concept schemes. Sample URL:

http://vocabs.ands.org.au/repository/api/lda/abares/australian-land-use-and-management-classification/version-8/concept?labelcontains=Australian

This should return no results, but it returns the ConceptScheme instead. Another example of the same thing:

http://vocabs.ands.org.au/repository/api/lda/ga/protocol-type/v1-2/concept?labelcontains=Protocol

rwalkerands commented 5 years ago

The issue as I see it is:

I have no problem with a solution that includes preserving the current behaviour of these endpoints, as long as they're not called .../concept..., e.g., by renaming the endpoint URLs to say .../resource....

dr-shorthair commented 5 years ago

Yes - I agree - if the API call (i.e. the endpoint URL) contains /concept then the result set should be limited to resources of type skos:Concept. So adding the additional graph-pattern to the selector would be best.

What I'm suggesting is to also add endpoints with /resource in place of /concept with the current selector - i.e. with no resource type specified.
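A rough sketch of what the two endpoint definitions might look like side by side. The endpoint resource names and the /resource URI template are hypothetical, the api:, skos: and rdfs: prefixes are assumed to be declared as in the generic template, and the remaining endpoint properties (viewers, comments and so on) are left out.

```turtle
# Hypothetical names; not copied from SISSvoc3-ELDAConfig-template.ttl.

# /concept variant: same label patterns, plus the skos:Concept constraint.
<#conceptByLabelContains> a api:ListEndpoint
    ; api:uriTemplate "/concept?labelcontains={text}"
    ; api:selector [
        api:where
          """ ?item a skos:Concept .
              { ?item skos:prefLabel ?l }
              UNION
              { ?item skos:altLabel ?l }
              UNION
              { ?item rdfs:label ?l }
              FILTER regex( str(?l) , ?text , 'i' ) """
    ]
    .

# /resource variant: the current selector, with no resource type specified.
<#resourceByLabelContains> a api:ListEndpoint
    ; api:uriTemplate "/resource?labelcontains={text}"
    ; api:selector [
        api:where
          """ { ?item skos:prefLabel ?l }
              UNION
              { ?item skos:altLabel ?l }
              UNION
              { ?item rdfs:label ?l }
              FILTER regex( str(?l) , ?text , 'i' ) """
    ]
    .
```

The type triple is stated once, outside the UNION, rather than being repeated in each branch.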

jyucsiro commented 5 years ago

Something like #15 ?

rwalkerands commented 5 years ago

Something like that.

Do you have a reason for doing it differently from the earlier version? (See the original comment with the query from commit df181c6, but note that that query doesn't include matches against rdfs:label.) At first glance, your version looks up Concepts three times instead of just once. But maybe the query optimizer notices that. (I have no idea if it does hoisting of that sort.)

dr-shorthair commented 5 years ago

Which comment? (I don't see any - is it embedded in one of the TTL files?)

rwalkerands commented 5 years ago

By "original comment" I meant "the first comment in this GitHub issue", i.e., https://github.com/SISS/SISSVoc/issues/14#issue-421992783

dr-shorthair commented 5 years ago

Ah - your concern is not so much /concept vs /resource, but the efficiency of the SPARQL query.

I'm not sure that the one you quote makes a whole lot of sense (can't see how these combos quite work) but indeed, { ... } UNION { ... } type queries are expensive, and FILTER ( ... || ... ) style might be less expensive. That's an implementation detail, but possibly one that's worth attention.
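For comparison, a FILTER ( ... || ... ) style selector that also covers rdfs:label could look roughly like the sketch below. It is adapted from the selector quoted in the first comment, with rdfs:label added; it has not been tested against any particular triple store, and whether it is actually cheaper than the UNION form will depend on the store's query optimizer.

```turtle
# Sketch only: one type lookup, one generic property pattern,
# then a filter on which label property matched.
; api:selector [
    api:where
      """ ?item a skos:Concept .
          ?item ?label ?l .
          FILTER ( ?label = skos:prefLabel || ?label = skos:altLabel || ?label = rdfs:label )
          FILTER regex( str(?l) , ?text , 'i' ) """
]
```

Note that ?item ?label ?l matches every property of each concept before the filter is applied, so on a store that doesn't push the filter down this form may not be an improvement.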