CLARIAH / grlc

grlc builds Web APIs using shared SPARQL queries
http://grlc.io
MIT License

Default value for optional parameter #347

Closed rajaram5 closed 3 years ago

rajaram5 commented 3 years ago

Is it possible to set a default value for an optional parameter? I am trying to create an API where country is an optional parameter. In my SPARQL query, I want to return all results if the request does not provide a value for the optional parameter. But when I make a request like http://myapi.com/search?code=http%3A%2F%2Fwww.orpha.net%2FORDO%2FOrphanet_98056, I see that no value is attached to the ?__country variable and my filter logic fails. Is there a way to attach a default value to the optional parameter? Please see my SPARQL query below. Am I missing anything?

#+ defaults:
#+   - code: http://www.orpha.net/ORDO/Orphanet_98056
#+   - _country: NA

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?country_name ?country_code WHERE {

    ?publisher_location skos:relatedMatch ?wiki_data_uri.
    ?wiki_data_uri rdfs:label ?country_name;
                   wdt:P297 ?country_code.

    FILTER(regex(str(?__country), ?country_code) || str(?__country) = "NA")
}
c-martinez commented 3 years ago

Hi @rajaram5 :wave:,

It looks like you are running into the same issue described in #322. The proposed workaround seems to work for your query:

#+ endpoint: "https://query.wikidata.org/sparql"
#+ defaults:
#+   - country: NA

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?country_name ?country_code WHERE {
    OPTIONAL { ?s ?p ?_country }

    ?wiki_data_uri rdfs:label ?country_name;
                   wdt:P297 ?country_code.

    FILTER(regex(str(?_country), ?country_code) || str(?_country) = "NA")
    FILTER (lang(?country_name) = 'en')
}

Hope this solves your issue?
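
The filter used in these queries can be emulated outside the triple store to see which rows it keeps. The sketch below is only an approximation: Python's re module stands in for SPARQL's regex(), None stands in for an unbound variable, and keep_row is a hypothetical name. In SPARQL, str() of an unbound variable raises an error, an error on both sides of || stays an error, and FILTER then drops the row, which is why nothing comes back when the variable never receives a value. How the variable ends up bound (grlc substitution, the OPTIONAL pattern) is not modeled here.

```python
import re
from typing import Optional


def keep_row(country_param: Optional[str], country_code: str) -> bool:
    """Approximate FILTER(regex(str(?_country), ?country_code) || str(?_country) = "NA").

    country_param: value of the API parameter, or None if the variable is unbound.
    country_code:  the ?country_code value of the row being tested.
    """
    if country_param is None:
        # Unbound variable: both operands of || raise an error in SPARQL,
        # so the FILTER rejects the row.
        return False
    # regex(text, pattern): the parameter is the text, the row's code is the pattern.
    matches_code = re.search(country_code, country_param) is not None
    return matches_code or country_param == "NA"


codes = ["IT", "NL", "DE"]
print([c for c in codes if keep_row(None, c)])   # unbound parameter
print([c for c in codes if keep_row("NA", c)])   # sentinel value
print([c for c in codes if keep_row("IT", c)])   # explicit country
```

With the "NA" sentinel every row passes, with an explicit code only the matching row passes, and with an unbound variable no row passes.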

rajaram5 commented 3 years ago

Hi @c-martinez, thanks for your reply. I tried the above solution and it worked. However, the response time of the API increased a lot, and when I have more than one optional parameter in my API, in some cases I hit a timeout error. I tried the solution below but it didn't work :-(. Any suggestions to improve the response time?

#+ endpoint: "https://query.wikidata.org/sparql"
#+ defaults:
#+   - country: NA

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?country_name ?country_code WHERE {
    {SELECT ?__country {OPTIONAL { ?s ?p ?__country }} LIMIT 1 }

    ?wiki_data_uri rdfs:label ?country_name;
                   wdt:P297 ?country_code.

    FILTER(regex(str(?_country), ?country_code) || str(?_country) = "NA")
    FILTER (lang(?country_name) = 'en')
}
c-martinez commented 3 years ago

Hi @rajaram5! Do you have an example of a query with more than one optional parameter and which times out? I can look into it.

The query you tried may not work because ?__country (double underscore) and ?_country (single underscore) are going to look like two different variables to the query parser. You can replace your inner SELECT clause with this, and then it should work:

{SELECT ?s {OPTIONAL { ?s ?p ?_country }} LIMIT 1 }
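
A quick scan of the query text can catch this kind of mismatch before the query is deployed. The sketch below is not part of grlc: find_underscore_variants is a hypothetical helper that groups SPARQL variables differing only in the number of leading underscores, and SAMPLE_QUERY is a shortened copy of the query from the previous comment.

```python
import re
from collections import defaultdict
from typing import Dict, Set

SAMPLE_QUERY = """
SELECT DISTINCT ?country_name ?country_code WHERE {
    {SELECT ?__country {OPTIONAL { ?s ?p ?__country }} LIMIT 1 }
    ?wiki_data_uri rdfs:label ?country_name;
                   wdt:P297 ?country_code.
    FILTER(regex(str(?_country), ?country_code) || str(?_country) = "NA")
}
"""


def find_underscore_variants(query: str) -> Dict[str, Set[str]]:
    """Return variables that share a base name but differ in leading underscores."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    # Only variables that start with at least one underscore are of interest.
    for name in re.findall(r"\?(_+\w+)", query):
        groups[name.lstrip("_")].add("?" + name)
    # Keep only base names that appear with more than one spelling.
    return {base: names for base, names in groups.items() if len(names) > 1}


print(find_underscore_variants(SAMPLE_QUERY))
```

For the sample query this reports the base name country with both the ?__country and ?_country spellings.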
rajaram5 commented 3 years ago

I get a timeout for the query below when I don't provide values for both optional parameters. If I provide a value for one of the optional parameters, the query works but takes quite some time. I also set up a grlc server for this query. To test the query, please use the values http://www.orpha.net/ORDO/Orphanet_98056, IT, and PatientRegistryDataset for the code, country and resourceType parameters. BTW, both code and country parameters are optional, so the API request is expected to work even if one doesn't provide values for these parameters.

#+ endpoint: http://ejprd.fair-dtls.surf-hosted.nl:7200/repositories/ordo-catalog-fdp
#+ endpoint_in_url: False
#+ description: Get resources for a given ordo URL.
#+ defaults:
#+   - code: http://www.orpha.net/ORDO/Orphanet_98056
#+   - _country: NA
#+   - _resourceType: NA
#+ transform: {
#+     "apiVersion": "v0.2",
#+     "resourceResponses": {"id": "?id", "name": "?title", "type": "?resource_type_label", "description": "?description", "homepage": "?homepage", "publisher": { "id": "Orphan$
#+     "$anchor": "apiVersion"
#+   }

PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX ejp: <http://purl.org/ejp-rd/vocabulary/>
PREFIX dcterm: <http://purl.org/dc/terms/>
PREFIX fdo: <http://rdf.biosemantics.org/ontologies/fdp-o#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?resource ?id ?title ?description ?resource_type_label ?homepage ?country_name ?country_code WHERE {

    # Query external triple store to enrich our input disease list
    SERVICE <http://ejprd.fair-dtls.surf-hosted.nl:7200/repositories/ordo> {
        # GET all the sub diseases of our input disease class
        ?ordo_class_iri  rdfs:subClassOf* ?_code_iri;
                         rdfs:label ?disease_name.
    }

    ?resource a ?type ;
             dcat:theme ?_code_iri;
             dcterm:description ?description;
             dcterm:title ?title;
             dcterm:publisher [dcterm:spatial ?publisher_location];
             dcat:landingPage ?homepage;
             fdo:metadataIdentifier [ dcterm:identifier ?id].

   ?type skos:altLabel ?resource_type_label .

   OPTIONAL {?rs ?rp ?__resourceType}

    FILTER(regex(str(?__resourceType), ?resource_type_label) || str(?__resourceType) = "NA")

    ?publisher_location skos:relatedMatch ?wiki_data_uri.

    ?wiki_data_uri rdfs:label ?country_name;
                   wdt:P297 ?country_code.

    OPTIONAL {?s ?p ?__country}             

    FILTER(regex(str(?__country), ?country_code) || str(?__country) = "NA")

}
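
For reference, the test requests described in this comment could be assembled as follows. This is a minimal sketch under assumptions: the base URL is the placeholder from the first comment rather than the real server, build_request_url is a hypothetical helper, and the query-string names simply mirror the parameter names mentioned above.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://myapi.com/search"  # placeholder, not the real grlc server


def build_request_url(code: Optional[str] = None,
                      country: Optional[str] = None,
                      resource_type: Optional[str] = None) -> str:
    """Build a request URL, leaving out any parameter that was not provided."""
    params = {"code": code, "country": country, "resourceType": resource_type}
    provided = {key: value for key, value in params.items() if value is not None}
    query = urlencode(provided)  # percent-encodes ':' and '/' in the ORDO IRI
    return f"{BASE_URL}?{query}" if query else BASE_URL


# All three test values from the comment above.
print(build_request_url("http://www.orpha.net/ORDO/Orphanet_98056",
                        "IT", "PatientRegistryDataset"))
# Only the code, which corresponds to the case that times out.
print(build_request_url("http://www.orpha.net/ORDO/Orphanet_98056"))
```
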
rajaram5 commented 3 years ago

BTW, about the double underscore: isn't it mandatory if you want to indicate to the grlc server that a parameter is optional?

c-martinez commented 3 years ago

Hi @rajaram5 -- I'm trying to replicate your timeout issue. But it seems like the SPARQL endpoint is returning an exception:

java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Service URI http://ejprd.fair-dtls.surf-hosted.nl:7200/repositories/ordo is not allowed

I get the feeling that the queries grlc is sending to the endpoint are somehow not being understood.

c-martinez commented 3 years ago

After a bit of offline discussion, we got to the bottom of the issue. It looks like the expression OPTIONAL {?s ?p ?__country} gets expanded by the triple store (GraphDB in this case) while processing the query. We've tried this trick using Virtuoso in the past, and it worked well then, so this might be dependent on the triple store in question.

Anyway, the solution was to limit the number of triples that match this expression by restricting the predicate to wdt:P297, so that ?__country can only match values of that property:

OPTIONAL {?s wdt:P297 ?__country}
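
The effect of fixing the predicate can be illustrated on toy data. The sketch below is purely illustrative: TRIPLES is made-up data and candidate_bindings is a hypothetical helper that counts what the object position of a triple pattern could match. It says nothing about how GraphDB or Virtuoso actually plan the query.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Made-up triples, loosely shaped like the data in this thread.
TRIPLES: List[Triple] = [
    ("ex:italy", "wdt:P297", "IT"),
    ("ex:italy", "rdfs:label", "Italy"),
    ("ex:netherlands", "wdt:P297", "NL"),
    ("ex:netherlands", "rdfs:label", "Netherlands"),
    ("ex:registry1", "dcterm:title", "Some registry"),
]


def candidate_bindings(triples: List[Triple],
                       predicate: Optional[str] = None) -> List[str]:
    """Objects matched by `?s <predicate> ?o`; predicate=None means `?s ?p ?o`."""
    return [obj for (_, pred, obj) in triples
            if predicate is None or pred == predicate]


print(len(candidate_bindings(TRIPLES)))             # unconstrained pattern
print(len(candidate_bindings(TRIPLES, "wdt:P297")))  # predicate fixed
```

The unconstrained pattern matches every triple in the store, while the constrained one matches only the country-code triples, so the join the triple store has to evaluate is much smaller.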

Because the query above has two OPTIONAL clauses, we also constrained ?__resourceType to values of the skos:altLabel property. The final query looks like this:

#+ endpoint: http://ejprd.fair-dtls.surf-hosted.nl:7200/repositories/ordo-catalog-fdp
#+ endpoint_in_url: False
#+ description: Get resources for a given ordo URL.
#+ defaults:
#+   - code: http://www.orpha.net/ORDO/Orphanet_98056
#+   - _country: NA
#+   - _resourceType: NA
#+ transform: {
#+     "apiVersion": "v0.2",
#+     "resourceResponses": {"id": "?id", "name": "?title", "type": "?resource_type_label", "description": "?description", "homepage": "?homepage", "publisher": { "id": "Orphanet", "name": "Orphanet", "location": { "id": "?country_code", "country": "?country_name"} }},
#+     "$anchor": "apiVersion"
#+   }

PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX ejp: <http://purl.org/ejp-rd/vocabulary/>
PREFIX dcterm: <http://purl.org/dc/terms/>
PREFIX fdo: <http://rdf.biosemantics.org/ontologies/fdp-o#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?resource ?id ?title ?description ?resource_type_label ?homepage ?country_name ?country_code WHERE {   
    # Query external triple store to enrich our input disease list
    SERVICE <http://ejprd.fair-dtls.surf-hosted.nl:7200/repositories/ordo> {
        # GET all the sub diseases of our input disease class
        ?ordo_class_iri  rdfs:subClassOf* ?_code_iri;
                         rdfs:label ?disease_name.
    }

    ?resource a ?type ;
             dcat:theme ?ordo_class_iri;
             dcterm:description ?description;
             dcterm:title ?title;
             dcterm:publisher [dcterm:spatial ?publisher_location];
             dcat:landingPage ?homepage;
             fdo:metadataIdentifier [ dcterm:identifier ?id].

   ?type skos:altLabel ?resource_type_label .

   OPTIONAL {?rs skos:altLabel ?__resourceType}
   FILTER(regex(str(?__resourceType), ?resource_type_label) || str(?__resourceType) = "NA")

   ?publisher_location skos:relatedMatch ?wiki_data_uri.

   ?wiki_data_uri rdfs:label ?country_name;
                   wdt:P297 ?country_code.

   OPTIONAL {?s wdt:P297 ?__country}
   FILTER(regex(str(?__country), ?country_code) || str(?__country) = "NA")
}

Thanks @rajaram5 for helping with this issue!