openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com

bif:contains + optional #960

Open rapw3k opened 3 years ago

rapw3k commented 3 years ago

Hi, while working with SPARQL transformer libraries, I get the query below. When I try to execute it against my endpoint, https://www.foodie-cloud.org/sparql, I receive this error:

Virtuoso 37000 Error SP031: SPARQL compiler: No suitable triple pattern is found for a variable $label3 in special predicate bif:contains() at line 21 of query

This error occurs only because of the OPTIONAL statement, "OPTIONAL { }", which is in fact empty. If I remove it or move it up (before any statement with ?labelx), the query works just fine. However, as this query is generated automatically, I am not really able to remove it from my application. Is this a bug? Can it be overcome?

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX cybeleBase: <https://w3id.org/cybele/>
PREFIX bif: <bif:>
  SELECT DISTINCT ?s ?p ?o
  WHERE {
    ?s a dcat:Dataset.
    ?s ?p ?o.
    ?s ?p1 ?o1.
    ?o1 skos:prefLabel ?label1.
    ?s ?p2 ?o2.
    ?o2 rdfs:label ?label2.
    ?s dct:title ?label3.
    OPTIONAL { }
    FILTER((bif:contains(?label1,  "monnit") || bif:contains(?label2,  "monnit") || bif:contains(?label3,  "monnit")) && STRSTARTS (str(?s), str(cybeleBase:)))
  }
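
Since an empty OPTIONAL { } adds no bindings and drops no solutions under standard SPARQL semantics, one possible client-side workaround for a generated query like the one above is to strip it from the query text before submission. The following is a minimal sketch, not part of sparql-transformer or Virtuoso: the function name is hypothetical, the regular expression does not parse SPARQL (it could also match inside a string literal), and it covers only the empty case, whereas the thread later reports the same error with a non-empty OPTIONAL.

```python
import re

# Matches an OPTIONAL group whose braces hold only whitespace, together with
# the indentation before it and the line break after it.
_EMPTY_OPTIONAL = re.compile(r"[ \t]*\bOPTIONAL\s*\{\s*\}[ \t]*\n?", re.IGNORECASE)


def strip_empty_optionals(query: str) -> str:
    """Remove empty OPTIONAL { } groups from SPARQL query text (hypothetical helper)."""
    return _EMPTY_OPTIONAL.sub("", query)


# Shortened version of the generated query from this issue.
generated = """SELECT DISTINCT ?s WHERE {
    ?s dct:title ?label3.
    OPTIONAL { }
    FILTER(bif:contains(?label3, "monnit"))
}"""

cleaned = strip_empty_optionals(generated)
print(cleaned)
```

A non-empty group such as OPTIONAL { ?s a ?t } is left untouched, so the helper does not change the meaning of the query.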

TallTed commented 3 years ago

That OPTIONAL { } clause is a bit odd, but it shouldn't cause the error you report. I'm surprised that moving it around removed the error, as that seems unlikely to have put a value into ?label3.

What is the tool that is automatically generating this query? Are you running the latest version?

It's possible that there's an error in Virtuoso's query plan and execution.

First thing is to confirm that you're running on the latest available Virtuoso -- so please provide the output of virtuoso -?.

Output of the SPARQL query found here will also be helpful. Please run it with the line for git_head uncommented; if that produces an error instead of query results, re-comment or delete that line. If you're running Enterprise Edition, please uncomment the lines for st_lic_owner and st_lic_serial_number.

Once we know your Virtuoso is current, we can dig into the query plan.

rapw3k commented 3 years ago

Hi @TallTed, I have the latest release, 7.2.6-dev (https://sourceforge.net/projects/virtuoso/files/virtuoso/7.2.6-dev/): OpenLink Virtuoso Server 07.20.3230 Jan 9 2019 aafff5336 -pthreads Linux

You can see how the query is generated at https://d2klab.github.io/sparql-transformer/ by copying in the following JSON definition:

{
    "proto": {
      "id": "?s",
      "metadata": {
          "property": "?p",
          "value": "?o"
      }
    },
    "$where": [
        "?s a dcat:Dataset",
        "?s ?p ?o",
        "?s ?p1 ?o1",
        "?o1 skos:prefLabel ?label1",
        "?s ?p2 ?o2",
        "?o2 rdfs:label ?label2",
        "?s dct:title ?label3"
    ],
    "$filter": "(bif:contains(?label1,  'monnit') || bif:contains(?label2,  'monnit') || bif:contains(?label3,  'monnit')) && STRSTARTS (str(?s), str(cybeleBase:))",
    "$prefixes": {
      "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
      "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
      "dbo": "http://dbpedia.org/ontology/",
      "dcat": "http://www.w3.org/ns/dcat#",
      "skos": "http://www.w3.org/2004/02/skos/core#",
      "dct": "http://purl.org/dc/terms/",
      "cybeleBase": "https://w3id.org/cybele/",
      "bif": "bif:"
    },
    "$lang": "en",
    "grlc": {
      "summary": "Retrieves all the information about CYBELE datasets including the provided keyword, e.g., greece (not working see: https://github.com/openlink/virtuoso-opensource/issues/960",
      "endpoint": "https://www.foodie-cloud.org/sparql?default-graph-uri=https://w3id.org/cybele/datasets/",
      "tags": [ "test" ],
      "method": "GET",
      "pagination": 100
    }
  }

I know the dev branch is now at 07.20.3232. I don't think the version difference is the cause, but we can try it on a test instance if you have one.
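
For anyone who wants to check a build against the dev branch programmatically, the two version numbers quoted above can be compared as integer tuples. This is only a sketch: the function name is hypothetical, and the banner text is the string quoted in this comment, not verified output from any particular command.

```python
import re


def parse_virtuoso_version(text: str) -> tuple:
    """Return the first dotted version in text (e.g. 07.20.3230) as a tuple of ints."""
    match = re.search(r"\b(\d+)\.(\d+)\.(\d+)\b", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


# Banner as quoted in this comment; the dev-branch number is the one mentioned above.
banner = "OpenLink Virtuoso Server 07.20.3230 Jan 9 2019 aafff5336 -pthreads Linux"
installed = parse_virtuoso_version(banner)
dev_branch = parse_virtuoso_version("07.20.3232")

if installed < dev_branch:
    print("installed build is older than the dev branch")
```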

TallTed commented 3 years ago

I see... Of course, none of your data is in the DBpedia instance (which is a much more recent build), but it produces the same error with the auto-generated query, and produces an empty result set (no error) if I move the OPTIONAL { } around.

Also interestingly, neither the DBpedia instance nor yours produces any actual EXPLAIN output when the "Generate SPARQL compilation report (instead of executing the query)" option is chosen on the query form -- it just reports multiple iterations of the same error!

I think this one goes to the developers for analysis.

@pvk @openlink @iv-an-ru -- would you please take a look?

rapw3k commented 3 years ago

Thanks @TallTed. Just note that the same happens if the OPTIONAL is not empty. If you replace the empty OPTIONAL with, for example:

OPTIONAL { ?s rdfs:type ?type }

you will have the same issue.