openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com

SPARQL query behaving weird, possibly based on BIND clause position #948

Open jakubklimek opened 3 years ago

jakubklimek commented 3 years ago

I have the following query, used to harvest DCAT-AP metadata from local data catalogs implemented as SPARQL endpoints, often powered by Virtuoso Open-Source. For some reason, it behaves very strangely. When I run it directly against Virtuoso's SPARQL endpoint, the browser reports ERR_CONNECTION_CLOSED. The query is meant to be run against https://data.mvcr.gov.cz/sparql, which holds the data for it. However, it behaves this way even on other instances, such as:

On the other hand, it works (returning an empty result) on https://data.mpsv.cz/sparql (07.20.3215).

When I run it against https://data.mvcr.gov.cz/sparql via Yasgui, or via the LinkedPipes ETL SPARQL querying component (which uses rdf4j), I get some results, but they seem wrong.

Below are 4 variants of the query, which IMHO should all work and yield the same results. I apologize for the length of the query, but changing it affects the issue (number of results, etc.):

PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX schema: <http://schema.org/>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>

PREFIX pu: <https://data.gov.cz/slovník/podmínky-užití/>

CONSTRUCT {
  <https://catalog> a dcat:Catalog ;
    dcat:dataset ?dataset .

  ?dataset a dcat:Dataset ;
    dcterms:title ?title ;
    dcterms:description ?description ;
    dcat:theme ?theme ;
    dcterms:accrualPeriodicity ?accrualPeriodicity ;
    dcat:keyword ?keyword ;
    dcterms:spatial ?spatial ;
    dcterms:temporal ?temporal ;
    dcat:contactPoint ?cp ;
    foaf:page ?page ;
    dcterms:conformsTo ?conformsTo ;
    dcat:spatialResolutionInMeters ?spatialResolution ;
    dcat:temporalResolution ?temporalResolution ;
    dcterms:isPartOf ?topDataset ;
    dcat:distribution ?distribution .

  ?cp a ?cptype; 
    vcard:fn ?cpfn ;
    vcard:hasEmail ?cpemail .

  ?temporal a dcterms:PeriodOfTime; 
    dcat:startDate ?finalStartDate ;
    dcat:endDate ?finalEndDate .

  ?distribution a dcat:Distribution ;
    dcat:downloadURL ?ddURL ;
    dcat:accessURL ?daURL ;
    dcterms:format ?dformat ;
    dcat:mediaType ?dmimeType ;
    dcterms:conformsTo ?dconformsTo ;
    dcat:compressFormat ?dcompressFormat ;
    dcat:packageFormat ?dpackageFormat ;
    dcterms:title ?dtitle .

  ?distribution pu:specifikace ?tou .
  ?tou a pu:Specifikace ;
    pu:autorské-dílo ?touGeneral ;
    pu:databáze-jako-autorské-dílo ?touDatabase ;
    pu:databáze-chráněná-zvláštními-právy ?touDatabaseExtra ;
    pu:osobní-údaje ?touPersonalData ;
    pu:autor ?touGeneralAuthor ;
    pu:autor-databáze ?touDatabaselAuthor .

  ?distribution dcat:accessService ?dataService . 

  ?dataService a dcat:DataService ;
    dcterms:title ?sTitle ;
    dcterms:conformsTo ?sConformsTo ;
    dcat:endpointURL ?sEndpointURL ;
    dcat:endpointDescription ?sEndpointDescription .    
}
WHERE { 
  VALUES ?dataset { <https://data.mvcr.gov.cz/zdroj/datové-sady/rpp/agendy> }

  ?dataset a dcat:Dataset ;
    dcterms:title ?title ;
    dcterms:description ?description ;
    dcat:theme ?theme ;
    dcterms:accrualPeriodicity ?accrualPeriodicity ;
    dcat:keyword ?keyword ;
    dcterms:spatial ?spatial .

  OPTIONAL {
    ?dataset dcterms:temporal ?temporal . 
    OPTIONAL { ?temporal dcat:startDate ?startDate . }
    OPTIONAL { ?temporal dcat:endDate ?endDate . }
    OPTIONAL { ?temporal schema:startDate ?schemaStartDate . }
    OPTIONAL { ?temporal schema:endDate ?schemaEndDate . }
    BIND(IF(BOUND(?startDate), ?startDate, ?schemaStartDate) AS ?finalStartDate)
    BIND(IF(BOUND(?endDate), ?endDate, ?schemaEndDate) AS ?finalEndDate)
  }

  OPTIONAL {
    ?dataset dcat:contactPoint ?cp . 

    ?cp a ?cptype.
    OPTIONAL { ?cp vcard:fn ?cpfn . }
    OPTIONAL { ?cp vcard:hasEmail ?cpemail . }
  }

  OPTIONAL { ?dataset foaf:page ?page . }
  OPTIONAL { ?dataset dcterms:conformsTo ?conformsTo . }
  OPTIONAL { ?dataset dcat:spatialResolutionInMeters ?spatialResolution . }
  OPTIONAL { ?dataset dcat:temporalResolution ?temporalResolution . }
  OPTIONAL { ?dataset dcterms:isPartOf ?topDataset . }

  OPTIONAL {
    ?dataset dcat:distribution ?distribution .
    FILTER(isIRI(?distribution))

    ?distribution a dcat:Distribution ;
             dcat:accessURL ?daURL ;
             pu:specifikace ?tou .

    ?tou a pu:Specifikace ;
             pu:autorské-dílo ?touGeneral ;
             pu:databáze-jako-autorské-dílo ?touDatabase ;
             pu:databáze-chráněná-zvláštními-právy ?touDatabaseExtra ;
             pu:osobní-údaje ?touPersonalData .

    OPTIONAL { ?tou pu:autor ?touGeneralAuthor . }
    OPTIONAL { ?tou pu:autor-databáze ?touDatabaselAuthor . }

    OPTIONAL { ?distribution dcat:downloadURL ?ddURL . }
    OPTIONAL { ?distribution dcterms:format ?dformat . }
    OPTIONAL { ?distribution dcat:mediaType ?dmimeType . }
    OPTIONAL { ?distribution dcterms:conformsTo ?dconformsTo . }
    OPTIONAL { ?distribution dcat:compressFormat ?dcompressFormat . }
    OPTIONAL { ?distribution dcat:packageFormat ?dpackageFormat . }
    OPTIONAL { ?distribution dcterms:title ?dtitle . }

    OPTIONAL { 
      ?distribution dcat:accessService ?dataService . 
      FILTER(isIRI(?dataService))

      ?dataService a dcat:DataService ;
                    dcterms:title ?sTitle ;
                    dcat:endpointURL ?sEndpointURL .
      OPTIONAL { ?dataService dcat:endpointDescription ?sEndpointDescription .}
      OPTIONAL { ?dataService dcterms:conformsTo ?sConformsTo .}
    }
  } 
}

This variant returns 23 results. When I remove the OPTIONAL surrounding the dcat:Distribution part, which should only decrease the number of results or keep it the same, I get 73 results:

PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX schema: <http://schema.org/>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>

PREFIX pu: <https://data.gov.cz/slovník/podmínky-užití/>

CONSTRUCT {
<https://catalog> a dcat:Catalog ;
    dcat:dataset ?dataset .

?dataset a dcat:Dataset ;
    dcterms:title ?title ;
    dcterms:description ?description ;
    dcat:theme ?theme ;
    dcterms:accrualPeriodicity ?accrualPeriodicity ;
    dcat:keyword ?keyword ;
    dcterms:spatial ?spatial ;
    dcterms:temporal ?temporal ;
    dcat:contactPoint ?cp ;
    foaf:page ?page ;
    dcterms:conformsTo ?conformsTo ;
    dcat:spatialResolutionInMeters ?spatialResolution ;
    dcat:temporalResolution ?temporalResolution ;
    dcterms:isPartOf ?topDataset ;
    dcat:distribution ?distribution .

    ?cp a ?cptype; 
        vcard:fn ?cpfn ;
        vcard:hasEmail ?cpemail .

    ?temporal a dcterms:PeriodOfTime; 
        dcat:startDate ?finalStartDate ;
        dcat:endDate ?finalEndDate .

  ?distribution a dcat:Distribution ;
      dcat:downloadURL ?ddURL ;
      dcat:accessURL ?daURL ;
      dcterms:format ?dformat ;
      dcat:mediaType ?dmimeType ;
      dcterms:conformsTo ?dconformsTo ;
      dcat:compressFormat ?dcompressFormat ;
      dcat:packageFormat ?dpackageFormat ;
      dcterms:title ?dtitle .

    ?distribution pu:specifikace ?tou .
    ?tou a pu:Specifikace ;
        pu:autorské-dílo ?touGeneral ;
        pu:databáze-jako-autorské-dílo ?touDatabase ;
        pu:databáze-chráněná-zvláštními-právy ?touDatabaseExtra ;
        pu:osobní-údaje ?touPersonalData ;
        pu:autor ?touGeneralAuthor ;
        pu:autor-databáze ?touDatabaselAuthor .

    ?distribution dcat:accessService ?dataService . 

    ?dataService a dcat:DataService ;
        dcterms:title ?sTitle ;
        dcterms:conformsTo ?sConformsTo ;
        dcat:endpointURL ?sEndpointURL ;
        dcat:endpointDescription ?sEndpointDescription .    
}
WHERE { 
  VALUES ?dataset { <https://data.mvcr.gov.cz/zdroj/datové-sady/rpp/agendy> }

  ?dataset a dcat:Dataset ;
      dcterms:title ?title ;
      dcterms:description ?description ;
      dcat:theme ?theme ;
      dcterms:accrualPeriodicity ?accrualPeriodicity ;
      dcat:keyword ?keyword ;
      dcterms:spatial ?spatial .

    OPTIONAL {
      ?dataset dcterms:temporal ?temporal . 
      OPTIONAL { ?temporal dcat:startDate ?startDate . }
      OPTIONAL { ?temporal dcat:endDate ?endDate . }
      OPTIONAL { ?temporal schema:startDate ?schemaStartDate . }
      OPTIONAL { ?temporal schema:endDate ?schemaEndDate . }
      BIND(IF(BOUND(?startDate), ?startDate, ?schemaStartDate) AS ?finalStartDate)
      BIND(IF(BOUND(?endDate), ?endDate, ?schemaEndDate) AS ?finalEndDate)
    }

    OPTIONAL {
      ?dataset dcat:contactPoint ?cp . 

      ?cp a ?cptype.
      OPTIONAL { ?cp vcard:fn ?cpfn . }
      OPTIONAL { ?cp vcard:hasEmail ?cpemail . }
    }

    OPTIONAL { ?dataset foaf:page ?page . }
    OPTIONAL { ?dataset dcterms:conformsTo ?conformsTo . }
    OPTIONAL { ?dataset dcat:spatialResolutionInMeters ?spatialResolution . }
    OPTIONAL { ?dataset dcat:temporalResolution ?temporalResolution . }
    OPTIONAL { ?dataset dcterms:isPartOf ?topDataset . }

      ?dataset dcat:distribution ?distribution .
      FILTER(isIRI(?distribution))

      ?distribution a dcat:Distribution ;
          dcat:accessURL ?daURL ;
          pu:specifikace ?tou .

      ?tou a pu:Specifikace ;
        pu:autorské-dílo ?touGeneral ;
        pu:databáze-jako-autorské-dílo ?touDatabase ;
        pu:databáze-chráněná-zvláštními-právy ?touDatabaseExtra ;
        pu:osobní-údaje ?touPersonalData .

        OPTIONAL { ?tou pu:autor ?touGeneralAuthor . }
        OPTIONAL { ?tou pu:autor-databáze ?touDatabaselAuthor . }

        OPTIONAL { ?distribution dcat:downloadURL ?ddURL . }
        OPTIONAL { ?distribution dcterms:format ?dformat . }
        OPTIONAL { ?distribution dcat:mediaType ?dmimeType . }
        OPTIONAL { ?distribution dcterms:conformsTo ?dconformsTo . }
        OPTIONAL { ?distribution dcat:compressFormat ?dcompressFormat . }
        OPTIONAL { ?distribution dcat:packageFormat ?dpackageFormat . }
        OPTIONAL { ?distribution dcterms:title ?dtitle . }

        OPTIONAL { 
          ?distribution dcat:accessService ?dataService . 
          FILTER(isIRI(?dataService))

          ?dataService a dcat:DataService ;
          dcterms:title ?sTitle ;
          dcat:endpointURL ?sEndpointURL .
          OPTIONAL { ?dataService dcat:endpointDescription ?sEndpointDescription .}
          OPTIONAL { ?dataService dcterms:conformsTo ?sConformsTo .}
        }
}

In addition, when I rearrange the OPTIONAL clauses (which should not change the result), I get 67 results:

PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX schema: <http://schema.org/>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>

PREFIX pu: <https://data.gov.cz/slovník/podmínky-užití/>

CONSTRUCT {
  <https://catalog> a dcat:Catalog ;
    dcat:dataset ?dataset .

  ?dataset a dcat:Dataset ;
    dcterms:title ?title ;
    dcterms:description ?description ;
    dcat:theme ?theme ;
    dcterms:accrualPeriodicity ?accrualPeriodicity ;
    dcat:keyword ?keyword ;
    dcterms:spatial ?spatial ;
    dcterms:temporal ?temporal ;
    dcat:contactPoint ?cp ;
    foaf:page ?page ;
    dcterms:conformsTo ?conformsTo ;
    dcat:spatialResolutionInMeters ?spatialResolution ;
    dcat:temporalResolution ?temporalResolution ;
    dcterms:isPartOf ?topDataset ;
    dcat:distribution ?distribution .

  ?cp a ?cptype; 
    vcard:fn ?cpfn ;
    vcard:hasEmail ?cpemail .

  ?temporal a dcterms:PeriodOfTime; 
    dcat:startDate ?finalStartDate ;
    dcat:endDate ?finalEndDate .

  ?distribution a dcat:Distribution ;
    dcat:downloadURL ?ddURL ;
    dcat:accessURL ?daURL ;
    dcterms:format ?dformat ;
    dcat:mediaType ?dmimeType ;
    dcterms:conformsTo ?dconformsTo ;
    dcat:compressFormat ?dcompressFormat ;
    dcat:packageFormat ?dpackageFormat ;
    dcterms:title ?dtitle .

  ?distribution pu:specifikace ?tou .
  ?tou a pu:Specifikace ;
    pu:autorské-dílo ?touGeneral ;
    pu:databáze-jako-autorské-dílo ?touDatabase ;
    pu:databáze-chráněná-zvláštními-právy ?touDatabaseExtra ;
    pu:osobní-údaje ?touPersonalData ;
    pu:autor ?touGeneralAuthor ;
    pu:autor-databáze ?touDatabaselAuthor .

  ?distribution dcat:accessService ?dataService . 

  ?dataService a dcat:DataService ;
    dcterms:title ?sTitle ;
    dcterms:conformsTo ?sConformsTo ;
    dcat:endpointURL ?sEndpointURL ;
    dcat:endpointDescription ?sEndpointDescription .    
}
WHERE { 
  VALUES ?dataset { <https://data.mvcr.gov.cz/zdroj/datové-sady/rpp/agendy> }

  ?dataset a dcat:Dataset ;
    dcterms:title ?title ;
    dcterms:description ?description ;
    dcat:theme ?theme ;
    dcterms:accrualPeriodicity ?accrualPeriodicity ;
    dcat:keyword ?keyword ;
    dcterms:spatial ?spatial .

  OPTIONAL {
    ?dataset dcat:distribution ?distribution .
    FILTER(isIRI(?distribution))

    ?distribution a dcat:Distribution ;
             dcat:accessURL ?daURL ;
             pu:specifikace ?tou .

    ?tou a pu:Specifikace ;
             pu:autorské-dílo ?touGeneral ;
             pu:databáze-jako-autorské-dílo ?touDatabase ;
             pu:databáze-chráněná-zvláštními-právy ?touDatabaseExtra ;
             pu:osobní-údaje ?touPersonalData .

    OPTIONAL { ?tou pu:autor ?touGeneralAuthor . }
    OPTIONAL { ?tou pu:autor-databáze ?touDatabaselAuthor . }

    OPTIONAL { ?distribution dcat:downloadURL ?ddURL . }
    OPTIONAL { ?distribution dcterms:format ?dformat . }
    OPTIONAL { ?distribution dcat:mediaType ?dmimeType . }
    OPTIONAL { ?distribution dcterms:conformsTo ?dconformsTo . }
    OPTIONAL { ?distribution dcat:compressFormat ?dcompressFormat . }
    OPTIONAL { ?distribution dcat:packageFormat ?dpackageFormat . }
    OPTIONAL { ?distribution dcterms:title ?dtitle . }

    OPTIONAL { 
      ?distribution dcat:accessService ?dataService . 
      FILTER(isIRI(?dataService))

      ?dataService a dcat:DataService ;
                    dcterms:title ?sTitle ;
                    dcat:endpointURL ?sEndpointURL .
      OPTIONAL { ?dataService dcat:endpointDescription ?sEndpointDescription .}
      OPTIONAL { ?dataService dcterms:conformsTo ?sConformsTo .}
    }
  }

  OPTIONAL {
    ?dataset dcterms:temporal ?temporal . 
    OPTIONAL { ?temporal dcat:startDate ?startDate . }
    OPTIONAL { ?temporal dcat:endDate ?endDate . }
    OPTIONAL { ?temporal schema:startDate ?schemaStartDate . }
    OPTIONAL { ?temporal schema:endDate ?schemaEndDate . }
    BIND(IF(BOUND(?startDate), ?startDate, ?schemaStartDate) AS ?finalStartDate)
    BIND(IF(BOUND(?endDate), ?endDate, ?schemaEndDate) AS ?finalEndDate)
  }

  OPTIONAL {
    ?dataset dcat:contactPoint ?cp . 

    ?cp a ?cptype.
    OPTIONAL { ?cp vcard:fn ?cpfn . }
    OPTIONAL { ?cp vcard:hasEmail ?cpemail . }
  }

  OPTIONAL { ?dataset foaf:page ?page . }
  OPTIONAL { ?dataset dcterms:conformsTo ?conformsTo . }
  OPTIONAL { ?dataset dcat:spatialResolutionInMeters ?spatialResolution . }
  OPTIONAL { ?dataset dcat:temporalResolution ?temporalResolution . }
  OPTIONAL { ?dataset dcterms:isPartOf ?topDataset . }

}

Now, if I move the BIND clauses out of the dcterms:temporal OPTIONAL clause, I get 73 results in all cases, but Virtuoso's endpoint still drops the HTTP connection when the query is issued directly via a browser:

PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX schema: <http://schema.org/>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>

PREFIX pu: <https://data.gov.cz/slovník/podmínky-užití/>

CONSTRUCT {
  <https://catalog> a dcat:Catalog ;
    dcat:dataset ?dataset .

  ?dataset a dcat:Dataset ;
    dcterms:title ?title ;
    dcterms:description ?description ;
    dcat:theme ?theme ;
    dcterms:accrualPeriodicity ?accrualPeriodicity ;
    dcat:keyword ?keyword ;
    dcterms:spatial ?spatial ;
    dcterms:temporal ?temporal ;
    dcat:contactPoint ?cp ;
    foaf:page ?page ;
    dcterms:conformsTo ?conformsTo ;
    dcat:spatialResolutionInMeters ?spatialResolution ;
    dcat:temporalResolution ?temporalResolution ;
    dcterms:isPartOf ?topDataset ;
    dcat:distribution ?distribution .

  ?cp a ?cptype; 
    vcard:fn ?cpfn ;
    vcard:hasEmail ?cpemail .

  ?temporal a dcterms:PeriodOfTime; 
    dcat:startDate ?finalStartDate ;
    dcat:endDate ?finalEndDate .

  ?distribution a dcat:Distribution ;
    dcat:downloadURL ?ddURL ;
    dcat:accessURL ?daURL ;
    dcterms:format ?dformat ;
    dcat:mediaType ?dmimeType ;
    dcterms:conformsTo ?dconformsTo ;
    dcat:compressFormat ?dcompressFormat ;
    dcat:packageFormat ?dpackageFormat ;
    dcterms:title ?dtitle .

  ?distribution pu:specifikace ?tou .
  ?tou a pu:Specifikace ;
    pu:autorské-dílo ?touGeneral ;
    pu:databáze-jako-autorské-dílo ?touDatabase ;
    pu:databáze-chráněná-zvláštními-právy ?touDatabaseExtra ;
    pu:osobní-údaje ?touPersonalData ;
    pu:autor ?touGeneralAuthor ;
    pu:autor-databáze ?touDatabaselAuthor .

  ?distribution dcat:accessService ?dataService . 

  ?dataService a dcat:DataService ;
    dcterms:title ?sTitle ;
    dcterms:conformsTo ?sConformsTo ;
    dcat:endpointURL ?sEndpointURL ;
    dcat:endpointDescription ?sEndpointDescription .    
}
WHERE { 
  VALUES ?dataset { <https://data.mvcr.gov.cz/zdroj/datové-sady/rpp/agendy> }

  ?dataset a dcat:Dataset ;
    dcterms:title ?title ;
    dcterms:description ?description ;
    dcat:theme ?theme ;
    dcterms:accrualPeriodicity ?accrualPeriodicity ;
    dcat:keyword ?keyword ;
    dcterms:spatial ?spatial .

  OPTIONAL {
    ?dataset dcterms:temporal ?temporal . 
    OPTIONAL { ?temporal dcat:startDate ?startDate . }
    OPTIONAL { ?temporal dcat:endDate ?endDate . }
    OPTIONAL { ?temporal schema:startDate ?schemaStartDate . }
    OPTIONAL { ?temporal schema:endDate ?schemaEndDate . }
  }
  BIND(IF(BOUND(?startDate), ?startDate, ?schemaStartDate) AS ?finalStartDate)
  BIND(IF(BOUND(?endDate), ?endDate, ?schemaEndDate) AS ?finalEndDate)

  OPTIONAL {
    ?dataset dcat:contactPoint ?cp . 

    ?cp a ?cptype.
    OPTIONAL { ?cp vcard:fn ?cpfn . }
    OPTIONAL { ?cp vcard:hasEmail ?cpemail . }
  }

  OPTIONAL { ?dataset foaf:page ?page . }
  OPTIONAL { ?dataset dcterms:conformsTo ?conformsTo . }
  OPTIONAL { ?dataset dcat:spatialResolutionInMeters ?spatialResolution . }
  OPTIONAL { ?dataset dcat:temporalResolution ?temporalResolution . }
  OPTIONAL { ?dataset dcterms:isPartOf ?topDataset . }

  OPTIONAL {
    ?dataset dcat:distribution ?distribution .
    FILTER(isIRI(?distribution))

    ?distribution a dcat:Distribution ;
             dcat:accessURL ?daURL ;
             pu:specifikace ?tou .

    ?tou a pu:Specifikace ;
             pu:autorské-dílo ?touGeneral ;
             pu:databáze-jako-autorské-dílo ?touDatabase ;
             pu:databáze-chráněná-zvláštními-právy ?touDatabaseExtra ;
             pu:osobní-údaje ?touPersonalData .

    OPTIONAL { ?tou pu:autor ?touGeneralAuthor . }
    OPTIONAL { ?tou pu:autor-databáze ?touDatabaselAuthor . }

    OPTIONAL { ?distribution dcat:downloadURL ?ddURL . }
    OPTIONAL { ?distribution dcterms:format ?dformat . }
    OPTIONAL { ?distribution dcat:mediaType ?dmimeType . }
    OPTIONAL { ?distribution dcterms:conformsTo ?dconformsTo . }
    OPTIONAL { ?distribution dcat:compressFormat ?dcompressFormat . }
    OPTIONAL { ?distribution dcat:packageFormat ?dpackageFormat . }
    OPTIONAL { ?distribution dcterms:title ?dtitle . }

    OPTIONAL { 
      ?distribution dcat:accessService ?dataService . 
      FILTER(isIRI(?dataService))

      ?dataService a dcat:DataService ;
                    dcterms:title ?sTitle ;
                    dcat:endpointURL ?sEndpointURL .
      OPTIONAL { ?dataService dcat:endpointDescription ?sEndpointDescription .}
      OPTIONAL { ?dataService dcterms:conformsTo ?sConformsTo .}
    }
  } 
}

TallTed commented 3 years ago

The first thing to be certain of is that your queries are not running into Anytime Query cutoffs (i.e., that they're not being stopped after partial execution), and therefore returning partial results. This is most easily checked by running your queries through one of these command-line tools:
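Whichever tool is used, a cutoff can usually also be spotted in the HTTP response headers. The following is a minimal Python sketch, assuming the endpoint follows Virtuoso's documented Anytime Query behaviour of flagging a truncated response with an `X-SQL-State: S1TAT` header. The function names `is_partial_result` and `run_construct` are hypothetical helpers made up for this illustration, not part of any Virtuoso client.

```python
import urllib.parse
import urllib.request


def is_partial_result(headers):
    """Return True if the headers carry the Anytime Query cutoff marker.

    Assumption: a truncated response is flagged with `X-SQL-State: S1TAT`.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-sql-state", "").strip().upper() == "S1TAT"


def run_construct(endpoint, query, timeout_s=60):
    """POST a query to a SPARQL endpoint; return (body, partial_flag)."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint, data=data, headers={"Accept": "text/turtle"}
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        body = response.read().decode("utf-8")
        return body, is_partial_result(response.headers)
```

Calling `run_construct("https://data.mvcr.gov.cz/sparql", query_text)` for each variant and comparing the returned flag would show whether any of the differing result counts come from a cutoff rather than from the query semantics.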


Next, the locations of the BIND, OPTIONAL, and other clauses may impact the results in sometimes surprising ways.

Most commonly, this results from the "inside-out" (often confusingly called "bottom-up") execution order of SPARQL subqueries. That is, SPARQL subqueries are executed from the deepest up to the shallowest.
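To illustrate why clause position can matter, here is a small Python sketch of a simplified model of the SPARQL algebra (not Virtuoso's implementation). Solutions are modelled as dictionaries, `extend` plays the role of BIND, and `left_join` plays the role of OPTIONAL. The sample data and all function names are assumptions made up for the example.

```python
def extend(solutions, var, expr):
    """BIND(expr AS var): bind var in each solution; no value leaves it unbound."""
    out = []
    for sol in solutions:
        new = dict(sol)
        value = expr(sol)
        if value is not None:
            new[var] = value
        out.append(new)
    return out


def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def left_join(left, right):
    """OPTIONAL: keep every left solution, merged with compatible right ones."""
    out = []
    for l in left:
        matches = [{**l, **r} for r in right if compatible(l, r)]
        out.extend(matches or [l])
    return out


def final_start(sol):
    # Models IF(BOUND(?startDate), ?startDate, ?schemaStartDate).
    return sol.get("startDate", sol.get("schemaStartDate"))


# Hypothetical data: one dataset whose period only has schema:startDate.
datasets = [{"dataset": "d1"}]
temporal = [{"dataset": "d1", "temporal": "t1", "schemaStartDate": "2020-01-01"}]

# BIND placed after the OPTIONAL sees the optional bindings.
bind_after = extend(left_join(datasets, temporal), "finalStartDate", final_start)

# BIND placed before the OPTIONAL is evaluated while its inputs are still unbound.
bind_before = left_join(extend(datasets, "finalStartDate", final_start), temporal)
```

In this model, `bind_after` contains `finalStartDate` while `bind_before` does not, even though both use the same clauses. All four variants in this issue place the BIND clauses after the OPTIONALs that bind their inputs, so this effect on its own would not be expected to change their results.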

It's difficult to quickly see whether this is what's happening for you, given the length of these queries. It would help if you annotated the query variants with comment lines bracketing the sections you've shifted vertically, and marking where they were moved from and to. It can also be helpful to use different horizontal indentation patterns: for example, putting opening brackets/braces/parentheses on new lines and increasing the indent with each such segment, then decreasing the indent at each closing bracket/brace/parenthesis so that it is vertically aligned with its respective opener; or space-padding the widths of subject/predicate/object terms so that these columns are visually obvious.


Presuming that neither of the above applies in this case, analysis of what's happening typically requires submission of the query profiles and execution plans, as well as their SQL translations.

These will be easiest to work with if you attach them as files, rather than pasting them into comments on this issue.


As your recent issues aren't clearly bugs, it may make sense to shift some or all to the OpenLink Community Forum where there are more active participants than in this issues arena...