visualize-admin / visualization-tool

The tool for visualizing Swiss Open Government Data. Project ownership: Federal Office for the Environment FOEN
https://visualize.admin.ch
BSD 3-Clause "New" or "Revised" License
29 stars 3 forks source link

Segmentation doesn't work / Filter doesn't work #1383

Open LiamBAFU opened 4 months ago

LiamBAFU commented 4 months ago

When I want to Select Greenhouse Gas in the Segmentation its loading really long and after some Time I get a Error message.

image

To reproduce the Bug open this Dataset Link 1 Then change the Segmentation to Greenhouse Gas.

The weird thing is if i do the same thing but without the filter on it works normally Link 2 When you Select the Segmentation Greenhouse Gas here it works normally.

Rdataflow commented 3 months ago

GQL to reproduce this issue

curl 'https://int.visualize.admin.ch/api/graphql' -H 'content-type: application/json' -H 'x-visualize-debug: true' --data-raw $'{"query":"query DataCubeObservations($sourceType: String\u0021, $sourceUrl: String\u0021, $locale: String\u0021, $cubeFilter: DataCubeObservationFilter\u0021) {\\n  dataCubeObservations(\\n    sourceType: $sourceType\\n    sourceUrl: $sourceUrl\\n    locale: $locale\\n    cubeFilter: $cubeFilter\\n  )\\n}\\n","operationName":"DataCubeObservations","variables":{"locale":"en","sourceType":"sparql","sourceUrl":"https://lindas.admin.ch/query","cubeFilter":{"iri":"https://environment.ld.admin.ch/foen/ubd000502/4","filters":{"https://environment.ld.admin.ch/foen/ubd000502/gas":{"type":"multi","values":{"https://ld.admin.ch/cube/dimension/pol01air/pol11":true,"https://ld.admin.ch/cube/dimension/pol01air/pol9":true,"https://ld.admin.ch/cube/dimension/pol01air/pol12":true,"https://ld.admin.ch/cube/dimension/pol01air/pol20":true,"https://ld.admin.ch/cube/dimension/pol01air/pol15":true,"https://ld.admin.ch/cube/dimension/pol01air/pol16":true,"https://ld.admin.ch/cube/dimension/pol01air/pol14":true}},"https://environment.ld.admin.ch/foen/ubd000502/jahr":{"type":"single","value":"1990"},"https://environment.ld.admin.ch/foen/ubd000502/sektorid":{"type":"multi","values":{"https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/1":true,"https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/11":true}}}}}}'

SPARQL to reproduce

#pragma describe.strategy cbd
#pragma join.hash off

#pragma describe.strategy cbd
#pragma join.hash off

SELECT DISTINCT ?dimension0 ?dimension1 ?dimension2 ?dimension3 ?dimension4 ?dimension5 WHERE {
  <https://environment.ld.admin.ch/foen/ubd000502/4> <https://cube.link/observationSet> ?observationSet0 .
  ?observationSet0 <https://cube.link/observation> ?source0 .
  ?source0 <https://environment.ld.admin.ch/foen/ubd000502/sektorid> ?dimension0 .
  ?source0 <https://environment.ld.admin.ch/foen/ubd000502/gas> ?dimension1 .
  ?source0 <https://environment.ld.admin.ch/foen/ubd000502/jahr> ?dimension2 .
  ?source0 <https://environment.ld.admin.ch/foen/ubd000502/werte> ?dimension3 .
  FILTER (
    (?dimension1 IN (<https://ld.admin.ch/cube/dimension/pol01air/pol11>, <https://ld.admin.ch/cube/dimension/pol01air/pol9>, <https://ld.admin.ch/cube/dimension/pol01air/pol20>, <https://ld.admin.ch/cube/dimension/pol01air/pol12>, <https://ld.admin.ch/cube/dimension/pol01air/pol15>, <https://ld.admin.ch/cube/dimension/pol01air/pol16>, <https://ld.admin.ch/cube/dimension/pol01air/pol14>))
  )
  FILTER (
    (?dimension2 = "1990"^^<http://www.w3.org/2001/XMLSchema#gYear>)
  )
  FILTER (
    (?dimension0 IN (<https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/11>, <https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/12>, <https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/13>, <https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/14>))
  )
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_0 .
    FILTER (
      LANGMATCHES(LANG(?dimension4_0), "en")
    )
  }
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_1 .
    FILTER (
      LANGMATCHES(LANG(?dimension4_1), "de")
    )
  }
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_2 .
    FILTER (
     LANGMATCHES(LANG(?dimension4_2), "fr")
    )
  }
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_3 .
    FILTER (
      LANGMATCHES(LANG(?dimension4_3), "it")
    )
  }
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_4 .
    FILTER (
      (LANG(?dimension4_4) = "")
    )
  }
  BIND(COALESCE(?dimension4_0, ?dimension4_1, ?dimension4_2, ?dimension4_3, ?dimension4_4) AS ?dimension4)
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_0 .
    FILTER (
      LANGMATCHES(LANG(?dimension5_0), "en")
    )
  }
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_1 .
    FILTER (
      LANGMATCHES(LANG(?dimension5_1), "de")
    )
  }
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_2 .
    FILTER (
      LANGMATCHES(LANG(?dimension5_2), "fr")
    )
  }
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_3 .
    FILTER (
      LANGMATCHES(LANG(?dimension5_3), "it")
    )
  }
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_4 .
    FILTER (
      (LANG(?dimension5_4) = "")
    )
  }
  BIND(COALESCE(?dimension5_0, ?dimension5_1, ?dimension5_2, ?dimension5_3, ?dimension5_4) AS ?dimension5)
}
GROUP BY ?dimension0 ?dimension1 ?dimension2 ?dimension3 ?dimension4 ?dimension5

failing with error com.complexible.stardog.plan.eval.operator.OperatorException: Uncaught error during query evaluation: IllegalArgumentException: Empty union is prohibited

cc @claudiodigallo

Rdataflow commented 3 months ago

it works if other filters are applied

#pragma describe.strategy cbd
#pragma join.hash off

#pragma describe.strategy cbd
#pragma join.hash off

SELECT DISTINCT ?dimension0 ?dimension1 ?dimension2 ?dimension3 ?dimension4 ?dimension5 WHERE {
  <https://environment.ld.admin.ch/foen/ubd000502/4> <https://cube.link/observationSet> ?observationSet0 .
  ?observationSet0 <https://cube.link/observation> ?source0 .
  ?source0 <https://environment.ld.admin.ch/foen/ubd000502/sektorid> ?dimension0 .
  ?source0 <https://environment.ld.admin.ch/foen/ubd000502/gas> ?dimension1 .
  ?source0 <https://environment.ld.admin.ch/foen/ubd000502/jahr> ?dimension2 .
  ?source0 <https://environment.ld.admin.ch/foen/ubd000502/werte> ?dimension3 .
  FILTER (
    (?dimension1 IN (<https://ld.admin.ch/cube/dimension/pol01air/pol11>, 
    <https://ld.admin.ch/cube/dimension/pol01air/pol9>#,
#    <https://ld.admin.ch/cube/dimension/pol01air/pol20>, 
#    <https://ld.admin.ch/cube/dimension/pol01air/pol12>, 
#    <https://ld.admin.ch/cube/dimension/pol01air/pol15>, 
#    <https://ld.admin.ch/cube/dimension/pol01air/pol16>, 
#    <https://ld.admin.ch/cube/dimension/pol01air/pol14>
    ))
  )
  FILTER (
    (?dimension2 = "2005"^^<http://www.w3.org/2001/XMLSchema#gYear>)
  )
  FILTER (
    (?dimension0 IN (<https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/11>, <https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/12>, <https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/13>, <https://environment.ld.admin.ch/vocabulary/ghg_emission_sectors_co2_ordinance/14>))
  )
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_0 .
    FILTER (
      LANGMATCHES(LANG(?dimension4_0), "en")
    )
  }
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_1 .
    FILTER (
      LANGMATCHES(LANG(?dimension4_1), "de")
    )
  }
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_2 .
    FILTER (
     LANGMATCHES(LANG(?dimension4_2), "fr")
    )
  }
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_3 .
    FILTER (
      LANGMATCHES(LANG(?dimension4_3), "it")
    )
  }
  OPTIONAL {
    ?dimension0 <http://schema.org/name> ?dimension4_4 .
    FILTER (
      (LANG(?dimension4_4) = "")
    )
  }
  BIND(COALESCE(?dimension4_0, ?dimension4_1, ?dimension4_2, ?dimension4_3, ?dimension4_4) AS ?dimension4)
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_0 .
    FILTER (
      LANGMATCHES(LANG(?dimension5_0), "en")
    )
  }
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_1 .
    FILTER (
      LANGMATCHES(LANG(?dimension5_1), "de")
    )
  }
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_2 .
    FILTER (
      LANGMATCHES(LANG(?dimension5_2), "fr")
    )
  }
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_3 .
    FILTER (
      LANGMATCHES(LANG(?dimension5_3), "it")
    )
  }
  OPTIONAL {
    ?dimension1 <http://schema.org/name> ?dimension5_4 .
    FILTER (
      (LANG(?dimension5_4) = "")
    )
  }
  BIND(COALESCE(?dimension5_0, ?dimension5_1, ?dimension5_2, ?dimension5_3, ?dimension5_4) AS ?dimension5)
}
GROUP BY ?dimension0 ?dimension1 ?dimension2 ?dimension3 ?dimension4 ?dimension5

FTR: it seems the set of different filters (gas & sectors) interfere and break the query :(

bprusinowski commented 3 months ago

It looks like the problem is caused by #pragma join.hash off, when removing this line the query does not break. Maybe we could put this issue on hold, as we want to try to rewrite observations query to use CONSTRUCT instead of SELECT (see #1371)?

If completed, hopefully it would fix the problem – otherwise, after going through the Stardog docs, it looks like the pragma should be used conservatively, and now we use it for every query. It already caused some problems with preview queries (cartesian products), maybe we should re-think using it 🤔