openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com
Other
853 stars 211 forks source link

Optimizing merge of spatiotemporal datasets #1045

Open davidshumway opened 2 years ago

davidshumway commented 2 years ago

Time period: 10 years. Location: State of CA.

Dataset 1: 300 NOAA weather stations (located in CA). Daily summary of up to 15 variables (e.g. avg air temperature, avg wind speed, max air temperature). Not every variable present per day per station. Roughly 6 million observations in total. Dataset 2: 600 buildings (also located in CA). Single daily observation per building. Most buildings make around 100 observations per year. Roughly 200,000 observations in total.

On average, each building has 50 weather stations within a 200KM radius.

Spatiotemporal merge: Merge nearest weather variable to a building on a given day up to a maximum distance of 200KM per variable.

Example: Building 1 has it's first daily observation on Jan. 4, 2000. The nearest weather station (1KM) has a daily summary of 10 variables. The next nearest weather station (2KM) has 3 other variables. One more weather variable is only at a weather station that is 100KM away. The final weather variable is not found at any station within 200KM.

Final result: The final result should ideally be one row per building observation with an attached column for each weather variable (or something like NaN if no weather variable is available within 200KM). (The ideal result would be outputting this final query e.g. in CSV format which could then be fed into a dask pandas dataframe for use in machine learning tasks.)

Challenges: It would be easy enough to match the nearest weather station to any given building as this is only 300 x 600 distinct distance comparisons in total (180,000). However, in the worst case, if all 15 variables were matched against the average 50 nearest weather stations for every building observation present (200k), that would be 15x50x200000 = 150,000,000 distance calculations.

Any thoughts?

davidshumway commented 2 years ago

Assuming distances are precomputed between every building and weather station, one attempt might look like:

# For every observation from one building, returns observation along with up to 15 associated
# weather observations.
SELECT ?building1 ?buildingValue ?day1 ?minDist ?weatherProperty1 ?weatherValue WHERE {
 ?weatherObs a ex:weatherObservation ;
 ex:hasProperty ?weatherProperty2 ;
 ex:hasSimpleResult ?weatherValue ;
 ex:obsDay ?day2 ;
 ex:hasDistanceToBuilding [
  ex:distance ?dist2 ;
  ex:building ?building2 ].
 FILTER(?dist2 = ?minDist) .
 FILTER(?day2 = ?day1) .
 FILTER(?building2 = ?building1) .
 FILTER(?weatherProperty2 = ?weatherProperty1) .

 # Selects minimum distance (WS to building) on given day for every
 # distinct weatherProperty.
 {SELECT MIN(?dist) AS ?minDist ?day1 ?building1 ?buildingValue ?weatherProperty1 WHERE {
  ?weatherObs a ex:weatherObservation ;
  ex:obsWeatherProperty ?weatherProperty1 ;
  ex:obsDay ?day2 ;
  ex:hasDistanceToBuilding [
   ex:distance ?dist ;
   ex:building ?building2 ] .
  FILTER(?building2 = ?building1) .
  FILTER(?day2 = ?day1) .

  # Selects every daily observation for every building.
  {SELECT ?day1 ?building1 ?buildingValue WHERE {
   ?building1 a ex:buildingObservation ;
   ex:hasSimpleResult ?buildingValue ;
   ex:obsDay ?day1 .
  }}
 } GROUP BY ?day1 ?building1 ?buildingValue ?weatherProperty1 }

}
davidshumway commented 2 years ago

So the query is taking too long to finish. Profiling shows the following:

 156388 msec 67% cpu, 7.44829e+08 rnd 7.31708e+08 seq   97.9652% same seg   1.79578% same pg 
11066 disk reads, 30173 read ahead,  0.048558% wait
Compilation: 1741 msec 0 reads         0% read 0 messages         0% clw

Does that look... good? Bad?

davidshumway commented 2 years ago

In terms of performance is there any advantage to storing the two (large) datasets in separate graphs?

HughWilliams commented 2 years ago

There would generally be no performance advantage in having separate graphs for each dataset.

More importantly, what is the size of your datasets in total triples? Have you Performance Tuned your Virtuoso instance for hosting that many triples in memory, so as not to have to swap to and from disk? Performance will always be degraded if enough memory is not allocated, and if Virtuoso is not configured to make use of that memory.

davidshumway commented 2 years ago

Some notes:

Number of triples:

Total: A little under 50 million.

TallTed commented 2 years ago

50 million triples should be quite happy in 8 GB RAM, as we generally recommend a minimum of 10 bytes per triple, or 500 MB for your 50 million triples.

That said, your query may have some surprising optimizations awaiting. Note, for instance, that SPARQL queries are processed "inside-out"; basically, the innermost subquery is processed first, and the outermost subquery last. This is sometimes confusingly called "bottom-up" (confusing because it's not about last-line-first, but the deepest-first). It's not easy to see which subquery is at what depth, with your current query layout. This is one reason I tend to put a lot of syntactically meaningless whitespace into my SPARQL -- because it clarifies the depths.

# For every observation from one building, returns observation along with up to 15 associated
# weather observations.
SELECT ?building1 
       ?buildingValue
       ?day1
       ?minDist
       ?weatherProperty1
       ?weatherValue
WHERE
  { ?weatherObs a                        ex:weatherObservation ;
                ex:hasProperty           ?weatherProperty2 ;
                ex:hasSimpleResult       ?weatherValue ;
                ex:obsDay                ?day2 ;
                ex:hasDistanceToBuilding [ ex:distance ?dist2 ;
                                           ex:building ?building2 
                                         ].
    filter (            ?dist2 = ?minDist          ) .
    filter (             ?day2 = ?day1             ) .
    filter (        ?building2 = ?building1        ) .
    filter ( ?weatherProperty2 = ?weatherProperty1 ) .

    # Selects minimum distance (WS to building) on given day for every
    # distinct weatherProperty.
    { SELECT ( MIN ( ?dist ) AS ?minDist ) 
                                ?day1
                                ?building1
                                ?buildingValue
                                ?weatherProperty1
      WHERE
        { ?weatherObs a                        ex:weatherObservation ;
                      ex:obsWeatherProperty    ?weatherProperty1 ;
                      ex:obsDay                ?day2 ;
                      ex:hasDistanceToBuilding [ ex:distance ?dist ;
                                                 ex:building ?building2 
                                               ] .
          filter ( ?building2 = ?building1 ) .
          filter (      ?day2 = ?day1      ) .

          # Selects every daily observation for every building.
          { SELECT ?day1
                   ?building1
                   ?buildingValue 
            WHERE
              { ?building1 a                  ex:buildingObservation ;
                           ex:hasSimpleResult ?buildingValue ;
                           ex:obsDay          ?day1 .
              }
          }
        }
      GROUP BY ?day1 ?building1 ?buildingValue ?weatherProperty1 
    }
  }

I wonder whether what you're trying to achieve wouldn't be more efficiently done with inverted subqueries?

davidshumway commented 2 years ago

Thanks for the feedback! Not sure what you mean with inverted subqueries?

I've tried performing the query starting with the weather observations as the innermost subquery and then processing the building subquery last but didn't see much difference in terms of performance. I'm not sure what kind of optimizations are available at this point, hence my reaching out for advice here.

kidehen commented 2 years ago

Thanks for the feedback! Not sure what you mean with inverted subqueries?

I've tried performing the query starting with the weather observations as the innermost subquery and then processing the building subquery last but didn't see much difference in terms of performance. I'm not sure what kind of optimizations are available at this point, hence my reaching out for advice here.

Do you have a live query service endpoint that we could interact with? Do you have the datasets in question in an http-accessible location so that we could upload to one of our instances?

davidshumway commented 2 years ago

Randomly generated data but should suffice. weatherObs.n3.gz is the only generally large file. The script to generate the files is here: https://gist.github.com/davidshumway/23c098cbd52bdf0a84c8530b3353f647. Note that in this example the locations are not restricted to CA.

buildingObs.n3.gz https://drive.google.com/file/d/1-6nOTCTlIzINa1LyjIu22q-fzWvrnrk7/view?usp=sharing

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix sosa: <http://www.w3.org/ns/sosa/> 
prefix xsd: <http://www.w3.org/2001/XMLSchema#> 
prefix ex: <http://www.example3.com/> 
prefix geo: <http://www.opengis.net/ont/geosparql#> 
prefix geof: <http://www.opengis.net/def/function/geosparql/>

ex:buildingObservation-1835465 a sosa:Observation ;
  sosa:hasFeatureOfInterest ex:building-65 ;
  sosa:resultTime "2014-1-3T00:00:00"^^xsd:dateTime ;
  sosa:hasSimpleResult "9"^^xsd:double ;
  ex:observationType "building"^^xsd:string .

ex:buildingObservation-271856 a sosa:Observation ;
  sosa:hasFeatureOfInterest ex:building-56 ;
  sosa:resultTime "2021-2-21T00:00:00"^^xsd:dateTime ;
  sosa:hasSimpleResult "9"^^xsd:double ;
  ex:observationType "building"^^xsd:string .

weatherObs.n3.gz https://drive.google.com/file/d/1-Cggjteq_gwzY3QJlsFGpGstzgt5IHjP/view?usp=sharing 248MB compressed, 4.3GB in raw format

# prefixes ...

ex:weatherObservation-14879854 a sosa:Observation ;
  sosa:hasFeatureOfInterest ex:weatherStation-154 ;
  sosa:resultTime "2013-5-1T00:00:00"^^xsd:dateTime ;
  sosa:hasProperty "9"^^xsd:string ;
  sosa:hasSimpleResult "6"^^xsd:double ;
  ex:observationType "weather"^^xsd:string .

ex:weatherObservation-10873163 a sosa:Observation ;
  sosa:hasFeatureOfInterest ex:weatherStation-263 ;
  sosa:resultTime "2015-10-8T00:00:00"^^xsd:dateTime ;
  sosa:hasProperty "3"^^xsd:string ;
  sosa:hasSimpleResult "3"^^xsd:double ;
  ex:observationType "weather"^^xsd:string .

# ...

foi.n3.gz https://drive.google.com/file/d/1-J9My5RDjUYMvYancElRwtEtdiImbQ9F/view?usp=sharing

# prefixes ...

ex:weatherStation-0 a sosa:Sensor ;
  ex:geoType "weather"^^xsd:string ;
  geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(81.6799480003782 29.058143278701607)"^^geo:wktLiteral .

ex:building-0 a sosa:Sensor ;
  ex:geoType "building"^^xsd:string ;
  geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(-45.66468600537044 -89.41492295285423)"^^geo:wktLiteral .

# ...

dist.n3.gz https://drive.google.com/file/d/1-JWzlRY8ZfjrawSXoLYPtwMLad2SPLMV/view?usp=sharing

# prefixes ...

ex:weatherStation-0 ex:hasDistanceToBuilding [
  ex:distance "17567"^^xsd:decimal ;
  ex:building ex:building-0 ] .

ex:building-0 ex:hasDistanceToStation [
  ex:distance "17567"^^xsd:decimal ;
  ex:weatherStation ex:weatherStation-0 ] .

ex:weatherStation-0 ex:hasDistanceToBuilding [
  ex:distance "449"^^xsd:decimal ;
  ex:building ex:building-1 ] .

ex:building-1 ex:hasDistanceToStation [
  ex:distance "449"^^xsd:decimal ;
  ex:weatherStation ex:weatherStation-0 ] .

# ...

And updating the query a bit to match:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix sosa: <http://www.w3.org/ns/sosa/> 
prefix xsd: <http://www.w3.org/2001/XMLSchema#> 
prefix ex: <http://www.example3.com/> 
prefix geo: <http://www.opengis.net/ont/geosparql#> 
prefix geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?building ?buildingValue ?buildingFoi ?day
  ?minDist ?weatherProperty ?weatherValue ?weatherStationFoi
  WHERE {

  ?weatherObs a sosa:Observation ;
    sosa:hasFeatureOfInterest ?weatherStationFoi ;
    sosa:resultTime ?day ;
    sosa:hasProperty ?weatherProperty ;
    ex:observationType ?weatherObsType ;
    sosa:hasSimpleResult ?weatherValue .

  ?weatherStationFoi ex:hasDistanceToBuilding [
    ex:distance ?minDist ;
    ex:building ?buildingFoi ] .

  {SELECT MIN(?distance) AS ?minDist ?building ?buildingValue ?buildingFoi
    ?day ?weatherProperty
    WHERE {

    ?weatherObs a sosa:Observation ;
      sosa:hasFeatureOfInterest ?weatherStationFoi ;
      sosa:resultTime ?day ;
      sosa:hasProperty ?weatherProperty ;
      ex:observationType ?weatherObsType .

    ?weatherStationFoi ex:hasDistanceToBuilding [
      ex:distance ?distance ;
      ex:building ?buildingFoi ] .

    {SELECT ?day ?building ?buildingValue ?buildingFoi where {
      ?building a sosa:Observation ;
        sosa:hasFeatureOfInterest ?buildingFoi ;
        sosa:hasSimpleResult ?buildingValue ;
        sosa:resultTime ?day ;
        ex:observationType ?obsType .
      FILTER(?obsType = "building"^^xsd:string) .
    }}

  } GROUP BY ?building ?buildingValue ?buildingFoi ?day ?weatherProperty
  }
}

Small data sample:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix sosa: <http://www.w3.org/ns/sosa/> 
prefix xsd: <http://www.w3.org/2001/XMLSchema#> 
prefix ex: <http://www.example3.com/> 
prefix geo: <http://www.opengis.net/ont/geosparql#> 
prefix geof: <http://www.opengis.net/def/function/geosparql/>

ex:buildingObservation-1 a sosa:Observation ;
sosa:hasFeatureOfInterest ex:building-1 ;
sosa:resultTime "2014-5-1T00:00:00"^^xsd:dateTime ;
sosa:hasSimpleResult "9"^^xsd:double ;
ex:observationType "building"^^xsd:string .

ex:buildingObservation-2 a sosa:Observation ;
sosa:hasFeatureOfInterest ex:building-0 ;
sosa:resultTime "2013-5-1T00:00:00"^^xsd:dateTime ;
sosa:hasSimpleResult "9"^^xsd:double ;
ex:observationType "building"^^xsd:string .

ex:weatherObservation-1 a sosa:Observation ;
sosa:hasFeatureOfInterest ex:weatherStation-0 ;
sosa:resultTime "2013-5-1T00:00:00"^^xsd:dateTime ;
sosa:hasProperty "9"^^xsd:string ;
sosa:hasSimpleResult "111"^^xsd:double ;
ex:observationType "weather"^^xsd:string .

ex:weatherObservation-2 a sosa:Observation ;
sosa:hasFeatureOfInterest ex:weatherStation-1 ;
sosa:resultTime "2013-5-1T00:00:00"^^xsd:dateTime ;
sosa:hasProperty "9"^^xsd:string ;
sosa:hasSimpleResult "222"^^xsd:double ;
ex:observationType "weather"^^xsd:string .

ex:weatherObservation-3 a sosa:Observation ;
sosa:hasFeatureOfInterest ex:weatherStation-0 ;
sosa:resultTime "2014-5-1T00:00:00"^^xsd:dateTime ;
sosa:hasProperty "9"^^xsd:string ;
sosa:hasSimpleResult "333"^^xsd:double ;
ex:observationType "weather"^^xsd:string .

ex:weatherObservation-4 a sosa:Observation ;
sosa:hasFeatureOfInterest ex:weatherStation-1 ;
sosa:resultTime "2014-5-1T00:00:00"^^xsd:dateTime ;
sosa:hasProperty "9"^^xsd:string ;
sosa:hasSimpleResult "444"^^xsd:double ;
ex:observationType "weather"^^xsd:string .

ex:weatherStation-0 a sosa:Sensor ;
  ex:geoType "weather"^^xsd:string ;
  geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(81.6799480003782 29.058143278701607)"^^geo:wktLiteral .

ex:weatherStation-1 a sosa:Sensor ;
  ex:geoType "weather"^^xsd:string ;
  geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(-45.66468600537044 -89.41492295285423)"^^geo:wktLiteral .

ex:building-0 a sosa:Sensor ;
  ex:geoType "building"^^xsd:string ;
  geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(28.801376420865992 20.748063782888806)"^^geo:wktLiteral .

ex:building-1 a sosa:Sensor ;
  ex:geoType "building"^^xsd:string ;
  geo:asWKT "<http://www.opengis.net/def/crs/EPSG/0/4326> POINT(21.114236666465075 -98.00059121036644)"^^geo:wktLiteral .

ex:weatherStation-0 ex:hasDistanceToBuilding [
  ex:distance "17567"^^xsd:decimal ;
  ex:building ex:building-0 ] .

ex:weatherStation-0 ex:hasDistanceToBuilding [
  ex:distance "44236"^^xsd:decimal ;
  ex:building ex:building-1 ] .

ex:weatherStation-1 ex:hasDistanceToBuilding [
  ex:distance "500"^^xsd:decimal ;
  ex:building ex:building-0 ] .

ex:weatherStation-1 ex:hasDistanceToBuilding [
  ex:distance "400"^^xsd:decimal ;
  ex:building ex:building-1 ] .

Query result:

building buildingValue buildingFoi day minDist weatherProperty weatherValue weatherStationFoi
http://www.example3.com/buildingObservation-2 9.0 http://www.example3.com/building-0 2013-5-1T00:00:00 500 9 222.0 http://www.example3.com/weatherStation-1
http://www.example3.com/buildingObservation-1 9.0 http://www.example3.com/building-1 2014-5-1T00:00:00 400 9 444.0 http://www.example3.com/weatherStation-1
davidshumway commented 2 years ago

I'm also a little perplexed about a query (using similar data as above) which performs much better by querying for smaller parts of the result (1/6th to be exact) rather than attempting to retrieve the full result all at once. Neither query runs into any memory issues (i.e. swapping) and both appear to run on a single CPU core. I'd assume execution time would be similar but querying for smaller parts finishes all 6 queries in about 1 minute whereas executing the full query (querying for all 6 parts at once) executes for 15-30 minutes without any result (I gave up before letting the query finish).

The following python code will execute 6 SPARQL queries (1-6) with each query taking approximately 10 seconds to return. In this case, ?o1 is associated with weather stations (6 million weather station observations * 7 triples per observation ~~ 42 million triples) and ?bObs is a set of 30 or so features of interest (sosa:hasFeatureOfInterest ?bObsFoi) associated with a single building project (i.e. project ID: FILTER(?bPID = "CA8"^^xsd:string)) . The 6 separate queries relate to unique weather observations associated with each ?bObs observation, with one query mdae for each of the past 6 days. The query results are merged together into a single pandas dataframe (i.e. mimicking what would be produced by the single full query).

res = None
prefixes = '...'
for i in range(1, 7):
  n = str(i)
  print(n)
  s = prefixes + '''
SELECT ?bObs ?bObsFoi ?AvgF ?CountObs ?y ?m ?d
  GROUP_CONCAT(?nProperty1, ",") AS ?np_minus'''+n+''' 
  GROUP_CONCAT(?nResult1, ",") AS ?nr_minus'''+n+'''  
  GROUP_CONCAT(?nTime1, ",") AS ?nt_minus'''+n+''' 

  WHERE {
  ?bObs a sosa:Observation ;
    sosa:hasFeatureOfInterest ?bObsFoi ;
    sosa:resultTime ?bObsDate ;
    sosa:hasSimpleResult ?AvgF ;
    ex:countObservations ?CountObs ;
    time:year ?y ;
    time:month ?m ;
    time:day ?d ;
    ex:n-'''+n+'''DayObs ?o1 .
  ?bObsFoi ex:BProjectID ?bPID .
  FILTER(?bPID = "CA8"^^xsd:string) .  
  ?o1 a sosa:Observation ;
    sosa:resultTime ?nTime1 ;
    sosa:observedProperty ?nProperty1 ;
    sosa:hasSimpleResult ?nResult1 .
}
'''
  r = run_sparql(s) # executes SPARQL query and parses result into pandas DF
  if i == 1:
    res = r
  else:
    res = res.merge(r.drop(columns=['bObsFoi', 'AvgF', 'CountObs', 'y', 'm', 'd']),
      how='left', left_on='bObs', right_on='bObs')

And the result of the merged queries:

bObs bObsFoi AvgF CountObs y m d np_minus1 nr_minus1 nmd_minus1 nt_minus1 np_minus2 nr_minus2 nmd_minus2 nt_minus2 np_minus3 nr_minus3 nmd_minus3 nt_minus3 np_minus4 nr_minus4 nmd_minus4 nt_minus4 np_minus5 nr_minus5 nmd_minus5 nt_minus5 np_minus6 nr_minus6 nmd_minus6 nt_minus6
http://www.example3.org/BSamplingFOI-CAb_W21-2016-8-2-avg-daily-F http://www.example3.org/BSamplingFOI-CAb_W21 1.658657763991483 3 2016 8 2 DEWP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT,MAX,MIN,MXSPD,PRCP 64.0,1013.1,11.2, 73.4, 9.8, 4.7,000000, 81.0, 69.1, 11.1, 0.00 2016-08-01 00:00:00,2016-08-01 00:00:00,2016-08-01 00:00:00,2016-08-01 00:00:00,2016-08-01 00:00:00,2016-08-01 00:00... DEWP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT,MAX,MIN,MXSPD,PRCP 65.0,1011.6,9.8, 73.3, 9.8, 5.9,000000, 82.0, 69.1, 14.0, 0.00 2016-07-31 00:00:00,2016-07-31 00:00:00,2016-07-31 00:00:00,2016-07-31 00:00:00,2016-07-31 00:00:00,2016-07-31 00:00... DEWP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT,MAX,MIN,MXSPD,PRCP 65.9,1009.9,8.1, 73.9, 9.9, 6.4,000000, 82.0, 69.1, 13.0, 0.00 2016-07-30 00:00:00,2016-07-30 00:00:00,2016-07-30 00:00:00,2016-07-30 00:00:00,2016-07-30 00:00:00,2016-07-30 00:00... DEWP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT,MAX,MIN,MXSPD,PRCP 65.7,1010.1,8, 76.1, 8.6, 5.8,000000, 80.1, 70.0, 8.9, 0.00 2016-07-29 00:00:00,2016-07-29 00:00:00,2016-07-29 00:00:00,2016-07-29 00:00:00,2016-07-29 00:00:00,2016-07-29 00:00... DEWP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT,MAX,MIN,MXSPD,PRCP 64.5,1011.7,9.9, 74.7, 9.8, 5.2,000000, 87.1, 69.1, 8.9, 0.00 2016-07-28 00:00:00,2016-07-28 00:00:00,2016-07-28 00:00:00,2016-07-28 00:00:00,2016-07-28 00:00:00,2016-07-28 00:00... DEWP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT,MAX,MIN,MXSPD,PRCP 63.6,1011.7,9.9, 76.1, 9.9, 4.3,000000, 87.1, 70.0, 12.0, 0.00 2016-07-27 00:00:00,2016-07-27 00:00:00,2016-07-27 00:00:00,2016-07-27 00:00:00,2016-07-27 00:00:00,2016-07-27 00:00...
http://www.example3.org/BSamplingFOI-CAb_W24E-2021-1-19-avg-daily-F http://www.example3.org/BSamplingFOI-CAb_W24E 0.087639213830691 3 2021 1 19 DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 38.2, 84.0, 50.0, 8.9, 0.00,1011.2,9.2, 61.9, 10.0, 2.3,000000 2021-01-18 00:00:00,2021-01-18 00:00:00,2021-01-18 00:00:00,2021-01-18 00:00:00,2021-01-18 00:00:00,2021-01-18 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 38.1, 84.0, 52.0, 8.0, 0.00,1014.7,12.8, 65.3, 10.0, 2.2,000000 2021-01-17 00:00:00,2021-01-17 00:00:00,2021-01-17 00:00:00,2021-01-17 00:00:00,2021-01-17 00:00:00,2021-01-17 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 39.2, 90.0, 52.0, 8.9, 0.00,1014.9,13, 65.6, 10.0, 2.4,000000 2021-01-16 00:00:00,2021-01-16 00:00:00,2021-01-16 00:00:00,2021-01-16 00:00:00,2021-01-16 00:00:00,2021-01-16 00:00... DEWP,GUST,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 27.3, 20.0, 90.0, 66.9, 15.0, 0.00,1019.4,17.6, 79.2, 10.0, 7.1,000000 2021-01-15 00:00:00,2021-01-15 00:00:00,2021-01-15 00:00:00,2021-01-15 00:00:00,2021-01-15 00:00:00,2021-01-15 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 39.4, 78.1, 46.0, 8.9, 0.00,1020.4,18.4, 60.5, 9.9, 1.7,000000 2021-01-14 00:00:00,2021-01-14 00:00:00,2021-01-14 00:00:00,2021-01-14 00:00:00,2021-01-14 00:00:00,2021-01-14 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 40.5, 75.9, 46.0, 7.0, 0.00,1023.4,21.4, 58.1, 10.0, 1.7,000000 2021-01-13 00:00:00,2021-01-13 00:00:00,2021-01-13 00:00:00,2021-01-13 00:00:00,2021-01-13 00:00:00,2021-01-13 00:00...
http://www.example3.org/BSamplingFOI-CAb_W15-2015-4-14-avg-daily-F http://www.example3.org/BSamplingFOI-CAb_W15 0.082863664642535 3 2015 4 14 DEWP,FRSHTT,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 53.6,000000, 72.0, 55.0, 9.9, 0.00,1017,15.1, 64.1, 9.8, 3.2 2015-04-13 00:00:00,2015-04-13 00:00:00,2015-04-13 00:00:00,2015-04-13 00:00:00,2015-04-13 00:00:00,2015-04-13 00:00... DEWP,FRSHTT,GUST,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 50.9,000000, 15.0, 71.1, 53.1, 11.1, 0.00,1013.2,11.3, 63.0, 10.0, 4.4 2015-04-12 00:00:00,2015-04-12 00:00:00,2015-04-12 00:00:00,2015-04-12 00:00:00,2015-04-12 00:00:00,2015-04-12 00:00... DEWP,FRSHTT,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 49.0,000000, 70.0, 53.1, 11.1, 0.00,1014.4,12.4, 61.7, 10.0, 4.0 2015-04-11 00:00:00,2015-04-11 00:00:00,2015-04-11 00:00:00,2015-04-11 00:00:00,2015-04-11 00:00:00,2015-04-11 00:00... DEWP,FRSHTT,GUST,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 47.3,000000, 15.0, 70.0, 52.0, 8.9, 0.00,1015.2,13.2, 60.8, 10.0, 3.9 2015-04-10 00:00:00,2015-04-10 00:00:00,2015-04-10 00:00:00,2015-04-10 00:00:00,2015-04-10 00:00:00,2015-04-10 00:00... DEWP,FRSHTT,GUST,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 44.5,000000, 15.0, 70.0, 51.1, 8.9, 0.00,1014.7,12.7, 60.9, 10.0, 3.2 2015-04-09 00:00:00,2015-04-09 00:00:00,2015-04-09 00:00:00,2015-04-09 00:00:00,2015-04-09 00:00:00,2015-04-09 00:00... DEWP,FRSHTT,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 44.4,010000, 66.9, 51.1, 11.1, 0.01,1017.1,15.1, 59.1, 9.8, 4.2 2015-04-08 00:00:00,2015-04-08 00:00:00,2015-04-08 00:00:00,2015-04-08 00:00:00,2015-04-08 00:00:00,2015-04-08 00:00...
http://www.example3.org/BSamplingFOI-CAb_W15-2021-12-6-avg-daily-F http://www.example3.org/BSamplingFOI-CAb_W15 0.665480665597645 3 2021 12 6 DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 49.6, 66.9, 52.0, 5.1, 0.00,1019.7,17.7, 58.1, 4.0, 1.6,000000 2021-12-05 00:00:00,2021-12-05 00:00:00,2021-12-05 00:00:00,2021-12-05 00:00:00,2021-12-05 00:00:00,2021-12-05 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 49.3, 66.0, 55.9, 6.0, 0.00,1019.9,18.1, 59.3, 5.1, 2.4,000000 2021-12-04 00:00:00,2021-12-04 00:00:00,2021-12-04 00:00:00,2021-12-04 00:00:00,2021-12-04 00:00:00,2021-12-04 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 49.8, 64.9, 55.0, 6.0, 0.00,1019.2,17.3, 59.6, 5.1, 2.2,000000 2021-12-03 00:00:00,2021-12-03 00:00:00,2021-12-03 00:00:00,2021-12-03 00:00:00,2021-12-03 00:00:00,2021-12-03 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 53.1, 75.0, 53.1, 8.0, 0.00,1019.2,17.1, 58.6, 2.1, 3.2,100000 2021-12-02 00:00:00,2021-12-02 00:00:00,2021-12-02 00:00:00,2021-12-02 00:00:00,2021-12-02 00:00:00,2021-12-02 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 52.3, 75.0, 53.1, 5.1, 0.00,1016.9,15.3, 60.6, 3.8, 1.0,100000 2021-12-01 00:00:00,2021-12-01 00:00:00,2021-12-01 00:00:00,2021-12-01 00:00:00,2021-12-01 00:00:00,2021-12-01 00:00... DEWP,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP,FRSHTT 53.2, 75.0, 52.0, 7.0, 0.00,1016.6,14.8, 60.9, 3.7, 1.8,100000 2021-11-30 00:00:00,2021-11-30 00:00:00,2021-11-30 00:00:00,2021-11-30 00:00:00,2021-11-30 00:00:00,2021-11-30 00:00...
http://www.example3.org/BSamplingFOI-CAb_W09-2015-8-25-avg-daily-F http://www.example3.org/BSamplingFOI-CAb_W09 0.068059462159249 3 2015 8 25 DEWP,FRSHTT,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 64.8,000000, 81.0, 66.9, 9.9, 0.00,1013.6,11.8, 72.8, 9.5, 4.1 2015-08-24 00:00:00,2015-08-24 00:00:00,2015-08-24 00:00:00,2015-08-24 00:00:00,2015-08-24 00:00:00,2015-08-24 00:00... DEWP,FRSHTT,GUST,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 64.4,000000, 14.0, 78.1, 66.2, 8.9, 0.00,1013.1,11.3, 71.9, 8.0, 3.2 2015-08-23 00:00:00,2015-08-23 00:00:00,2015-08-23 00:00:00,2015-08-23 00:00:00,2015-08-23 00:00:00,2015-08-23 00:00... DEWP,FRSHTT,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 62.2,000000, 79.0, 69.1, 9.9, 0.00,1012.1,10.2, 72.2, 9.1, 5.2 2015-08-22 00:00:00,2015-08-22 00:00:00,2015-08-22 00:00:00,2015-08-22 00:00:00,2015-08-22 00:00:00,2015-08-22 00:00... DEWP,FRSHTT,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 62.7,000000, 79.0, 69.1, 11.1, 0.00,1011.1,9.2, 71.9, 9.1, 4.4 2015-08-21 00:00:00,2015-08-21 00:00:00,2015-08-21 00:00:00,2015-08-21 00:00:00,2015-08-21 00:00:00,2015-08-21 00:00... DEWP,FRSHTT,GUST,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 61.7,000000, 15.0, 79.0, 68.0, 8.9, 0.00,1011.4,9.6, 71.6, 10.0, 4.9 2015-08-20 00:00:00,2015-08-20 00:00:00,2015-08-20 00:00:00,2015-08-20 00:00:00,2015-08-20 00:00:00,2015-08-20 00:00... DEWP,FRSHTT,MAX,MIN,MXSPD,PRCP,SLP,STP,TEMP,VISIB,WDSP 62.5,000000, 78.1, 68.0, 8.9, 0.00,1012.4,10.5, 71.3, 9.6, 4.5 2015-08-19 00:00:00,2015-08-19 00:00:00,2015-08-19 00:00:00,2015-08-19 00:00:00,2015-08-19 00:00:00,2015-08-19 00:00...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...

In contrast, the full query below is identical to above but includes data for all 6 days at once. This query processes for at least 15-30 minutes (haven't tried letting it run longer). Here each ex:n-[day]DayObs triple is associated with a list of up to 15 or so weather observations, one for each weather property (e.g. TEMP, DEWP, etc.) that was taken on the associated past day.

SELECT ?bObs ?bObsFoi ?AvgF ?CountObs ?y ?m ?d
  GROUP_CONCAT(?nProperty1, ",") AS ?np_minus1
  GROUP_CONCAT(?nResult1, ",") AS ?nr_minus1
  GROUP_CONCAT(?nTime1, ",") AS ?nt_minus1
  # ...
  GROUP_CONCAT(?nProperty6, ",") AS ?np_minus6
  GROUP_CONCAT(?nResult6, ",") AS ?nr_minus6
  GROUP_CONCAT(?nTime6, ",") AS ?nt_minus6

  WHERE {
  ?bObs a sosa:Observation ;
    sosa:hasFeatureOfInterest ?bObsFoi ;
    sosa:resultTime ?bObsDate ;
    sosa:hasSimpleResult ?AvgF ;
    ex:countObservations ?CountObs ;
    time:year ?y ;
    time:month ?m ;
    time:day ?d ;
    ex:n-6DayObs ?o6 ;
    ex:n-5DayObs ?o5 ;
    ex:n-4DayObs ?o4 ;
    ex:n-3DayObs ?o3 ;
    ex:n-2DayObs ?o2 ;
    ex:n-1DayObs ?o1 .
  ?bObsFoi ex:BProjectID ?bPID .
  FILTER(?bPID = "CA8"^^xsd:string) .
  ?o1 a sosa:Observation ;
    sosa:resultTime ?nTime1 ;
    sosa:observedProperty ?nProperty1 ;
    sosa:hasSimpleResult ?nResult1 .
  ?o2 a sosa:Observation ;
    sosa:resultTime ?nTime2 ;
    sosa:observedProperty ?nProperty2 ;
    sosa:hasSimpleResult ?nResult2 .
  ?o3 a sosa:Observation ;
    sosa:resultTime ?nTime3 ;
    sosa:observedProperty ?nProperty3 ;
    sosa:hasSimpleResult ?nResult3 .
  ?o4 a sosa:Observation ;
    sosa:resultTime ?nTime4 ;
    sosa:observedProperty ?nProperty4 ;
    sosa:hasSimpleResult ?nResult4 .
  ?o5 a sosa:Observation ;
    sosa:resultTime ?nTime5 ;
    sosa:observedProperty ?nProperty5 ;
    sosa:hasSimpleResult ?nResult5 .
  ?o6 a sosa:Observation ;
    sosa:resultTime ?nTime6 ;
    sosa:observedProperty ?nProperty6 ;
    sosa:hasSimpleResult ?nResult6 .
}

EDIT: Oops! Just realized that this is actually using only a subset of the datasets! The datasets here are not the 50 million triples version but a subset including ~30 buildings, 2 weather stations, ~10k building observations, and ~120k weather observations. So perhaps 1 million triples in total. So this works on this smaller dataset but I haven't yet tried it with the full 50 million triples dataset.

EDIT 2: The long query still didn't finish after letting it run for an hour.

EDIT 3: ex:n-[day]DayObs triples were added to ?bObs entities to explicitly connect a weather observation to a building observation.

EDIT 4: Profile and explain for the full query:

76 msec 0% cpu,     87057 rnd        -3 seq   49.8989% same seg 0.00229732% same pg 
2 disk reads, 0 read ahead, 0.000108171% wait
Compilation: 36362 msec 0 reads         0% read 0 messages         0% clw
REPORT
VARCHAR
_______________________________________________________________________________

  { 
    Precode:
      0: DB.DBA.RDF_MAKE_OBJ_OF_TYPEDSQLVAL$27 := Call DB.DBA.RDF_MAKE_OBJ_OF_TYPEDSQLVAL (<c CA8>,  IRI_ID"...string" , <DB_NULL>)
      7: BReturn 0
    Subquery 29 
      { 
        fork {
            RDF_QUAD   4.4e+04 rows(s_1_88_t39.S$58, s_1_88_t39.O$57)
             inlined  P =  IRI_ID"...hasSimpleResult" 
            RDF_QUAD       0.8 rows(s_1_88_t36.S$61)
             inlined  P =  IRI_ID"...type"  ,  S = s_1_88_t39.S$58 ,  O =  IRI_ID"...Observation" 
            RDF_QUAD_POGS   7.1e-05 rows(s_1_88_t9.S$65, s_1_88_t9.O$64)
             P =  IRI_ID"...n-6DayObs"  ,  O = cast$460
            RDF_QUAD   7.1e-05 rows(s_1_88_t5.S$68)
             inlined  P =  IRI_ID"...obsType"  ,  S = s_1_88_t9.S$65
            RDF_QUAD_POGS       0.8 rows(s_1_88_t0.S$71)
             inlined  P =  IRI_ID"...type"  ,  O =  IRI_ID"...Observation"  ,  S = s_1_88_t5.S$68
            RDF_QUAD         1 rows(s_1_88_t1.O$75, s_1_88_t1.S$74)
             inlined  P =  IRI_ID"...hasFeatureOfInterest"  ,  S = s_1_88_t0.S$71
            RDF_QUAD         1 rows(s_1_88_t2.S$78)
             inlined  P =  IRI_ID"...resultTime"  ,  S = k_s_1_88_t0.S$490
            RDF_QUAD   7.1e-05 rows(s_1_88_t4.S$82, s_1_88_t4.O$81)
             inlined  P =  IRI_ID"...countObservations"  ,  S = k_s_1_88_t0.S$500
            RDF_QUAD         1 rows(s_1_88_t3.S$86, s_1_88_t3.O$85)
             inlined  P =  IRI_ID"...hasSimpleResult"  ,  S = k_s_1_88_t0.S$511
            RDF_QUAD   7.1e-05 rows(s_1_88_t10.S$90, s_1_88_t10.O$89)
             inlined  P =  IRI_ID"...n-5DayObs"  ,  S = k_s_1_88_t0.S$522
            RDF_QUAD         1 rows(s_1_88_t8.S$94, s_1_88_t8.O$93)
             inlined  P =  IRI_ID"...day"  ,  S = k_s_1_88_t0.S$533
            RDF_QUAD   7.1e-05 rows(s_1_88_t11.S$98, s_1_88_t11.O$97)
             inlined  P =  IRI_ID"...n-4DayObs"  ,  S = k_s_1_88_t0.S$544
            RDF_QUAD         1 rows(s_1_88_t7.S$102, s_1_88_t7.O$101)
             inlined  P =  IRI_ID"...month"  ,  S = k_s_1_88_t0.S$555
            RDF_QUAD   7.1e-05 rows(s_1_88_t12.O$106, s_1_88_t12.S$105)
             inlined  P =  IRI_ID"...n-3DayObs"  ,  S = k_s_1_88_t0.S$566
            RDF_QUAD   7.1e-05 rows(s_1_88_t14.O$110, s_1_88_t14.S$109)
             inlined  P =  IRI_ID"...n-1DayObs"  ,  S = k_s_1_88_t0.S$577
            RDF_QUAD   7.1e-05 rows(s_1_88_t13.S$114, s_1_88_t13.O$113)
             inlined  P =  IRI_ID"...n-2DayObs"  ,  S = k_s_1_88_t0.S$588
            RDF_QUAD         1 rows(s_1_88_t37.S$118, s_1_88_t37.O$117)
             inlined  P =  IRI_ID"...resultTime"  ,  S = k_s_1_88_t9.O$599
            RDF_QUAD         1 rows(s_1_88_t33.S$122, s_1_88_t33.O$121)
             inlined  P =  IRI_ID"...resultTime"  ,  S = k_s_1_88_t10.O$610
            RDF_QUAD         1 rows(s_1_88_t29.S$126, s_1_88_t29.O$125)
             inlined  P =  IRI_ID"...resultTime"  ,  S = k_s_1_88_t11.O$621
            RDF_QUAD         1 rows(s_1_88_t25.S$130, s_1_88_t25.O$129)
             inlined  P =  IRI_ID"...resultTime"  ,  S = k_s_1_88_t12.O$632
            RDF_QUAD         1 rows(s_1_88_t21.S$134, s_1_88_t21.O$133)
             inlined  P =  IRI_ID"...resultTime"  ,  S = k_s_1_88_t13.O$643
            RDF_QUAD         1 rows(s_1_88_t17.S$138, s_1_88_t17.O$137)
             inlined  P =  IRI_ID"...resultTime"  ,  S = k_s_1_88_t14.O$654
            RDF_QUAD         1 rows(s_1_88_t35.S$142, s_1_88_t35.O$141)
             inlined  P =  IRI_ID"...hasSimpleResult"  ,  S = k_s_1_88_t10.O$665
            RDF_QUAD         1 rows(s_1_88_t31.S$146, s_1_88_t31.O$145)
             inlined  P =  IRI_ID"...hasSimpleResult"  ,  S = k_s_1_88_t11.O$676
            RDF_QUAD         1 rows(s_1_88_t27.S$150, s_1_88_t27.O$149)
             inlined  P =  IRI_ID"...hasSimpleResult"  ,  S = k_s_1_88_t12.O$687
            RDF_QUAD         1 rows(s_1_88_t23.S$154, s_1_88_t23.O$153)
             inlined  P =  IRI_ID"...hasSimpleResult"  ,  S = k_s_1_88_t13.O$698
            RDF_QUAD         1 rows(s_1_88_t19.S$158, s_1_88_t19.O$157)
             inlined  P =  IRI_ID"...hasSimpleResult"  ,  S = k_s_1_88_t14.O$709
            RDF_QUAD         1 rows(s_1_88_t34.S$162, s_1_88_t34.O$161)
             inlined  P =  IRI_ID"...observedProperty"  ,  S = k_s_1_88_t10.O$720
            RDF_QUAD         1 rows(s_1_88_t30.S$166, s_1_88_t30.O$165)
             inlined  P =  IRI_ID"...observedProperty"  ,  S = k_s_1_88_t11.O$731
            RDF_QUAD         1 rows(s_1_88_t26.S$170, s_1_88_t26.O$169)
             inlined  P =  IRI_ID"...observedProperty"  ,  S = k_s_1_88_t12.O$742
            RDF_QUAD         1 rows(s_1_88_t22.S$174, s_1_88_t22.O$173)
             inlined  P =  IRI_ID"...observedProperty"  ,  S = k_s_1_88_t13.O$753
            RDF_QUAD         1 rows(s_1_88_t18.S$178, s_1_88_t18.O$177)
             inlined  P =  IRI_ID"...observedProperty"  ,  S = k_s_1_88_t14.O$764
            RDF_QUAD_POGS   7.1e-05 rows()
             inlined  P =  IRI_ID"...BProjectID"  ,  O = k_DB.DBA.RDF_MAKE_OBJ_OF_TYPEDSQLVAL$775 ,  S = k_s_1_88_t1.O$779
            RDF_QUAD       0.8 rows()
             inlined  P =  IRI_ID"...type"  ,  S = k_s_1_88_t10.O$786 ,  O =  IRI_ID"...Observation" 
            RDF_QUAD       0.8 rows()
             inlined  P =  IRI_ID"...type"  ,  S = k_s_1_88_t11.O$795 ,  O =  IRI_ID"...Observation" 
            RDF_QUAD       0.8 rows()
             inlined  P =  IRI_ID"...type"  ,  S = k_s_1_88_t12.O$804 ,  O =  IRI_ID"...Observation" 
            RDF_QUAD       0.8 rows()
             inlined  P =  IRI_ID"...type"  ,  S = k_s_1_88_t13.O$813 ,  O =  IRI_ID"...Observation" 
            RDF_QUAD       0.8 rows()
             inlined  P =  IRI_ID"...type"  ,  S = k_s_1_88_t14.O$822 ,  O =  IRI_ID"...Observation" 
            RDF_QUAD         1 rows(s_1_88_t38.O$193)
             inlined  P =  IRI_ID"...observedProperty"  ,  S = k_s_1_88_t9.O$831
            RDF_QUAD         1 rows(s_1_88_t6.O$196)
             inlined  P =  IRI_ID"...year"  ,  S = k_s_1_88_t0.S$841
            Precode:
              0: __ro2sq$198 := Call __ro2sq (s_1_88_t8.O$93)
              5: __ro2sq$200 := Call __ro2sq (s_1_88_t7.O$101)
              10: __ro2sq$202 := Call __ro2sq (s_1_88_t4.O$81)
              15: __ro2sq$204 := Call __ro2sq (s_1_88_t3.O$85)
              20: __ro2sq$206 := Call __ro2sq (s_1_88_t1.O$75)
              25: __id2in$208 := Call __id2in (s_1_88_t0.S$71)
              30: __ro2sq$210 := Call __ro2sq (s_1_88_t18.O$177)
              35: __ro2sq$212 := Call __ro2sq (s_1_88_t19.O$157)
              40: __ro2sq$214 := Call __ro2sq (s_1_88_t17.O$137)
              45: __ro2sq$216 := Call __ro2sq (s_1_88_t22.O$173)
              50: __ro2sq$218 := Call __ro2sq (s_1_88_t23.O$153)
              55: __ro2sq$220 := Call __ro2sq (s_1_88_t21.O$133)
              60: __ro2sq$222 := Call __ro2sq (s_1_88_t26.O$169)
              65: __ro2sq$224 := Call __ro2sq (s_1_88_t27.O$149)
              70: __ro2sq$226 := Call __ro2sq (s_1_88_t25.O$129)
              75: __ro2sq$228 := Call __ro2sq (s_1_88_t30.O$165)
              80: __ro2sq$230 := Call __ro2sq (s_1_88_t31.O$145)
              85: __ro2sq$232 := Call __ro2sq (s_1_88_t29.O$125)
              90: __ro2sq$234 := Call __ro2sq (s_1_88_t34.O$161)
              95: __ro2sq$236 := Call __ro2sq (s_1_88_t35.O$141)
              100: __ro2sq$238 := Call __ro2sq (s_1_88_t33.O$121)
              105: __ro2sq$240 := Call __ro2sq (s_1_88_t38.O$193)
              110: __ro2sq$242 := Call __ro2sq (s_1_88_t39.O$57)
              115: __ro2sq$244 := Call __ro2sq (s_1_88_t37.O$117)
              120: __ro2sq$246 := Call __ro2sq (s_1_88_t6.O$196)
              125: BReturn 0
            RDF_QUAD   4.4e+04 rows(s_1_88_t43.S$249)
             inlined  P =  IRI_ID"...hasSimpleResult" 
            RDF_QUAD         1 rows(s_1_88_t41.S$252)
             inlined  P =  IRI_ID"...resultTime"  ,  S = s_1_88_t43.S$249
            RDF_QUAD         1 rows(s_1_88_t42.S$255)
             inlined  P =  IRI_ID"...observedProperty"  ,  S = s_1_88_t41.S$252
            RDF_QUAD       0.8 rows()
             inlined  P =  IRI_ID"...type"  ,  S = k_s_1_88_t41.S$869 ,  O =  IRI_ID"...Observation" 
            Sort (s_1_88_t8.O$93, s_1_88_t7.O$101, s_1_88_t6.O$196, s_1_88_t4.O$81, s_1_88_t3.O$85, s_1_88_t1.O$75, s_1_88_t0.S$71) -> (gb_tmp$262 , gb_tmp$265 , gb_tmp$268 , gb_tmp$271 , gb_tmp$274 , gb_tmp$277 , gb_tmp$280 , gb_tmp$283 , gb_tmp$286 , gb_tmp$289 , gb_tmp$292 , gb_tmp$295 , gb_tmp$298 , gb_tmp$301 , gb_tmp$304 , gb_tmp$307 , gb_tmp$310 , gb_tmp$313 , __ro2sq$198, __ro2sq$200, __ro2sq$246, __ro2sq$202, __ro2sq$204, __ro2sq$206, __id2in$208)
            user aggregate init
            0: ua_ret$263 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$262 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$263 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$262 , __ro2sq$244, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$266 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$265 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$266 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$265 , __ro2sq$242, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$269 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$268 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$269 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$268 , __ro2sq$240, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$272 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$271 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$272 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$271 , __ro2sq$238, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$275 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$274 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$275 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$274 , __ro2sq$236, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$278 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$277 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$278 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$277 , __ro2sq$234, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$281 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$280 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$281 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$280 , __ro2sq$232, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$284 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$283 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$284 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$283 , __ro2sq$230, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$287 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$286 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$287 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$286 , __ro2sq$228, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$290 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$289 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$290 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$289 , __ro2sq$226, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$293 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$292 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$293 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$292 , __ro2sq$224, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$296 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$295 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$296 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$295 , __ro2sq$222, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$299 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$298 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$299 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$298 , __ro2sq$220, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$302 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$301 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$302 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$301 , __ro2sq$218, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$305 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$304 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$305 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$304 , __ro2sq$216, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$308 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$307 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$308 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$307 , __ro2sq$214, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$311 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$310 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$311 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$310 , __ro2sq$212, <c ,>)
            7: BReturn 0

            user aggregate init
            0: ua_ret$314 := Call DB.DBA.GROUP_CONCAT_INIT (gb_tmp$313 )
            7: BReturn 0
             user aggregate accumulator
            0: ua_ret$314 := Call DB.DBA.GROUP_CONCAT_ACC (gb_tmp$313 , __ro2sq$210, <c ,>)
            7: BReturn 0

          }
        group by read node  
        (s_1_88_t8.O$93, s_1_88_t7.O$101, s_1_88_t6.O$196, s_1_88_t4.O$81, s_1_88_t3.O$85, s_1_88_t1.O$75, s_1_88_t0.S$71, aggregate$261, aggregate$264, aggregate$267, aggregate$270, aggregate$273, aggregate$276, aggregate$279, aggregate$282, aggregate$285, aggregate$288, aggregate$291, aggregate$294, aggregate$297, aggregate$300, aggregate$303, aggregate$306, aggregate$309, aggregate$312, __ro2sq$198, __ro2sq$200, __ro2sq$246, __ro2sq$202, __ro2sq$204, __ro2sq$206, __id2in$208)

        After code:
        0: DB.DBA.GROUP_CONCAT_FIN$331 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$312)
        7: DB.DBA.GROUP_CONCAT_FIN$333 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$309)
        14: DB.DBA.GROUP_CONCAT_FIN$335 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$306)
        21: DB.DBA.GROUP_CONCAT_FIN$337 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$303)
        28: DB.DBA.GROUP_CONCAT_FIN$339 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$300)
        35: DB.DBA.GROUP_CONCAT_FIN$341 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$297)
        42: DB.DBA.GROUP_CONCAT_FIN$343 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$294)
        49: DB.DBA.GROUP_CONCAT_FIN$345 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$291)
        56: DB.DBA.GROUP_CONCAT_FIN$347 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$288)
        63: DB.DBA.GROUP_CONCAT_FIN$349 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$285)
        70: DB.DBA.GROUP_CONCAT_FIN$351 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$282)
        77: DB.DBA.GROUP_CONCAT_FIN$353 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$279)
        84: DB.DBA.GROUP_CONCAT_FIN$355 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$276)
        91: DB.DBA.GROUP_CONCAT_FIN$357 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$273)
        98: DB.DBA.GROUP_CONCAT_FIN$359 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$270)
        105: DB.DBA.GROUP_CONCAT_FIN$361 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$267)
        112: DB.DBA.GROUP_CONCAT_FIN$363 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$264)
        119: DB.DBA.GROUP_CONCAT_FIN$365 := Call DB.DBA.GROUP_CONCAT_FIN (aggregate$261)
        126: bObs$30 :=  := artm __id2in$208
        130: bObsFoi$31 :=  := artm __ro2sq$206
        134: AvgF$32 :=  := artm __ro2sq$204
        138: CountObs$33 :=  := artm __ro2sq$202
        142: y$34 :=  := artm __ro2sq$246
        146: m$35 :=  := artm __ro2sq$200
        150: d$36 :=  := artm __ro2sq$198
        154: np_minus1$37 :=  := artm DB.DBA.GROUP_CONCAT_FIN$331
        158: nr_minus1$38 :=  := artm DB.DBA.GROUP_CONCAT_FIN$333
        162: nt_minus1$39 :=  := artm DB.DBA.GROUP_CONCAT_FIN$335
        166: np_minus2$40 :=  := artm DB.DBA.GROUP_CONCAT_FIN$337
        170: nr_minus2$41 :=  := artm DB.DBA.GROUP_CONCAT_FIN$339
        174: nt_minus2$42 :=  := artm DB.DBA.GROUP_CONCAT_FIN$341
        178: np_minus3$43 :=  := artm DB.DBA.GROUP_CONCAT_FIN$343
        182: nr_minus3$44 :=  := artm DB.DBA.GROUP_CONCAT_FIN$345
        186: nt_minus3$45 :=  := artm DB.DBA.GROUP_CONCAT_FIN$347
        190: np_minus4$46 :=  := artm DB.DBA.GROUP_CONCAT_FIN$349
        194: nr_minus4$47 :=  := artm DB.DBA.GROUP_CONCAT_FIN$351
        198: nt_minus4$48 :=  := artm DB.DBA.GROUP_CONCAT_FIN$353
        202: np_minus5$49 :=  := artm DB.DBA.GROUP_CONCAT_FIN$355
        206: nr_minus5$50 :=  := artm DB.DBA.GROUP_CONCAT_FIN$357
        210: nt_minus5$51 :=  := artm DB.DBA.GROUP_CONCAT_FIN$359
        214: np_minus6$52 :=  := artm DB.DBA.GROUP_CONCAT_FIN$361
        218: nr_minus6$53 :=  := artm DB.DBA.GROUP_CONCAT_FIN$363
        222: nt_minus6$54 :=  := artm DB.DBA.GROUP_CONCAT_FIN$365
        226: BReturn 0
        Subquery Select(bObs$30, bObsFoi$31, AvgF$32, CountObs$33, y$34, m$35, d$36, np_minus1$37, nr_minus1$38, nt_minus1$39, np_minus2$40, nr_minus2$41, nt_minus2$42, np_minus3$43, nr_minus3$44, nt_minus3$45, np_minus4$46, nr_minus4$47, nt_minus4$48, np_minus5$49, nr_minus5$50, nt_minus5$51, np_minus6$52, nr_minus6$53, nt_minus6$54)
      }

    After code:
    0: bObs$374 := Call __ro2sq (bObs$30)
    5: bObsFoi$376 := Call __ro2sq (bObsFoi$31)
    10: AvgF$378 := Call __ro2sq (AvgF$32)
    15: CountObs$380 := Call __ro2sq (CountObs$33)
    20: y$382 := Call __ro2sq (y$34)
    25: m$384 := Call __ro2sq (m$35)
    30: d$386 := Call __ro2sq (d$36)
    35: np_minus1$388 := Call __ro2sq (np_minus1$37)
    40: nr_minus1$390 := Call __ro2sq (nr_minus1$38)
    45: nt_minus1$392 := Call __ro2sq (nt_minus1$39)
    50: np_minus2$394 := Call __ro2sq (np_minus2$40)
    55: nr_minus2$396 := Call __ro2sq (nr_minus2$41)
    60: nt_minus2$398 := Call __ro2sq (nt_minus2$42)
    65: np_minus3$400 := Call __ro2sq (np_minus3$43)
    70: nr_minus3$402 := Call __ro2sq (nr_minus3$44)
    75: nt_minus3$404 := Call __ro2sq (nt_minus3$45)
    80: np_minus4$406 := Call __ro2sq (np_minus4$46)
    85: nr_minus4$408 := Call __ro2sq (nr_minus4$47)
    90: nt_minus4$410 := Call __ro2sq (nt_minus4$48)
    95: np_minus5$412 := Call __ro2sq (np_minus5$49)
    100: nr_minus5$414 := Call __ro2sq (nr_minus5$50)
    105: nt_minus5$416 := Call __ro2sq (nt_minus5$51)
    110: np_minus6$418 := Call __ro2sq (np_minus6$52)
    115: nr_minus6$420 := Call __ro2sq (nr_minus6$53)
    120: nt_minus6$422 := Call __ro2sq (nt_minus6$54)
    125: BReturn 0
    Select (bObs$374, bObsFoi$376, AvgF$378, CountObs$380, y$382, m$384, d$386, np_minus1$388, nr_minus1$390, nt_minus1$392, np_minus2$394, nr_minus2$396, nt_minus2$398, np_minus3$400, nr_minus3$402, nt_minus3$404, np_minus4$406, nr_minus4$408, nt_minus4$410, np_minus5$412, nr_minus5$414, nt_minus5$416, np_minus6$418, nr_minus6$420, nt_minus6$422)
  }

331 Rows. -- 582 msec.
HughWilliams commented 2 years ago

We have downloaded and loaded your data into graph <urn:git1045:data> in our public https://geosparql.demo.openlinksw.com/sparql instance, and can run the test query you provided directly against the SPARQL endpoint.

So we don't need the python code, as we can see the problems directly via the SPARQL endpoint, and are looking into this ...