Open rburghol opened 2 months ago
@rburghol I think this workflow makes sense to me, but doesn't `st_intersection` only return partial cells along the boundary? It vectorizes the raster and then performs a true intersection, which makes me think it will, and that in turn may still prevent our mask from working along the boundary. Do we then need to include a buffer on that mask? The rest of the workflow seems solid to me.
Doing a little research into `st_intersection`, I stumbled upon this article. It was very informative and makes me think we should have leveraged tiles and a GiST index when importing our rasters. By the logic in the article, our PRISM imports could have been tiled to 16x16 to preserve 2048 bytes per tile and maintain the db page size. This may have sped up our tiling experiments. The exercise in the article is fairly similar to what we are trying to do on a spatial scale, but on a smaller temporal scale. I think this is worth looking into, and I will try to tackle it if I can free up some time in the next few days (unless the mask solution works, in which case it may not be worth it)!
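As a quick check on the tile-size figure, the Python sketch below works out the bytes per pixel implied by a 16x16 tile of 2048 bytes. The function names are hypothetical, and the arithmetic covers pixel data only; any per-tile header or band metadata is ignored.

```python
def implied_bytes_per_pixel(tile_width: int, tile_height: int, tile_bytes: int) -> float:
    """Bytes per pixel implied by a tile's dimensions and its pixel payload size."""
    return tile_bytes / (tile_width * tile_height)


def tile_payload_bytes(tile_width: int, tile_height: int, bytes_per_pixel: int) -> int:
    """Pixel payload of one tile, ignoring any header or metadata overhead."""
    return tile_width * tile_height * bytes_per_pixel


# 16x16 tile holding 2048 bytes of pixel data
print(implied_bytes_per_pixel(16, 16, 2048))  # 8.0

# Same pixel width with a larger, hypothetical 32x32 tile
print(tile_payload_bytes(32, 32, 8))  # 8192
```

Eight bytes per pixel corresponds to a 64-bit pixel type, so the 2048-byte figure only holds if the imported band is stored at that width.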
@COBrogan I see what you're saying and it makes sense. Perhaps we can use the cell height and width queries to determine the number of cells and then manually create a bounding box?
I've actually already explored tiles in issue #70. There was a meaningful performance improvement, though my recollection is vague right now. I believe we have about a year of NLDAS2 imported that we can use for testing. If you want to tinker with it, #70 should have some reasonable starting code.
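One way to read the "cell height and width" suggestion is to snap the coverage's bounding box outward to the raster grid, so the box includes every cell the coverage touches. The Python sketch below shows only that arithmetic and is not PostGIS code. The grid origin, the 0.125-degree cell size, the example extent, and the function name are all assumptions chosen for illustration.

```python
import math


def snap_bbox_to_grid(bbox, origin_x, origin_y, cell_w, cell_h):
    """Expand (xmin, ymin, xmax, ymax) outward to the nearest cell edges.

    origin_x, origin_y is the raster's upper-left corner; cell_w and cell_h
    are positive cell sizes. Returns the snapped bbox plus the number of
    columns and rows it spans.
    """
    xmin, ymin, xmax, ymax = bbox
    col_min = math.floor((xmin - origin_x) / cell_w)
    col_max = math.ceil((xmax - origin_x) / cell_w)
    row_min = math.floor((origin_y - ymax) / cell_h)  # rows count downward
    row_max = math.ceil((origin_y - ymin) / cell_h)
    snapped = (
        origin_x + col_min * cell_w,
        origin_y - row_max * cell_h,
        origin_x + col_max * cell_w,
        origin_y - row_min * cell_h,
    )
    return snapped, col_max - col_min, row_max - row_min


# Hypothetical coverage extent on a grid with upper-left corner (-83.0, 38.0)
snapped, ncols, nrows = snap_bbox_to_grid(
    (-82.70, 37.05, -82.60, 37.15), -83.0, 38.0, 0.125, 0.125
)
print(snapped, ncols, nrows)
```

Clipping with the snapped box rather than the coverage polygon would keep whole boundary cells, at the cost of also keeping corner cells that the coverage does not actually touch.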
Need to test methods for coverage raster summary:
- `st_clip`: returns only cells whose centroid is inside of coverage (40 years in ~21 minutes for large basin, less than 10 mins for small basin)
- `st_intersection`: returns cells that overlap coverage, but slow: 1.5x slower on single hour record, but approximately 15x slower on 40 year hourly record (5 hours for 40 year hourly record on large basin)
- `map_algebra`: can we isolate the cells of interest (with `st_intersection`), then use these as a mask to multiply?

Map Algebra optimization
@COBrogan I believe that I have come up with a potential strategy to exploit the speed of `st_clip` with the desired coverage of `st_intersection`.
- Create a raster "mask"
  - Create a raster with 1.0 value in each cell that overlaps the coverage
  - `WITH` or `USING` clause?
  - `st_intersection`, which is a collection of vectors
  - Bounding box (`st_envelope`)
  - `st_clip` with the bounding box, which should return the same raster cells as `st_intersection`, but in raster form.
  - `ST_AddBand(ST_MakeEmptyRaster(rast), 1, ST_BandPixelType(rast), 1.0)`
- Use `st_mapalgebra` to multiply the mask raster by the met data raster to extract points of interest

Using code from calc_raster_ts as baseline
- `st_clip()`: COPY 385728 -- Time: 1275777.679 ms (21:15.778)
- `st_intersection()`: approximate time 5 hours (done by comparing file timestamp on dbase2, since ssh session died during query but query completed)

Standard `st_clip` method
COPY 385728
Time: 1275777.679 ms (21:15.778)
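For reference, the slowdown implied by the two timings above can be computed directly. This is a small Python sketch; the 5-hour figure is the approximate value quoted above, so the resulting ratio is approximate as well.

```python
def format_ms(ms: float) -> str:
    """Format milliseconds as mm:ss.sss, the way the timing above is shown."""
    minutes, seconds = divmod(ms / 1000.0, 60)
    return f"{int(minutes):02d}:{seconds:06.3f}"


CLIP_MS = 1275777.679                  # reported st_clip timing
INTERSECTION_MS = 5 * 60 * 60 * 1000   # approximate st_intersection timing (5 hours)

slowdown = INTERSECTION_MS / CLIP_MS

print(format_ms(CLIP_MS))   # 21:15.778
print(round(slowdown, 1))   # 14.1
```

The computed ratio of roughly 14x is in line with the "approximately 15x" estimate for the full hourly record.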
Explore alternative calc_raster_ts with `st_intersection` instead of `st_clip`
Now some real values
select
  st_astext((ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom), 1, FALSE)).geom),
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom), 1, FALSE)).x,
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom), 1, FALSE)).y,
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom), 1, FALSE)).val
from dh_feature as f
left outer join field_data_dh_geofield as fgeo
  on (fgeo.entity_id = f.hydroid and fgeo.entity_type = 'dh_feature')
left outer join dh_variabledefinition as v
  on (v.varkey = :'varkey')
left outer join dh_feature as mcov
  on (mcov.hydrocode = 'cbp6_met_coverage')
left outer join dh_timeseries_weather as met
  on (mcov.hydroid = met.featureid and met.varid = v.hydroid)
where f.hydrocode = :'hydrocode'
  and tid = 110377526;
select
  st_astext((ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom), 1)).geom),
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom, FALSE), 1)).x,
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom, FALSE), 1)).y,
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom, FALSE), 1)).val
from dh_feature as f
left outer join field_data_dh_geofield as fgeo
  on (fgeo.entity_id = f.hydroid and fgeo.entity_type = 'dh_feature')
left outer join dh_variabledefinition as v
  on (v.varkey = :'varkey')
left outer join dh_feature as mcov
  on (mcov.hydrocode = 'cbp6_met_coverage')
left outer join dh_timeseries_weather as met
  on (mcov.hydroid = met.featureid and met.varid = v.hydroid)
where f.hydrocode = :'hydrocode'
  and tid = 110377526;
-------------------------------------------------------------------------------------------------+---+----+--------
 POLYGON((-82.7505 37.1255,-82.6255 37.1255,-82.6255 37.0005,-82.7505 37.0005,-82.7505 37.1255)) | 8 | 56 | 0.0032
(1 row)
select
  st_astext(st_centroid((ST_Intersection(met.rast, fgeo.dh_geofield_geom)).geom)),
  (ST_Intersection(met.rast, fgeo.dh_geofield_geom)).val
from dh_feature as f
left outer join field_data_dh_geofield as fgeo
  on (fgeo.entity_id = f.hydroid and fgeo.entity_type = 'dh_feature')
left outer join dh_variabledefinition as v
  on (v.varkey = :'varkey')
left outer join dh_feature as mcov
  on (mcov.hydrocode = 'cbp6_met_coverage')
left outer join dh_timeseries_weather as met
  on (mcov.hydroid = met.featureid and met.varid = v.hydroid)
where f.hydrocode = :'hydrocode'
  and tid = 110377526;
---------------------------------------------+-----------------------
 POINT(-82.65823887889381 37.13428596880967) | 0.00559999980032444
 POINT(-82.67664570483737 37.08666204286433) | 0.0031999999191612005
 POINT(-82.6173913631824 37.096629058928045) | 0
(3 rows)
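The two results above differ (one cell from `st_clip`, three pieces from `st_intersection`) because of the selection rule noted earlier: `st_clip` keeps cells whose centroid is inside the coverage, while `st_intersection` keeps every cell the coverage overlaps. The Python sketch below illustrates the two rules on a toy grid with a rectangular coverage. The grid, the rectangle, and all function names are hypothetical, and this is not how PostGIS implements either function.

```python
def cell_bounds(col, row, origin_x, origin_y, size):
    """(xmin, ymin, xmax, ymax) of a square cell; rows count down from origin_y."""
    xmin = origin_x + col * size
    ymax = origin_y - row * size
    return (xmin, ymax - size, xmin + size, ymax)


def centroid_inside(cell, cov):
    """Centroid rule: keep the cell only if its center lies inside the coverage."""
    cx = (cell[0] + cell[2]) / 2
    cy = (cell[1] + cell[3]) / 2
    return cov[0] <= cx <= cov[2] and cov[1] <= cy <= cov[3]


def overlaps(cell, cov):
    """Overlap rule: keep the cell if it shares any area with the coverage."""
    return cell[0] < cov[2] and cov[0] < cell[2] and cell[1] < cov[3] and cov[1] < cell[3]


def select_cells(ncols, nrows, origin_x, origin_y, size, cov, rule):
    """List (col, row) for every cell that passes the given rule."""
    return [
        (c, r)
        for r in range(nrows)
        for c in range(ncols)
        if rule(cell_bounds(c, r, origin_x, origin_y, size), cov)
    ]


coverage = (1.2, 1.3, 2.3, 2.2)  # hypothetical rectangular coverage
by_centroid = select_cells(4, 4, 0.0, 4.0, 1.0, coverage, centroid_inside)
by_overlap = select_cells(4, 4, 0.0, 4.0, 1.0, coverage, overlaps)

print(by_centroid)  # [(1, 2)]
print(by_overlap)   # [(1, 1), (2, 1), (1, 2), (2, 2)]
```

For a coverage that is small relative to the cell size, the centroid rule can drop most of the cells that contribute to the area, which is the motivation for the mask approach.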
select
  f.hydrocode,
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom, 9999, FALSE), 1)).x,
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom, 9999, FALSE), 1)).y,
  (ST_PixelAsPolygons(st_clip(met.rast, fgeo.dh_geofield_geom, 9999, FALSE), 1)).val
from dh_feature as f
left outer join field_data_dh_geofield as fgeo
  on (fgeo.entity_id = f.hydroid and fgeo.entity_type = 'dh_feature')
left outer join raster_templates as met
  on (met.varkey = :'varkey')
where f.hydrocode = :'hydrocode';
create temp table rastest as
SELECT
  'nldas2_obs_hourly' as varkey,
  tid,
  to_timestamp(tsendtime),
  tsendtime,
  ST_AddBand(ST_MakeEmptyRaster(rast), 1, ST_BandPixelType(rast), 0.0) as rast
FROM dh_timeseries_weather
WHERE featureid in (
    select hydroid from dh_feature where hydrocode = 'cbp6_met_coverage'
  )
  and varid in (
    select hydroid from dh_variabledefinition where varkey like 'nldas2_obs_hourly'
  )
  and rast is not null
limit 1;
select
  met.varkey,
  st_width(st_clip(met.rast, fgeo.dh_geofield_geom)),
  (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), :band, TRUE)).count as cells
from dh_feature as f
left outer join field_data_dh_geofield as fgeo
  on (fgeo.entity_id = f.hydroid and fgeo.entity_type = 'dh_feature')
left outer join rastest as met
  on (met.varkey = :'varkey')
where f.hydrocode = :'hydrocode';
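To make the mask idea from earlier in the thread concrete: multiplying a 1.0-valued mask by the met raster keeps values only in the cells of interest, which is the per-cell operation the `st_mapalgebra` step is meant to perform. Below is a minimal Python sketch that uses nested lists in place of rasters and `None` in place of NODATA. The values and helper names are made up for illustration.

```python
def apply_mask(values, mask):
    """Multiply each cell by its mask value; cells where the mask is None stay None."""
    return [
        [None if m is None else v * m for v, m in zip(value_row, mask_row)]
        for value_row, mask_row in zip(values, mask)
    ]


def masked_mean(masked):
    """Mean over cells that survived the mask, or None if no cell did."""
    cells = [v for row in masked for v in row if v is not None]
    return sum(cells) / len(cells) if cells else None


met = [[2.0, 4.0],
       [6.0, 8.0]]        # hypothetical met raster values
mask = [[1.0, 1.0],
        [1.0, None]]      # 1.0 where the cell overlaps the coverage

masked = apply_mask(met, mask)
print(masked)               # [[2.0, 4.0], [6.0, None]]
print(masked_mean(masked))  # 4.0
```

The mask only needs to be built once per coverage, so the expensive overlap test is not repeated for every time step.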