Open rburghol opened 3 weeks ago
TRy new method without raster calc which is unneccessary?
create temp table raster_best(tid serial, featureid integer, tstime bigint, tsendtime bigint, varid integer, rast raster);
insert into raster_best(featureid, tstime, tsendtime, varid, rast)
select f.hydroid, w.tstime, w.tsendtime, w.varid,
ST_Resample(st_clip(w.rast, fgeo.dh_geofield_geom), tpr.rast) as rast
from dh_feature as f
left outer join field_data_dh_geofield as fgeo
on (
fgeo.entity_id = f.hydroid
and fgeo.entity_type = 'dh_feature'
)
left outer join week_ratings as r
on (
yr = 1984
)
left outer join dh_variabledefinition as v
on (
v.varkey = r.best_varkey
)
left outer join dh_timeseries_weather as w
on (
extract( year from to_timestamp(w.tsendtime) ) = r.yr
and extract( month from to_timestamp(w.tsendtime) ) = r.mo
and extract( week from to_timestamp(w.tsendtime) ) = r.wk
and v.hydroid = w.varid
)
left outer join raster_templates as tpr
on (1=1)
where f.hydrocode = 'usgs_ws_01665500'
and w.tid is not null;
PGHOST=dbase2 PGPORT=5432 PGUSER=rob gdal_translate -of GTiff "PG:dbname=drupal.dh03 table=raster_best_out where='tid = 1 ' " outfile.tiff
@ilonah22 @mwdunlap2004
Rscript.exe
and passing arguments and reading arguments inside the command prompt. See ex: https://github.com/HARPgroup/meta_model/blob/main/scripts/river/hsp_hydr_analysis.RTrying to track down the issue with NLDAS not generating precip in the command:
/opt/model/meta_model/run_model raster_met PRISM usgs_ws_01660400 auto geo download
The error keeps saying that all pixels in the area have the NoData value. NOTICE: All pixels of band have the NODATA value
. But, this does not appear to be the case. If we look at a specific point within the watershed on a day in which we know there is rain in Aquia Creek for usgs_ws_01660400, we can see that NLDAS should have data for this watershed based on the full NLDAS rasters for that day. However, when we clip the raster to just the gage, the clipped raster no longer has data:
featureid | obs_date | mo | fullrastpoint | cliprastpoint |
---|---|---|---|---|
617850 | 2023-01-16 23:00:00-05 | 1 | 0 | |
617850 | 2023-01-17 00:00:00-05 | 1 | 0 | |
617850 | 2023-01-17 01:00:00-05 | 1 | 0 | |
617850 | 2023-01-17 02:00:00-05 | 1 | 0 | |
617850 | 2023-01-17 03:00:00-05 | 1 | 0 | |
617850 | 2023-01-17 04:00:00-05 | 1 | 0.0222 | |
617850 | 2023-01-17 05:00:00-05 | 1 | 0.0428 | |
617850 | 2023-01-17 06:00:00-05 | 1 | 0.1352 |
This is NOT the case for Strausberg. At hydrocode usgs_ws_01634000, we can see that a specific point in the watershed shows equal rainfall in both the full and clipped rasters. This makes me think something has gone wrong with our ST_clip command, but I haven't tracked it down yet
featureid | obs_date | mo | fullrastpoint | cliprastpoint |
---|---|---|---|---|
617850 | 2023-01-16 23:00:00-05 | 1 | 0 | 0 |
617850 | 2023-01-17 00:00:00-05 | 1 | 0 | 0 |
617850 | 2023-01-17 01:00:00-05 | 1 | 0.0002 | 0.0002 |
617850 | 2023-01-17 02:00:00-05 | 1 | 0.2112 | 0.2112 |
617850 | 2023-01-17 03:00:00-05 | 1 | 0.3976 | 0.3976 |
617850 | 2023-01-17 04:00:00-05 | 1 | 0.4244 | 0.4244 |
617850 | 2023-01-17 05:00:00-05 | 1 | 0.212 | 0.212 |
617850 | 2023-01-17 06:00:00-05 | 1 | 0.2228 | 0.2228 |
It's clear that our clip at the very least generates a raster of the shape that we'd generally expect:
POLYGON((-77.6255 38.6255,-77.3755 38.6255,-77.3755 38.3755,-77.6255 38.3755,-77.6255 38.6255))
With some more research, my guess is that ST_clip
doesn't grab partial cells from the raster - that is, it only grabs cells that are contained in the file or at least that the centroid is in. See here. My gage is in between several cells. The below image shows only the extent of the clipped raster from above on January 17th and shows that our site doesn't contain any full pixels:
If I use ST_intersection
I start to see values at our site. I imagine I would get similar results with ST_DumpAsPolygons
but I'm guessing that will be computationally slow. ST_intersection
is already pretty slow based on my testing:
featureid | obs_date | mo | clipval |
---|---|---|---|
617850 | 2023-01-16 23:00:00-05 | 1 | 0 |
617850 | 2023-01-17 00:00:00-05 | 1 | 0 |
617850 | 2023-01-17 01:00:00-05 | 1 | 0 |
617850 | 2023-01-17 02:00:00-05 | 1 | 0 |
617850 | 2023-01-17 03:00:00-05 | 1 | 0 |
617850 | 2023-01-17 04:00:00-05 | 1 | 0.1597999930381775 |
617850 | 2023-01-17 04:00:00-05 | 1 | 0.04280000180006027 |
617850 | 2023-01-17 04:00:00-05 | 1 | 0.07980000227689743 |
617850 | 2023-01-17 04:00:00-05 | 1 | 0.022199999541044235 |
617850 | 2023-01-17 05:00:00-05 | 1 | 0.20419999957084656 |
617850 | 2023-01-17 05:00:00-05 | 1 | 0.06920000165700912 |
617850 | 2023-01-17 05:00:00-05 | 1 | 0.09019999951124191 |
617850 | 2023-01-17 05:00:00-05 | 1 | 0.04280000180006027 |
617850 | 2023-01-17 06:00:00-05 | 1 | 0.42640000581741333 |
@Cobrogan the docs on st_clip, to my read (I could be wrong) suggest that intersection is implicit in st_clip: https://postgis.net/docs/RT_ST_Clip.html
Either way, we can control that behavior in how we call st_clip, though still, the st_clip vs st_intersects could be an important distinction, but I am troubled by the image you showed.
Since the Culpepper gage contains the Ruckersville gage, I would expect there to be many more cells overlapping than what you're showing -- I might just be getting confused by the image blocks tho.
That said, at the every least I would expect the Culpepper gage to be less likely than the Ruckersville to have no data problems. I suspect maybe I goofed something up in the query during some recent tidy sessions.
Which makes me wonder: what do the daymet
and prism
containments look like?
@rburghol I'll draft up some figures showing daymet
and prism
containments. I've tried messing around with ST_clip
, but setting crop
to TRUE
or FALSE
yield the same results. My understanding is it just affects the extent of the raster that is returned, with "NoData" value being used to populate cells outside of the intersection. ST_clip
is the more efficient version of ST_intersection
, but I still haven't found how to get the bordering cells to be included.
Interestingly, resampling the raster to the PRISM resolution also yielded no results. So, these figures showing the containments should help identify if something is going wrong in ST_clip or if we are facing a resolution problem:
OK - resampling is a good idea, but I am still unclear why this is happening at Culpepper but not Ruckersville. Wonder if something is wrong with the geometry? Like self-intersection or something? The intersection algorithm is very inclusive, it literally just says there is some overlap.
Oh, one question: does "fullrasterpoint" actually just mean a single raster cell value intersecting the point you specified?
Plus @COBrogan I really think it is impossible that Culpepper contains no full pixels -- it's a drainage area of 468sqmi, it has to contain some 15x15km cells... maybe our projection is off, below is USGS map:
Well that's one mystery solved. I've been looking at 01660400, which is Aquia Creek Near Garrisonville. And has a DA of only 35 sq mi. See here for its NWIS page. So it is NOT the Culpepper gage that is giving us trouble, but something much smaller. My bad on the Wild Goose chase there.
As for the potential conflicts in ST_clip for smaller watersheds, see below. Dashed blue is daymet, red is PRISM, and solid black is NLDAS. We can see that there are no NLDAS cells fully contained in the watershed so ST_clip() will not return values for NLDAS.
OK cool, thanks for sorting that out!! @COBrogan
The thing is though, fully contained is not the way it is supposed to be happening. Intersect literally just means they're boundaries cross.
But, re-sampling to the smallest grid we have should do the trick.
Still troubling, however, is that @ilonah22 reported getting no data values for Fredericksburg…
I agree with your assertion, @rburghol . Maybe something else IS going on. First, while I've seen other users on SO with the same problem, I've yet to see official documentation of ST_clip() that implies this is the intended behavior. Second, the resampling failed to bring results. And, switching to daymet below, my query is returns logical results which indicates that my testing method works:
featureid | obs_date | mo | fullrastpoint | cliprastpoint |
---|---|---|---|---|
617850 | 2023-01-16 07:00:00-05 | 1 | 1.0297999382019043 | 1.0297999382019043 |
@rburghol ......I was using an incorrect set of coordinates. Resampling the NLDAS raster to PRISM results in me getting the same results at a given point. WIthout resampling, I get no data.
featureid | obs_date | mo | fullrastpoint | resamplecliprast | noresamplecliprast |
---|---|---|---|---|---|
617850 | 2023-01-16 23:00:00-05 | 1 | 0 | 0 | |
617850 | 2023-01-17 00:00:00-05 | 1 | 0 | 0 | |
617850 | 2023-01-17 01:00:00-05 | 1 | 0 | 0 | |
617850 | 2023-01-17 02:00:00-05 | 1 | 0 | 0 | |
617850 | 2023-01-17 03:00:00-05 | 1 | 0 | 0 | |
617850 | 2023-01-17 04:00:00-05 | 1 | 0.0222 | 0.0222 | |
617850 | 2023-01-17 05:00:00-05 | 1 | 0.0428 | 0.0428 | |
617850 | 2023-01-17 06:00:00-05 | 1 | 0.1352 | 0.1352 |
Taking this a step further, we can successfully get mean precipitation from the resampled raster:
featureid | obs_date | mo | precip_kgm3 |
---|---|---|---|
617850 | 2023-01-17 00:00:00-05 | 1 | 0 |
617850 | 2023-01-17 01:00:00-05 | 1 | 0 |
617850 | 2023-01-17 02:00:00-05 | 1 | 0 |
617850 | 2023-01-17 03:00:00-05 | 1 | 0 |
617850 | 2023-01-17 04:00:00-05 | 1 | 0.07275999999999999 |
617850 | 2023-01-17 05:00:00-05 | 1 | 0.09404 |
617850 | 2023-01-17 06:00:00-05 | 1 | 0.22464 |
617850 | 2023-01-17 07:00:00-05 | 1 | 0.4828799999999999 |
617850 | 2023-01-17 08:00:00-05 | 1 | 0.6930399999999999 |
Looking at a second watershed that has previously given us trouble brings us to usgs_ws_01669000 PISCATAWAY CREEK NEAR TAPPAHANNOCK, VA:
Once again, our typical NLDAS summary method using ST_clip
returns no data. But, it does as soon as we resample it to the PRISM resolution:
featureid | obs_date | mo | resamplemeanprecip | noresamplemeanprecip |
---|---|---|---|---|
617850 | 2023-01-17 05:00:00-05 | 1 | 0 | |
617850 | 2023-01-17 06:00:00-05 | 1 | 0.012 | |
617850 | 2023-01-17 07:00:00-05 | 1 | 0 | |
617850 | 2023-01-17 08:00:00-05 | 1 | 0.1956 | |
617850 | 2023-01-17 09:00:00-05 | 1 | 0.03 | |
617850 | 2023-01-17 10:00:00-05 | 1 | 0.884 | |
617850 | 2023-01-17 11:00:00-05 | 1 | 0.559 | |
617850 | 2023-01-17 12:00:00-05 | 1 | 0.0688 | |
617850 | 2023-01-17 13:00:00-05 | 1 | 0.1256 |
Awesome @COBrogan -- that looks really encouraging!! Also we have to resample, anyway, so this is doubly great.
I am curious, though, could you run your NLDS to query again and include a SummaryStatsAgg(st_clip(...).count
function? I want to see if it is returning 0 points, meaning the intersection doesn't work as I think it should, or if it is the combination of null and non-null values in an intersection that is resolving as null.
Sure, rerunning the above query for Piscataway Creek using count, I get the same warning that NOTICE: All pixels of band have the NODATA value
. Looking at the results below, we see the count for NLDAS prior to resampling is 0. Looking at the above figure, the 4 count for PRISM also makes it seem like ST_Clip is only getting cells that have a center point that intersects with the boundary.
featureid | obs_date | mo | resamplemeanprecip | noresamplemeanprecip |
---|---|---|---|---|
617850 | 2023-01-15 23:00:00-05 | 1 | 4 | 0 |
617850 | 2023-01-16 00:00:00-05 | 1 | 4 | 0 |
617850 | 2023-01-16 01:00:00-05 | 1 | 4 | 0 |
617850 | 2023-01-16 02:00:00-05 | 1 | 4 | 0 |
One potential fix could be convert the rasters into polygons that represent the cell size and resolution. We would only need one copy of this polygon file for each data source. We then run all intersections/clips on the polygons and then use the extent of the polygon intersection as a mask with which to clip our rasters. I'm going to see if this gives us more logical results.
I ran into an error message this morning while trying to run access-file.R and was hoping somebody could tell me what's going wrong.
I get this error message when I try to run the line that pulls the data in from VA Hydro:
cannot open the connection to 'http://deq1.bse.vt.edu:81/files/met/usgs_ws_03176500-prism-all.csv'
In addition: Warning message:
In file(file, "rt") :
cannot open URL 'http://deq1.bse.vt.edu:81/files/met/usgs_ws_03176500-prism-all.csv': HTTP status was '404 Not Found'
I also tried daymet but got the same issue.
Hey @ilonah22 sorry to delay response. Last week we shifted the output files to a different directory, so we need to change the URL references to the following:
http://deq1.bse.vt.edu:81/met/[data source]/out/
http://deq1.bse.vt.edu:81/met/daymet/out/
http://deq1.bse.vt.edu:81/met/PRISM/out/
http://deq1.bse.vt.edu:81/met/nldas2/out/
See here for more details: https://github.com/HARPgroup/HARParchive/issues/1291
Overview
lm()
of $Q{week} = f(P{week,month})$lm()
created to compare weekly mean P with weekly mean QWork flow steps
Prep Variables
Quick Run
cd /backup/meteorology
/opt/model/meta_model/run_model raster_met PRISM usgs_ws_02031000 auto geo download
Run steps
/opt/model/meta_model/run_model raster_met PRISM usgs_ws_02031000 auto geo download
calc_raster_ts
calc_raster_ts [hydrocode] [varkey] [outfile] [dbhost] [dbname] [remove sql file=y/n]
/opt/model/model_meteorology/sh/calc_raster_ts usgs_ws_02031000 prism_mod_daily "/tmp/usgs_ws_02031000-prism-all.csv" dbase2 drupal.dh03
$NLDAS_ROOT/[datasource]/out/[hydrocode]-[datasource]-all.csv
$NLDAS_ROOT/[datasource]/out/[hydrocode]-[datasource]-all.csv
$NLDAS_ROOT/[datasource]/out/[hydrocode]-[datasource]-all.csv
$NLDAS_ROOT/[datasource]/out/[hydrocode]-[datasource]-weekly.csv
run_model raster_met PRISM "usgs_ws_02031000" auto geo analyze 01
sftp dbase2:/tmp/usgs_ws_02031000-prism-all.csv /var/www/html/files/met/usgs_ws_02031000-prism-all.csv
Generate Weekly Ratings
lm()
predictions versus USGS observedlm()
predictions versus USGS observedcalc_raster_ts
)Import Week Ratings
Create Temp Table
copy week_ratings from '/tmp/met_weekly_ratings_01665500.csv' with HEADER CSV;
alter table week_ratings add column best_varid integer; alter table week_ratings add column best_varkey varchar(64);
update week_ratings set best_varkey = CASE WHEN best_method = 'prism' THEN 'prism_mod_daily' WHEN best_method = 'daymet' THEN 'daymet_mod_daily' ELSE 'do_not_overwrite_nldas2_obs_hourly' END;
--use an existing raster as template for new raster -- adds a single band, with value 0.0 in each pixel using the band type as defined by the daymet raster dataset create table raster_templates(rid serial, rast raster); INSERT INTO raster_templates(rast) SELECT ST_AddBand(ST_MakeEmptyRaster(rast), 1, ST_BandPixelType(rast), 0.0) FROM dh_timeseries_weather WHERE featureid in ( select hydroid from dh_feature where hydrocode = 'cbp6_met_coverage' ) and varid in (select hydroid from dh_variabledefinition where varkey like 'daymet_mod_daily') and rast is not null limit 1;
select st_astext(st_envelope(rast)) from raster_templates; -- POLYGON((-83.66607521306385 44.095589132351506,-73.52855441054511 44.095589132351506,-73.52855441054511 33.95806832983277,-83.66607521306385 33.95806832983277,-83.66607521306385 44.095589132351506))
create temp table raster_best(tid serial, featureid integer, tstime bigint, tsendtime bigint, varid integer, rast raster); insert into raster_best(featureid, tstime, tsendtime, varid, rast) select f.hydroid, w.tstime, w.tsendtime, w.varid, st_mapalgebra(tpr.rast, 1, ST_Resample(st_clip(w.rast, fgeo.dh_geofield_geom), tpr.rast), 1, '[rast] + [rast2]') as rast from dh_feature as f left outer join field_data_dh_geofield as fgeo on ( fgeo.entity_id = f.hydroid and fgeo.entity_type = 'dh_feature' ) left outer join week_ratings as r on ( yr = 1984 ) left outer join dh_variabledefinition as v on ( v.varkey = r.best_varkey ) left outer join dh_timeseries_weather as w on ( extract( year from to_timestamp(w.tsendtime) ) = r.yr and extract( month from to_timestamp(w.tsendtime) ) = r.mo and extract( week from to_timestamp(w.tsendtime) ) = r.wk and v.hydroid = w.varid ) left outer join raster_templates as tpr on (1=1) where f.hydrocode = 'usgs_ws_01665500' and w.tid is not null;
PGHOST=dbase2 PGPORT=5432 PGUSER=rob gdal_translate -of GTiff "PG:dbname=drupal.dh03 table=raster_best_out where='tid = 1 ' " outfile.tiff
SET postgis.gdal_enabled_drivers = 'ENABLE_ALL'; select tid, lowrite(lo_open(tid, 131072), tiff) As num_bytes FROM ( VALUES (lo_create(0), ST_Astiff( (SELECT rast FROM raster_best where tid = 1)) ) ) As v(tid,tiff);
tid | num_bytes ---------+----------- 9155256 | 4002 (1 row)
\lo_export 9155256 '/tmp/myraster.tiff' SELECT lo_unlink(9155256 );