HARPgroup / model_meteorology


Do resample mode #87

Open rburghol opened 1 week ago

rburghol commented 1 week ago

Testing

- Try the tiled workflow for N51660 (NF Shen with NA at NLDAS2 resolution)
   - time ~ 2 hours
   - url: http://deq1.bse.vt.edu:81/met/nldas2_resamptile/precip/N51660-nldas2-all.csv
   - file: /media/model/met/nldas2_resamptile/precip/N51660-nldas2-all.csv

i=N51660; met_scenario="nldas2_resamptile"; sbatch /opt/model/meta_model/run_model raster_met $met_scenario $i auto wdm



**Image 1:** Compare daily precip from nldas2 from original CBP method (`met2date` scenario) with `nldas2_resamptile`.

![image](https://github.com/user-attachments/assets/e61b4a34-e2ef-4d9b-8451-04cf47194876)

**Image 2:** Compare daily precip from the original CBP method with the `ST_clip` method. Note: this land segment is N51101, which differs from the above, so this is not a direct comparison of the three methods. TBD: repeat for N51660.
![image](https://github.com/user-attachments/assets/d6dad9b2-3498-42f9-93ff-f76a881f5df8)
rburghol commented 1 week ago

@COBrogan this is definitely an area that needs debugging. The two alternatives I tested yesterday are below, but neither works. The tiled one gives no data/no CSV (2 GB of error messages in the log!) and the non-tiled one yields a CSV with null values in every timestep.

I could use some help with this. Maybe a good approach would be to set the date range to be very narrow in the .con file so we can debug more quickly. Who knows, maybe there's just some flaw in my query that I'm not seeing. The queries are generated in the script linked at the top of this issue, which was just based on your calc_raster_ts.
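The two failure modes above (no CSV at all vs. a CSV of all-null values) can be told apart quickly from the shell. This is just a sketch: the function name is made up, the path is the one from the top of this issue, and it assumes `precip_mm` is the 9th CSV column, matching the export query's column order.

```shell
# Sketch: classify a generated met CSV as missing, all-null, or ok.
# Assumes precip_mm is column 9, per the copy query's column order.
check_met_csv() {
  csv="$1"
  if [ ! -s "$csv" ]; then
    echo "missing"
    return 1
  fi
  # count data rows (skip the header)
  rows=$(tail -n +2 "$csv" | wc -l)
  # count rows where the precip_mm field is empty
  nulls=$(tail -n +2 "$csv" | awk -F, '$9 == "" { n++ } END { print n+0 }')
  if [ "$rows" -gt 0 ] && [ "$rows" -eq "$nulls" ]; then
    echo "all-null"
  else
    echo "ok rows=$rows nulls=$nulls"
  fi
}

# Example (path from this issue); harmless if the file is absent:
check_met_csv /media/model/met/nldas2_resamptile/precip/N51660-nldas2-all.csv || true
```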

rburghol commented 1 week ago

More debugging @COBrogan -- brought the time down to 2 hours with the tiled dataset (I had forgotten to filter out tiles that did NOT overlap). Added in the `&&` condition, though it did not really improve performance a ton.

cd /opt/model/p6/vadeq
. hspf_config
# set needed environment vars
MODEL_ROOT=/backup/meteorology/
SCRIPT_DIR=/opt/model/model_meteorology/sh
export MODEL_ROOT SCRIPT_DIR
\set band '1'
\set ftype 'cbp6_landseg'
\set varkey 'nldas2_precip_hourly_tiled'
\set resample_varkey 'daymet_mod_daily'
\set hydrocode 'N51660'
\set fname '/tmp/N51660-nldas2-all.csv'
\timing ON
rburghol commented 1 week ago

@COBrogan I'm going to kick off a large batch of WDM creation with this resample technique today, unless that is going to consume too many resources and get in your way. If you check out the images in the body of this issue, you can see that resampling appeared to have more differences on certain individual days than the difference between clipping and the CBP overlap with NLDAS2 (note: those images are for different land segments, so it's not an apples-to-apples comparison, but it will be shortly!). Super curious to see what that does to performance.

I will be watching this issue, so if you feel like you need the CPU cycles, let me know and I will cancel the batch, or you can do so at your convenience (in case you've never used it, `scancel` is the command to cancel slurm jobs).

rburghol commented 1 week ago

Test with entire Rapidan River:

Image 1: N51177 coverage overlap with nldas2 (boxes) and prism (noaa)

Debugging

cp /tmp/N51177_1725800454_29331/N51177-nldas2-all.csv.sql ./
# change date range to short period manually with nano N51177-nldas2-all.csv.sql
cat N51177-nldas2-all.csv.sql | psql -h dbase2 drupal.dh03
\set band '1'
\set ftype 'cbp6_landseg'
\set varkey 'nldas2_precip_hourly_tiled'
\set resample_varkey 'daymet_mod_daily'
\set hydrocode 'N51177'
\set fname '/tmp/N51177-nldas2-all.csv'
\set start_epoch 441777600
\set end_epoch 1704085199
select hydroid as met_varid from dh_variabledefinition where varkey = :'varkey' \gset
select hydroid as fid from dh_feature where hydrocode = :'hydrocode' and ftype = :'ftype' \gset
select hydroid as covid from dh_feature where hydrocode = 'cbp6_met_coverage' \gset
\timing ON
copy (
  select
    met.featureid,
    to_timestamp(met.tsendtime) as obs_date,
    met.tstime,
    met.tsendtime,
    extract(year from to_timestamp(met.tsendtime)) as yr,
    extract(month from to_timestamp(met.tsendtime)) as mo,
    extract(day from to_timestamp(met.tsendtime)) as da,
    extract(hour from to_timestamp(met.tsendtime)) as hr,
    (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 1, TRUE)).mean as precip_mm,
    0.0393701 * (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 1, TRUE)).mean as precip_in
  from dh_timeseries_weather as met,
    field_data_dh_geofield as fgeo
  where met.featureid = :covid
    and met.varid = :met_varid
    and ( (met.tstime >= :start_epoch) OR (-1 = :start_epoch) )
    and ( (met.tsendtime <= :end_epoch) OR (-1 = :end_epoch) )
    and fgeo.entity_type = 'dh_feature'
    and fgeo.entity_id = :fid
    and (fgeo.dh_geofield_geom && met.bbox)
  order by met.tsendtime
) to :'fname' WITH CSV HEADER;
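As a sanity check on the window set above, the epoch bounds can be converted to readable timestamps with GNU `date` (a debugging convenience, not part of the workflow itself):

```shell
# Confirm what the .sql epoch bounds mean in UTC (GNU date syntax)
date -u -d @441777600    # start_epoch -> 1984-01-01 04:00:00 UTC
date -u -d @1704085199   # end_epoch   -> 2024-01-01 04:59:59 UTC (2023-12-31 23:59:59 EST)
```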
rburghol commented 4 days ago

New tiled 16x16 dataset with a shorter name, to see if that fixes the wdm import.

Create a baseline scenario to compare it to: