HARPgroup / model_meteorology

0 stars 0 forks source link

NLDAS Met data Download and Data Model #34

Closed rburghol closed 4 months ago

rburghol commented 2 years ago

Data Model

stash in vahydro dir for import

cd ../files/vahydro/ sftp dbase2 get /tmp/cbp6_met_coverage.wkt bye cd /var/www/html/d.dh drush scr modules/dh/src/import_features.php /var/www/html/files/vahydro/cbp6_met_coverage.wkt


- Other geom options:
   - p5 land segs: `select st_astext(st_multi(st_extent(dh_geofield_geom))) from dh_feature as a left outer join field_data_dh_geofield as b on (a.hydroid = b.entity_id and b.entity_type = 'dh_feature') where a.ftype = 'cbp532_landseg';`
      - `MULTIPOLYGON(((-83.6754134197279 35.8433695994306,-83.6754134197279 44.0969588250423,-74.1646796092988 44.0969588250423,-74.1646796092988 35.8433695994306,-83.6754134197279 35.8433695994306)))`
   - p6 land segs: `select st_astext(st_multi(st_extent(dh_geofield_geom))) from dh_feature as a left outer join field_data_dh_geofield as b on (a.hydroid = b.entity_id and b.entity_type = 'dh_feature') where a.ftype = 'cbp6_landseg';`
      - `MULTIPOLYGON(((-81.0144924130837 36.5503513152315,-81.0144924130837 44.0969678458827,-74.1646796294801 44.0969678458827,-74.1646796294801 36.5503513152315,-81.0144924130837 36.5503513152315)))`
   - p6 watersheds: `select st_astext(st_multi(st_extent(b.dh_geofield_geom))) from dh_feature as a left outer join field_data_dh_geofield as b on (a.hydroid = b.entity_id and b.entity_type = 'dh_feature') where a.bundle = 'watershed' and a.ftype like 'cbp%';`
      - `MULTIPOLYGON(((-83.6754131797072 36.1183484992513,-83.6754131797072 44.0969678458828,-74.1646796294801 44.0969678458828,-74.1646796294801 36.1183484992513,-83.6754131797072 36.1183484992513)))`
   - Importing coverages from file:
      - dh importable: https://github.com/HARPgroup/model_meteorology/blob/main/data/cbp_nldas2_coverage.wkt
      - do dh import: `drush scr modules/dh/src/import_features.php /opt/model/model_meteorology/data/cbp_nldas2_coverage.wkt`
- code/workflow to import grid data
   - Prototype: https://github.com/HARPgroup/dh_weather/blob/main/src/sh/fn.nldas2-import
- Storage requirements of raster data: `select relname, round(sum(8.0 * relpages / (1024.0 * 1024.0))) as size_gb, round(sum(8.0 * relpages / 1024)) as size_mb FROM pg_class where relname = 'dh_timeseries_weather' and relkind in ('r','t') group by relname order by relname;`
   - What is storage increase when we use clipped versus unclipped?
   - Try with temp table, insert many copies of a single file, or just with a single file?

### Raw data Files
- uses gdalinfo

. hspf_config # will load the NLDAS_ROOT directory var gdalinfo $NLDAS_ROOT/1984/001/NLDAS_FORA0125_H.A19840101.0000.002.grb Driver: GRIB/GRIdded Binary (.grb, .grb2) Files: /backup/meteorology/1984/001/NLDAS_FORA0125_H.A19840101.0000.002.grb Size is 464, 224 Coordinate System is: GEOGCRS["Coordinate System imported from GRIB file", DATUM["unnamed", ELLIPSOID["Sphere",6371200,0, LENGTHUNIT["metre",1, ID["EPSG",9001]]]], PRIMEM["Greenwich",0, ANGLEUNIT["degree",0.0174532925199433, ID["EPSG",9122]]], CS[ellipsoidal,2], AXIS["latitude",north, ORDER[1], ANGLEUNIT["degree",0.0174532925199433, ID["EPSG",9122]]], AXIS["longitude",east, ORDER[2], ANGLEUNIT["degree",0.0174532925199433, ID["EPSG",9122]]]] Data axis to CRS axis mapping: 2,1 Origin = (-125.000500000000002,53.000500000000002) Pixel Size = (0.125000000000000,-0.125000000000000) Corner Coordinates: Upper Left (-125.0005000, 53.0005000) (125d 0' 1.80"W, 53d 0' 1.80"N) Lower Left (-125.0005000, 25.0005000) (125d 0' 1.80"W, 25d 0' 1.80"N) Upper Right ( -67.0005000, 53.0005000) ( 67d 0' 1.80"W, 53d 0' 1.80"N) Lower Right ( -67.0005000, 25.0005000) ( 67d 0' 1.80"W, 25d 0' 1.80"N) Center ( -96.0005000, 39.0005000) ( 96d 0' 1.80"W, 39d 0' 1.80"N) Band 1 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 2[m] HTGL (Specified height level above ground) NoData Value=9999 Metadata: GRIB_COMMENT=Temperature [C] GRIB_ELEMENT=TMP GRIB_FORECAST_SECONDS=0 sec GRIB_REF_TIME= 441763200 sec UTC GRIB_SHORT_NAME=2-HTGL GRIB_UNIT=[C] GRIB_VALID_TIME= 441763200 sec UTC Band 2 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 2[m] HTGL (Specified height level above ground) NoData Value=9999 Metadata: GRIB_COMMENT=Specific humidity [kg/kg] GRIB_ELEMENT=SPFH GRIB_FORECAST_SECONDS=0 sec GRIB_REF_TIME= 441763200 sec UTC GRIB_SHORT_NAME=2-HTGL GRIB_UNIT=[kg/kg] GRIB_VALID_TIME= 441763200 sec UTC Band 3 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 0[-] SFC (Ground or water surface) NoData Value=9999 Metadata: GRIB_COMMENT=Pressure [Pa] GRIB_ELEMENT=PRES GRIB_FORECAST_SECONDS=0 sec GRIB_REF_TIME= 441763200 sec UTC GRIB_SHORT_NAME=0-SFC GRIB_UNIT=[Pa] GRIB_VALID_TIME= 441763200 sec UTC Band 4 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 10[m] HTGL (Specified height level above ground) NoData Value=9999 Metadata: GRIB_COMMENT=u-component of wind [m/s] GRIB_ELEMENT=UGRD GRIB_FORECAST_SECONDS=0 sec GRIB_REF_TIME= 441763200 sec UTC GRIB_SHORT_NAME=10-HTGL GRIB_UNIT=[m/s] GRIB_VALID_TIME= 441763200 sec UTC Band 5 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 10[m] HTGL (Specified height level above ground) NoData Value=9999 Metadata: GRIB_COMMENT=v-component of wind [m/s] GRIB_ELEMENT=VGRD GRIB_FORECAST_SECONDS=0 sec GRIB_REF_TIME= 441763200 sec UTC GRIB_SHORT_NAME=10-HTGL GRIB_UNIT=[m/s] GRIB_VALID_TIME= 441763200 sec UTC Band 6 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 0[-] SFC (Ground or water surface) NoData Value=9999 Metadata: GRIB_COMMENT=Downward longwave radiation flux [W/m^2] GRIB_ELEMENT=DLWRF GRIB_FORECAST_SECONDS=0 sec GRIB_REF_TIME= 441763200 sec UTC GRIB_SHORT_NAME=0-SFC GRIB_UNIT=[W/m^2] GRIB_VALID_TIME= 441763200 sec UTC Band 7 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 0[-] SFC (Ground or water surface) NoData Value=9999 Metadata: GRIB_COMMENT=undefined [-] GRIB_ELEMENT=var153 GRIB_FORECAST_SECONDS=3600 sec GRIB_REF_TIME= 441759600 sec UTC GRIB_SHORT_NAME=0-SFC GRIB_UNIT=[-] GRIB_VALID_TIME= 441763200 sec UTC Band 8 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 180-0[hPa] SPDY (Level between 2 levels at specified pressure difference from ground to level) NoData Value=9999 Metadata: GRIB_COMMENT=Convective available potential energy [J/kg] GRIB_ELEMENT=CAPE GRIB_FORECAST_SECONDS=0 sec GRIB_REF_TIME= 441763200 sec UTC GRIB_SHORT_NAME=180-0-SPDY GRIB_UNIT=[J/kg] GRIB_VALID_TIME= 441763200 sec UTC Band 9 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 0[-] SFC (Ground or water surface) NoData Value=9999 Metadata: GRIB_COMMENT=Potential evaporation [kg/m^2] GRIB_ELEMENT=PEVAP GRIB_FORECAST_SECONDS=3600 sec GRIB_REF_TIME= 441759600 sec UTC GRIB_SHORT_NAME=0-SFC GRIB_UNIT=[kg/m^2] GRIB_VALID_TIME= 441763200 sec UTC Band 10 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 0[-] SFC (Ground or water surface) NoData Value=9999 Metadata: GRIB_COMMENT=Total precipitation [kg/m^2] GRIB_ELEMENT=APCP GRIB_FORECAST_SECONDS=3600 sec GRIB_REF_TIME= 441759600 sec UTC GRIB_SHORT_NAME=0-SFC GRIB_UNIT=[kg/m^2] GRIB_VALID_TIME= 441763200 sec UTC Band 11 Block=464x1 Type=Float64, ColorInterp=Undefined Description = 0[-] SFC (Ground or water surface) NoData Value=9999 Metadata: GRIB_COMMENT=Downward shortwave radiation flux [W/m^2] GRIB_ELEMENT=DSWRF GRIB_FORECAST_SECONDS=0 sec GRIB_REF_TIME= 441763200 sec UTC GRIB_SHORT_NAME=0-SFC GRIB_UNIT=[W/m^2] GRIB_VALID_TIME= 441763200 sec UTC


### Setting up raster import from NLDAS2

cd /backup/meteorology/out/sql . hspf_config fname=$NLDAS_ROOT/1984/001/NLDAS_FORA0125_H.A19840101.0000.002.grb tifname="${fname}-4326.tif"

use -a to append, omit -a and it will create the table anew

create

raster2pgsql -t 1000x1000 $tifname tmp_nldas2 > tmp_nldas2.sql

append 24 copies

raster2pgsql -a -t 1000x1000 $tifname tmp_nldas2 > tmp_nldas2-test.sql

set year

thisdate="2023-01-01" coverage_hydrocode='cbp6_met_coverage' yr=date -d "$thisdate" +%Y mo=date -d "$thisdate" +%m da=date -d "$thisdate" +%d maskExtent="/backup/meteorology/cbp_extent.csv" jday=date -d "$thisdate" +%j ymd="$yr$mo$da" for i in 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 23; do hr2digit=printf %02d $i hr4digit="${hr2digit}00" fname="$NLDAS_ROOT/$yr/$jday/NLDAS_FORA0125_H.A${ymd}.${hr4digit}.002.grb" tifname="${fname}-4326.tif" tifname_clip="/tmp/nldas2_clip.tif" tstime=TZ="America/New_York" date -d "$thisdate ${hr2digit}:00:00" +'%s'

Reproject to 4326

gdalinfo gdalinfo PRISM_ppt_stable_4kmD2_20090407_bil.bil

gdalwarp "$fname" -t_srs EPSG:4326 "$tifname" rm /tmp/nldas2_clip.tif

Clipping the raster: Use gdalwarp to crop to the cutline maskExtent.csv, which is a csv of the CBP regions

gdalwarp -cutline $maskExtent -crop_to_cutline $tifname $tifname_clip

create

use -a to append, use -t and it try to drop an existing table then will create the table anew

raster2pgsql -d -t 1000x1000 $tifname_clip tmp_nldas2 > /tmp/tmp_nldas2.sql

import this raster into a temp table

cat /tmp/tmp_nldas2.sql | psql -h dbase2 drupal.dh03

now insert the raster into the timeseries table, with feature and variable information linked

inquery="insert into dh_timeseries_weather(tstime, varid, featureid, entity_type, rast)" inquery="$inquery select '$tstime', v.hydroid as varid, f.hydroid as featureid, 'dh_feature', met.rast" inquery="$inquery from dh_feature as f " inquery="$inquery left outer join dh_variabledefinition as v" inquery="$inquery on (v.varkey = 'nldas2_obs_hourly')" inquery="$inquery left outer join dh_timeseries_weather as w" inquery="$inquery on (f.hydroid = w.featureid and w.tstime = '${tstime}' and w.varid = v.hydroid) " inquery="$inquery left outer join tmp_nldas2 as met" inquery="$inquery on (1 = 1)" inquery="$inquery WHERE w.tid is null" inquery="$inquery AND f.hydrocode = '$coverage_hydrocode' " echo $inquery |psql -h dbase2 drupal.dh03 done


#### Using SLURM and meta-model to download and import

metsrc="nldas2" yr=2023 doy=date -d "${yr}-12-31" +%j i=0 while [ $i -lt $doy ]; do thisdate=date -d "${yr}-01-01 +$i days" +%Y-%m-%d echo "Running: sbatch /opt/model/meta_model/run_model raster_met \"$thisdate\" $metsrc auto met" sbatch /opt/model/meta_model/run_model raster_met "$thisdate" $metsrc auto met i=$((i + 1)) done


#### Inventory Met data

select extract(year from to_timestamp(met.tstime)) as year, min(to_timestamp(met.tstime)) as start_date, max(to_timestamp(met.tstime)) as end_date, count(*) from ( select met.tstime, (ST_SummaryStatsAgg(met.rast, 1, TRUE)).mean as precip_in from dh_feature as mcov left outer join dh_variabledefinition as v on ( v.varkey = 'nldas2_obs_hourly' ) left outer join dh_timeseries_weather as met on ( mcov.hydroid = met.featureid and met.varid = v.hydroid and met.entity_type = 'dh_feature' ) where mcov.hydrocode = 'cbp6_met_coverage' met.rast is not null group by met.tstime ) as met group by extract(year from to_timestamp(met.tstime)) order by extract(year from to_timestamp(met.tstime));

select extract(year from to_timestamp(met.tstime)) as year, min(to_timestamp(met.tstime)) as start_date, max(to_timestamp(met.tstime)) as end_date, round(0.0393701 * sum(precip_in)::numeric,1) as precip_in from ( select met.tstime, (ST_SummaryStatsAgg(met.rast, 1, TRUE)).mean as precip_in from dh_feature as mcov left outer join dh_variabledefinition as v on ( v.varkey = 'nldas2_obs_hourly' ) left outer join dh_timeseries_weather as met on ( mcov.hydroid = met.featureid and met.varid = v.hydroid and met.entity_type = 'dh_feature' ) where mcov.hydrocode = 'cbp6_met_coverage' group by met.tstime ) as met group by extract(year from to_timestamp(met.tstime)) order by extract(year from to_timestamp(met.tstime));


#### Verify Raster Data
- Summary of precip over the entire raster

select c.featureid, to_timestamp(c.tstime) as obs_date, extract(month from to_timestamp(c.tstime)) as mo, (ST_summarystats(c.rast, 10, TRUE)).mean as precip_kgm3, 0.0393701 * (ST_summarystats(c.rast, 10, TRUE)).mean as precip_in from dh_feature as f left outer join dh_variabledefinition as v on (v.varkey = 'nldas2_obs_hourly') left outer join dh_timeseries_weather as c on ( f.hydroid = c.featureid and c.varid = v.hydroid) where f.hydrocode = 'cbp6_met_coverage' order by c.tstime;


- summary of precip over a specific feature
- Shenandoah gage at Strasburg `hydrocode = 'usgs_ws_01634000'`

select met.featureid, to_timestamp(met.tstime) as obs_date, extract(month from to_timestamp(met.tstime)) as mo, (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 10, TRUE)).mean as precip_kgm3, 0.0393701 * (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 10, TRUE)).mean as precip_in from dh_feature as f left outer join field_data_dh_geofield as fgeo on ( fgeo.entity_id = f.hydroid and fgeo.entity_type = 'dh_feature' ) left outer join dh_variabledefinition as v on ( v.varkey = 'nldas2_obs_hourly' ) left outer join dh_feature as mcov on ( mcov.hydrocode = 'cbp6_met_coverage' ) left outer join dh_timeseries_weather as met on ( mcov.hydroid = met.featureid and met.varid = v.hydroid ) where f.hydrocode = 'usgs_ws_01634000' order by met.tstime;


#### Estimate storage needs (not sue if this is very illustrative

select met.featureid, to_timestamp(met.tstime) as obs_date, extract(month from to_timestamp(met.tstime)) as mo, (ST_MemSize(st_clip(met.rast, fgeo.dh_geofield_geom))/1024)/1024 as size_mb, (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 10, TRUE)).mean as precip_kgm3, 0.0393701 * (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 10, TRUE)).mean as precip_in from dh_feature as f left outer join field_data_dh_geofield as fgeo on ( fgeo.entity_id = f.hydroid and fgeo.entity_type = 'dh_feature' ) left outer join dh_variabledefinition as v on (v.varkey = 'nldas2_obs_hourly') left outer join dh_feature as mcov on ( mcov.hydrocode = 'cbp6_met_coverage' ) left outer join dh_timeseries_weather as met on ( mcov.hydroid = met.featureid and met.varid = v.hydroid) where f.hydrocode = 'usgs_ws_01634000' order by met.tstime;

rburghol commented 5 months ago

New import script:

rburghol commented 5 months ago

Test import from https://github.com/HARPgroup/meta_model/issues/50

select met.featureid, to_timestamp(met.tstime) as obs_date,
  extract(month from to_timestamp(met.tstime)) as mo,
  (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 10, TRUE)).mean as precip_kgm3,
  0.0393701 * (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 10, TRUE)).mean as precip_in
from dh_feature as f
left outer join field_data_dh_geofield as fgeo
on (
  fgeo.entity_id = f.hydroid
  and fgeo.entity_type = 'dh_feature'
)
left outer join dh_variabledefinition as v
on (
  v.varkey = 'nldas2_obs_hourly'
)
left outer join dh_feature as mcov 
on (
  mcov.hydrocode = 'cbp6_met_coverage'
)
left outer join dh_timeseries_weather as met
on (
  mcov.hydroid = met.featureid and met.varid = v.hydroid 
  and extract(year from to_timestamp(met.tstime)) = 2021
)
where f.hydrocode = 'usgs_ws_01634000'
order by met.tstime;