rburghol opened 2 months ago
Here is the current function, which I updated a bit tonight. It works for any PRISM, daymet, or NLDAS2 data, such as:

```R
prism <- read.csv("http://deq1.bse.vt.edu:81/met/stormVol_prism/precip/usgs_ws_01613900-PRISM-all.csv")
```

When calculating a metric, you can run something like this to return the lowest 90-day precipitation:

```R
summary_analytics(prism)$l90_precip_in
```
```R
library(tidyr)
library(sqldf)
library(zoo)

summary_analytics <- function(df) {
  # split the timestamp into date/time, then the date into year/month/day
  df <- separate(data = df, col = obs_date, into = c("obs_date", "obs_time"), sep = " ")
  df <- separate(data = df, col = obs_date, into = c("obs_year", "obs_month", "obs_day"), sep = "-")

  yearly.summary <- sqldf(
    "SELECT obs_year, SUM(precip_in) AS total_precip
     FROM df GROUP BY obs_year"
  )
  precip_annual_max_in <- max(yearly.summary$total_precip)
  precip_annual_max_year <- yearly.summary$obs_year[which.max(yearly.summary$total_precip)]
  precip_annual_mean_in <- mean(yearly.summary$total_precip)

  # exclude the first and last (potentially partial) years when finding the
  # minimum; subsetting first keeps which.min() aligned with the year labels
  interior <- yearly.summary[c(-1, -nrow(yearly.summary)), ]
  precip_annual_min_in <- min(interior$total_precip)
  precip_annual_min_year <- interior$obs_year[which.min(interior$total_precip)]

  daily.summary <- sqldf(
    "SELECT obs_year, obs_month, obs_day, SUM(precip_in) AS total_precip
     FROM df GROUP BY obs_year, obs_month, obs_day"
  )
  precip_daily_max_in <- max(daily.summary$total_precip)

  # an hourly max is only meaningful for hourly datasets (e.g. NLDAS2)
  if (length(unique(df$hr)) == 24) {
    precip_hourly_max_in <- max(df$precip_in)
  } else {
    precip_hourly_max_in <- NA
  }

  # lowest 90-day rolling mean of daily precip
  l90_precip_in <- min(
    rollapply(daily.summary$total_precip, width = 90, FUN = mean, fill = NA, align = "right"),
    na.rm = TRUE
  )

  metrics <- data.frame(
    precip_annual_max_in = precip_annual_max_in,
    precip_annual_max_year = precip_annual_max_year,
    precip_annual_mean_in = precip_annual_mean_in,
    precip_annual_min_in = precip_annual_min_in,
    precip_annual_min_year = precip_annual_min_year,
    precip_daily_max_in = precip_daily_max_in,
    precip_hourly_max_in = precip_hourly_max_in,
    l90_precip_in = l90_precip_in
  )
  return(metrics)
}
```
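For example, the same call should work on a daymet file (URL taken from the test variables later in this thread):

```R
# daymet daily data for another gage watershed; since daymet has no hourly
# records, precip_hourly_max_in should come back NA
daymet <- read.csv("http://deq1.bse.vt.edu:81/met/daymet/precip/usgs_ws_01673550-daymet-all.csv")
summary_analytics(daymet)
```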
I can run through this in detail next week since I know this is a lot to look at. I'll be out of town tomorrow afternoon through Sunday and likely out of service, but I will do my best to respond before I leave or after I get back this weekend. In the meantime, I wanted to make sure this is stored here.
@nathanielf22 hey Nate, thanks a bunch for pushing this up before you headed out. Very much look forward to working with this, and congratulations on the progress.
10/25 meeting:
@nathanielf22 if you can post up the input variables that created that strange output, I will check on my machine. But, from my recollection, I think we used the wrong `coverage_ftype`, and in my testing with another coverage (details below), I got correct results. The ftype should be `usgs_full_drainage` for those USGS gage watershed features.
```R
scenario_name <- 'daymet2'
coverage_hydrocode <- 'usgs_ws_01673550'
coverage_bundle <- 'watershed'
coverage_ftype <- 'usgs_full_drainage'
model_version <- 'cbp-6.1'
met_file <- 'http://deq1.bse.vt.edu:81/met/daymet/precip/usgs_ws_01673550-daymet-all.csv'
```
@nathanielf22 -- I have one observation about the script as is. I tested it in full, and it works! The only thing you should update is the variable name `summary`, which should be changed to something else. The reason being, there is an existing R function called `summary()`, and when you assign the output of `summary_analytics()` to `summary`, that function gets masked, and it is a fairly important function that is used a good deal. `precip_summary` might be a good option :).
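i.e.:

```R
# precip_summary avoids masking base R's summary()
precip_summary <- summary_analytics(prism)
precip_summary$l90_precip_in
```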
@rburghol @COBrogan -- The code we put together last Friday to integrate into the workflow has an issue with the daymet and NLDAS data and should be updated. When including the n >= 365 days limitation, daymet excludes certain years where it only has 364 days, as shown below. In addition, NLDAS has an issue because the yearly summary was created before the daily summaries, which I switched around. This data is for gage usgs_ws_01613900.
@nathanielf22 The daymet dataset will always have 365 days in each year. They always exclude December 31st on leap years (but include February 29th, which is an unusual set-up). But remember that daymet is based in the GMT time zone. That means that each day in EST runs from 7 AM to 7 AM the next day. So, when grouping data by year, it matters whether you are grouping by `tstime` or `tsendtime`.

In this case, we can trace through our routines and see that we are grouping data by `tsendtime` (a confusing by-product of how PRISM operates and an assumption we made about daymet based on a daymet-PRISM regression). That means that leap years will always have 366 data points and the year after will have 364. To better demonstrate this: we are essentially beginning our year at 7 AM January 2nd and ending it at 7 AM December 31st. So, since a daymet leap year does not have a December 31st record, the following year must have 364 days.
My suggestion for now is to include years with >= 364 days, but let's discuss at our next meeting!
For reference, note that daymet has 365 days each year when grouping by `tstime`:

```sql
select count(ts.rast), extract(year from to_timestamp(ts.tstime))
from dh_timeseries_weather as ts
left join dh_variabledefinition var on ts.varid = var.hydroid
where var.varkey = 'daymet_mod_daily'
group by extract(year from to_timestamp(ts.tstime));
```

```
 count | extract
-------+---------
   365 |    1980
   365 |    1981
   365 |    1982
   365 |    1983
   365 |    1984
   365 |    1985
   365 |    1986
   365 |    1987
   365 |    1988
   365 |    1989
   365 |    1990
   365 |    1991
```
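For contrast, a sketch of the same query grouped by `tsendtime` (the grouping our routines currently use) should surface the 366/364 leap-year pattern described above:

```sql
select count(ts.rast), extract(year from to_timestamp(ts.tsendtime))
from dh_timeseries_weather as ts
left join dh_variabledefinition var on ts.varid = var.hydroid
where var.varkey = 'daymet_mod_daily'
group by extract(year from to_timestamp(ts.tsendtime));
```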
Analysis Steps

- `RomFeature` to retrieve the coverage of interest
- `om_model_object()` to retrieve or create a model property on the feature
- `om_get_model_scenario()` to retrieve or create a scenario container property on the model
- `vahydro_post_metric_to_scenprop()` to save the metrics to the scenario container prop
- `om_vahydro_metric_grid()` to retrieve large amounts of data
- scenarios (`stormVol_nldas2`, etc.)
- `jan`, `feb`, `mar`, ... as propname (perhaps organized below a container property named `monthly_rsq`)

(See the sketch after the Overview list below for these steps assembled end to end.)

Overview

- `vahydro_post_metric_to_scenprop()`, which streamlines the addition of metrics to a model/scenario result summary
- `om_vahydro_metric_grid()` to retrieve large amounts of data

Related meta_model workflow steps:

```
$NLDAS_ROOT/sqa_met $i met2date $lseg_ftype $model_version "$refresh_years"
raster_met/wdm/analyze
```
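A minimal sketch assembling these steps, assuming an authenticated `RomDatasource` named `ds` and the input variables from the test block earlier in this thread:

```R
# sketch only: assumes ds is an authenticated RomDatasource and the
# coverage_* / model_version / scenario_name / met_file variables above
feature <- RomFeature$new(
  ds,
  list(hydrocode = coverage_hydrocode,
       ftype = coverage_ftype,
       bundle = coverage_bundle),
  TRUE
)
model <- om_model_object(ds, feature, model_version)
scenario <- om_get_model_scenario(ds, model, scenario_name)

# compute metrics from the met file and post one to the scenario prop
met_data <- read.csv(met_file)
precip_summary <- summary_analytics(met_data)
vahydro_post_metric_to_scenprop(
  scenario$pid, 'om_class_Constant', NULL,
  'l90_precip_in', precip_summary$l90_precip_in, ds
)
```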
Data Model & REST

Data Model Outline

- Model versions (`lm_simple`, `stormVolPlot`, `cbp-6.0`, `cbp-6.1`, ...)
- Scenarios (`stormVol_prism`, `stormVol_nldas2_tiled`, `simple_lm_nldas2`, ...)
- A `[scenario].con` file in our meteorology config directory on the server, Ex: http://deq1.bse.vt.edu:81/p6/vadeq/config/control/met/stormVol_prism.con
- Metrics:
  - `precip_annual_max_in`
  - `precip_annual_max_year`
  - `precip_annual_mean_in`
  - `precip_annual_min_in`
  - `precip_annual_min_year`
  - `precip_hourly_max_in`
  - `precip_daily_max_in`
  - `l90_precip_in`
Use `suppressPackageStartupMessages(library(IHA))` and `group2` to calculate L90 (and a host of other things): https://github.com/HARPgroup/meta_model/blob/342bc955819efd7e0e1f25f3c8364102d2f99237/scripts/river/hsp_hydr_analysis.R#L261
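As a rough sketch (not the exact hsp_hydr_analysis.R code), `group2` could be applied to the daily totals computed above; the `daily.summary` name and date columns follow the `summary_analytics()` example:

```R
suppressPackageStartupMessages(library(IHA))
library(zoo)

# sketch: build a daily zoo series from the daily.summary table computed in
# summary_analytics() above; group2() reports yearly rolling n-day min/max
# statistics (1, 3, 7, 30, 90 day), from which an L90 can be pulled
dates <- as.Date(paste(daily.summary$obs_year,
                       daily.summary$obs_month,
                       daily.summary$obs_day, sep = "-"))
precip_zoo <- zoo(daily.summary$total_precip, order.by = dates)
g2 <- group2(precip_zoo)
head(g2)  # inspect the 90-day minimum column for the L90 value
```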
`met_file`: this will point to a meteorology data file such as http://deq1.bse.vt.edu:81/met/stormVol_prism/precip/

REST Find feature, Model and Scenario
RomFeature

Use `RomFeature$new()` to retrieve the object:

- `hydrocode`: unique identifier, ex: `usgs_ws_01613900`, `N51113`
- `bundle`: general feature class, ex: `watershed`, `landunit`
- `ftype`: specific feature type (source of data), ex: `usgs_full_drainage`, `cbp6_landseg`

```R
feature <- RomFeature$new(
  ds,
  list(hydrocode = coverage_hydrocode,
       ftype = coverage_ftype,
       bundle = coverage_bundle),
  TRUE
)
```
om_model_object()

- `feature`: an object of class `RomFeature`
- `model_version`: model overall version, ex: `pf-1.0`, `cbp-6.1`

```R
model <- om_model_object(ds, feature, model_version)
```
om_get_model_scenario()

- `model`: the model object returned by `om_model_object()`
- `scenario_name`: name of the scenario to retrieve or create

```R
scenario <- om_get_model_scenario(ds, model, scenario_name)
```
vahydro_post_metric_to_scenprop()

```R
vahydro_post_metric_to_scenprop(pid, varkey, propcode, propname, propvalue, ds)
```

- `pid`: the scenario's unique integer `pid`
- `varkey`: use `om_class_Constant` for most decimal attributes
- `propcode`: may be `NULL` (as in the example below)
- `propname`: required, descriptive variable name
- `propvalue`: required
- `ds`: RomDatasource

Ex:

```R
vahydro_post_metric_to_scenprop(scenario$pid, 'om_class_Constant', NULL, 'num_records', numrecs, ds)
```
Detail View of REST Feature Query
Using om_vahydro_metric_grid to retrieve data