OpenCDSS / ArkDSS-Colors-of-Water

Colorado's Decision Support Systems (CDSS) ArkDSS Colors of Water Model Engine code
GNU General Public License v3.0

StateTL - utilize diversion records for node flows #18

Closed kelleythompson closed 2 years ago

kelleythompson commented 2 years ago

In TLAP and in the CoW tool up to this point, only telemetry values (raw hourly values QCed against published daily values) have been used to define flows at diverting and releasing structures, although diversion records are used to define the reservoir releases that are modeled. In the lower Arkansas, the majority of diversion structures have telemetry, and Livingston defined average (dry/avg/wet) flows at the few nodes that do not. However, the model could be enhanced by using diversion records to define flows at the few points that have diversion records but no telemetry, and to further QC the flows at other locations. Although this may not change much in the lower Arkansas, it will be more important for additional reaches where there is less telemetry.

kelleythompson commented 2 years ago

Daily diversion records from REST services, where available for a node, were incorporated into the long-term station data set and subsequently into the QC and filling routines. The following describes the current methodologies in detail.

With pulllongtermstationdata=1 in the control file, daily station data are collected from year 2000 through the current year. The multiple years of data are used to establish wet/dry/avg averages for every Julian day for filling, and the daily data for the specific year being run in the model are also used to QC that year's hourly data. Because there are multiple sources of daily data, the following order is used for the daily data:

  1. Daily structure “diversion” data from divrecday REST command (which includes both diversions and releases)
  2. Daily “published station” data from the surfacewaterday REST command
  3. Daily “telemetry” data from the telemetry day REST command
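
As a minimal sketch of the ordering above (not the StateTL code itself, which is not shown in this thread), the Python below picks one daily value per day by taking the first source in the list that has data. It assumes the order is a precedence order. The function name and the use of None for a missing value are hypothetical; the returned source codes reuse the daily flag values 4/3/2/0 listed later in this comment.

```python
from typing import Optional, Tuple

# Source codes reuse the daily flag values described later in this comment.
FLAG_MISSING = 0
FLAG_TELEMETRY = 2
FLAG_SURFACE_WATER = 3
FLAG_DIVERSION = 4


def pick_daily_value(
    diversion: Optional[float],
    published: Optional[float],
    telemetry: Optional[float],
) -> Tuple[Optional[float], int]:
    """Return (value, source flag) from the first source that has a value.

    Assumption: the listed order is a precedence order, so a diversion
    record wins over published station data, which wins over telemetry.
    """
    candidates = (
        (diversion, FLAG_DIVERSION),
        (published, FLAG_SURFACE_WATER),
        (telemetry, FLAG_TELEMETRY),
    )
    for value, flag in candidates:
        if value is not None:
            return value, flag
    return None, FLAG_MISSING
```

For example, a day with no diversion record but with both published and telemetry values would take the published value and carry source flag 3.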

For daily “diversion” data, different records are collected for outflow and inflow nodes. For outflow nodes (i.e. headgates), the record with a water class number of 1+ wdid is collected, as the REST commands provide this record for the “Total (Diversion)”. This is typically identical to an XQ0 water class record, but some structures, even in WD17, have a “Total (Diversion)” record but no XQ0 record. For inflow nodes, which could be a reservoir, aug station, return station, etc., all Type 7, Type L (release of dominion and control), or Type E (excess diversion) records are collected and summed by day to yield the total daily release amount. Although a Total (Release) record also exists for some structures, it appears that this record is often incomplete or missing for some inflow structures. Releases are almost always a summation of the Type 7 records, but some structures have occasional Type L or Type E records.
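
The inflow-node summation can be sketched as follows. This is only an illustration: the (date, type, amount) tuple layout is a hypothetical stand-in for whatever fields the REST response actually carries, and the function name is invented here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Record types summed for inflow (release) nodes, per the description above.
RELEASE_TYPES = {"7", "L", "E"}


def total_daily_release(
    records: Iterable[Tuple[str, str, float]],
) -> Dict[str, float]:
    """Sum Type 7, L, and E record amounts by day.

    Each record is a hypothetical (date string, type code, amount) tuple.
    Records of any other type are ignored.
    """
    totals: Dict[str, float] = defaultdict(float)
    for day, type_code, amount in records:
        if type_code in RELEASE_TYPES:
            totals[day] += amount
    return dict(totals)
```

Days with no Type 7/L/E record do not appear in the result at all, which leaves the decision between "missing" and "zero" to the zero-filling settings.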

Some zero filling of missing long-term daily records can occur, depending on the availability of other diversion records and on settings. This is important to watch as development continues to ensure it works as desired. If filllongtermwithzero>=1, missing values are filled with zero if the structure had other usable diversion records during the year. For many structures it appears that the lack of a record indicates there was no diversion or release, and zero filling prevents those values from being filled with a trend, interpolation, or long-term-average based value in the filling routine. However, for the current year, this zeroing is limited to dates before the last record date so that the forecasted data are filled. For the previous year, this zeroing is limited to dates before Nov 1 in case diversion records for the current water year have not been published. If filllongtermwithzero=2, years without diversion records are also filled with zero if the structure had at least one year with a diversion record. This parameter is currently set to 2 within the script, so watch in the future whether these assumptions about days without diversion records are appropriate. Also of note, a table was added, currently included with the inputdata spreadsheet, that corrects the daily data value where an erroneous diversion record value or water class type (i.e. Type 7/L/E) does not end up working correctly with model processes. Currently there are 2 explicit corrections listed for WD17.
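
One way to read the zero-filling rules is sketched below. The function and argument names are hypothetical. It also assumes that the current-year and previous-year cutoffs apply to both the level 1 and level 2 cases, which the description does not state explicitly. The returned values 8 and 9 are the daily source flags listed later in this comment.

```python
from datetime import date
from typing import Optional


def zero_fill_flag(
    day: date,
    current_year: int,
    records_in_year: bool,
    records_in_any_year: bool,
    last_record_date: Optional[date],
    filllongtermwithzero: int,
) -> Optional[int]:
    """Return flag 8 or 9 if a missing day may be set to zero, else None."""
    if filllongtermwithzero < 1:
        return None
    if records_in_year:
        flag = 8  # other days in this year have diversion records
    elif filllongtermwithzero == 2 and records_in_any_year:
        flag = 9  # only other years have diversion records
    else:
        return None
    if day.year == current_year:
        # Leave days on or after the last record date for the fill routine.
        if last_record_date is None or day >= last_record_date:
            return None
    elif day.year == current_year - 1 and day >= date(day.year, 11, 1):
        # Records for the new water year may not be published yet.
        return None
    return flag
```

A None result means the day stays missing and is handled by the trend, interpolation, or average filling described further down.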

For the long-term station data, a daily flag was added that documents the source of each day's data. Currently, the number of the individual diversion record collected via the REST command (i) is added to this flag value as a decimal number (i/100000) for testing and evaluation. The flag values are:

  0 – blank/missing
  2 – from telemetry REST
  3 – from surface water REST
  4 – single value from diversion record REST
  5 – value summed from several diversion records (i.e. Type 7/L/E)
  6 – diversion record corrected using an explicit diversion record correction
  8 – missing value assigned zero because it is within a year that had diversion records listed for other days; for this to occur filllongtermwithzero>=1
  9 – missing value assigned zero because other years had diversion records; for this to occur filllongtermwithzero=2
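
To make the decimal part of the daily flag concrete, here is a small sketch of packing and unpacking the source code and record number i. The helper names are hypothetical, and it assumes i is a non-negative integer below 100000 so the fraction never carries into the integer part.

```python
from typing import Tuple

RECORD_SCALE = 100000  # record number i is stored as i / 100000


def encode_daily_flag(source: int, record_number: int = 0) -> float:
    """Combine an integer source code with a record number as a fraction."""
    if not 0 <= record_number < RECORD_SCALE:
        raise ValueError("record_number must be in [0, 100000)")
    return source + record_number / RECORD_SCALE


def decode_daily_flag(flag: float) -> Tuple[int, int]:
    """Split a daily flag back into (source code, record number)."""
    source = int(flag)
    # round() absorbs floating-point error in the fractional part.
    record_number = round((flag - source) * RECORD_SCALE)
    return source, record_number
```

With these helpers, a single value taken from diversion record number 123 would be stored as 4.00123 and decoded back to (4, 123).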

Hourly flags were added to describe the hourly QC and filled data for the run year; these incorporate the above daily flag values. The following QC flag values are therefore often added to the above values:

  10 – no hourly data; filled with daily data
  20 – hourly data replaced with the daily data value; the difference between the mean of the hourly data and the daily data exceeds threshold 2
  30 – hourly data adjusted so that the mean of the hourly data matches the daily data value; the difference between the mean of the hourly data and the daily data exceeds threshold 1 but not threshold 2
  40 – day is missing some but not all hourly data; missing data replaced with the daily mean value; in this case the difference between the mean of the non-missing hourly data and the daily data is less than threshold 1 (otherwise it would be handled under the flag 20 or 30 conditions)
  50 – daily data is blank but hourly data exists, so the hourly data are blanked (and will be filled); this does not occur after the last date of the daily data
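
The QC decision for a single day can be sketched roughly as below. Several points are assumptions rather than anything stated in this thread: the thresholds are treated as absolute differences in flow units, the flag 30 adjustment is done as a constant shift of the available hourly values, and the exception for flag 50 after the last daily date is left out. The function name is hypothetical.

```python
from statistics import mean
from typing import List, Optional, Tuple

Hourly = List[Optional[float]]


def qc_hourly_day(
    hourly: Hourly,
    daily: Optional[float],
    threshold1: float,
    threshold2: float,
) -> Tuple[Hourly, int]:
    """Return (QCed hourly values, QC flag) for one day; 0 means unchanged."""
    n = len(hourly)
    present = [v for v in hourly if v is not None]
    if daily is None:
        # Flag 50: blank the hourly data so the fill routine handles the day.
        return ([None] * n, 50) if present else (list(hourly), 0)
    if not present:
        return [daily] * n, 10
    hourly_mean = mean(present)
    diff = abs(hourly_mean - daily)
    if diff > threshold2:
        return [daily] * n, 20
    if diff > threshold1:
        # Assumed adjustment: shift values so their mean equals the daily value.
        shift = daily - hourly_mean
        return [daily if v is None else v + shift for v in hourly], 30
    if len(present) < n:
        return [daily if v is None else v for v in hourly], 40
    return list(hourly), 0
```

The returned QC flag would then be added to the day's daily source flag, so a day filled entirely from a single diversion record would carry 10 + 4 = 14.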

The following fill flag values are often added to the above QC flag values:

  100 – gages only; filled with regression fill, where the length of good data common to the filling and filled stations is less than the regression fill window. The regression fill window is currently 28 days, but useregrfillforgages=0, so this option is not used (the results are not liked as much as those of the other options)
  200 – gages only; filled with regression fill, where the length of good data common to the filling and filled stations exceeds the regression fill window, so the common data is limited to the size of the window. As for flag 100, the window is currently 28 days and useregrfillforgages=0, so this option is not used
  300 – missing value is interpolated between data before and after (to the left and right); the gap distance is less than the small gap value (currently 7 days)
  400 – missing value is based on the trend of available data before it (to the left), as data after it (to the right) is missing in the year and there is also no daily data after it within 30 days
  500 – missing value is based on the trend of available data after it (to the right), as data before it (to the left) is missing in the year and there is also no daily data before it within 30 days
  600 – missing value is based on the average (wet/dry/avg) daily value, as the gap distance to available data is greater than the large gap value (currently 30 days)
  700 – missing value is based on a weighting between the trend (see 400/500) or interpolation (see 300) value and the average (wet/dry/avg) daily value (see 600), weighted by the actual gap relative to the small gap (currently 7 days) and large gap (currently 30 days) distances, as the gap distance to available data is more than the small gap but less than the large gap value
  800 – missing value is based on the average wet/dry/avg value from long-term data, as all data is missing for the year and the average wet/dry/avg value in inputdata is -999
  900 – missing value is based on the average wet/dry/avg value specified in the inputdata spreadsheet, as all data is missing for the year and the average wet/dry/avg value in inputdata is not -999
  1000 – missing value filled with zero, as no other method filled the data (this should not happen, but just in case)
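
The weighting used for flag 700 is not spelled out above, so the sketch below assumes a simple linear ramp: the average (wet/dry/avg) value gets weight 0 at the small gap distance and weight 1 at the large gap distance. The function names are hypothetical, and the actual script may weight differently.

```python
SMALL_GAP_DAYS = 7   # current small gap value
LARGE_GAP_DAYS = 30  # current large gap value


def average_weight(
    gap_days: float,
    small_gap: float = SMALL_GAP_DAYS,
    large_gap: float = LARGE_GAP_DAYS,
) -> float:
    """Weight given to the wet/dry/avg value, from 0.0 to 1.0.

    Assumption: linear ramp between the small and large gap distances.
    """
    if gap_days <= small_gap:
        return 0.0
    if gap_days >= large_gap:
        return 1.0
    return (gap_days - small_gap) / (large_gap - small_gap)


def blend_fill(gap_days: float, local_estimate: float, average_value: float) -> float:
    """Blend a trend or interpolation estimate with the average daily value."""
    w = average_weight(gap_days)
    return (1.0 - w) * local_estimate + w * average_value
```

Under this assumption a missing day 18.5 days from the nearest data would be an even mix of the two estimates, while short gaps keep the local estimate (flag 300/400/500 behavior) and long gaps fall back to the average (flag 600 behavior).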

The following figures show examples of hourly data filled using various sources; all of these are for year 2018.

  1. Filled entirely from diversion records, as there was no telemetry at the structure in 2018: issue18_fig1
  2. Complex ditch with various sources and a correction: issue18_fig2
  3. Gage filled with several sources: issue18_fig3
  4. Big correction of telemetry based on diversion records: issue18_fig4
  5. Complex reservoir outfall: issue18_fig5
  6. Currently filled with the wet/dry/avg specified in input data (this will get replaced with the actual flow out of Timpas during the run, though): issue18_fig6