Open smalers opened 2 years ago
While implementing TSTool features to use CDSS web services for historical station data, I have the following questions and observations. I will update this issue as I work on the implementation. Given that I did not have budget to implement features when web services first came out, the time may have passed for the State to respond to feedback. However, at a minimum, maybe web service documentation could be enhanced to explain nuances and maybe the State has its own issue list internally that could benefit from some of the following. I expect to have features working in the next couple of days and will make decisions along the way. As usual, I will update the TSTool datastore appendix documentation to reflect the implementation.
Feedback on `measType`:
The `surfacewaterstationdatatypes` and `climatestationsdatatypes` services both return some of the same `measType` values. Should `Streamflow` be returned for surface water stations and the others for climate stations? For now I will implement the code as best I can. Note also that some of the data types, such as `MaxTemp`, have spaces at the end, resulting in redundant data types.
Climate station data types:
"measType": "Evap",
"measType": "FrostDate",
"measType": "MaxTemp ",
"measType": "MaxTemp",
"measType": "MeanTemp",
"measType": "MinTemp ",
"measType": "MinTemp",
"measType": "Precip",
"measType": "Snow",
"measType": "SnowDepth",
"measType": "SnowSWE",
"measType": "Solar",
"measType": "Streamflow",
"measType": "VP",
"measType": "Wind",
Surface water station data types:
"measType": "MaxTemp",
"measType": "MeanTemp",
"measType": "MinTemp",
"measType": "Solar",
"measType": "Streamflow",
"measType": "VP",
"measType": "Wind",
Feedback on `dateFormat`:
`dateFormat`? Also, for data that are dates, why not make the default format the date? There can be confusion as to whether midnight (which is time zero of a day) corresponds to a value computed over the previous day. Using the date format would clear this up.
`dateFormat` should not change `modified` since that is always a time? In particular, if someone is requesting monthly or annual data for an automated data dump, it would be good to check the actual modification time, not the month or year from the modification time.
Feedback on `measCount`:
Feedback on `value`:
`value` is always a valid number and consuming code does not need to deal with a missing value indicator such as `null`, `NaN` (which JSON does not actually use), or `-999` (which can be a valid value for some time series).
Feedback on climate station frost dates:
`siteId` or any other human-facing identifier. This can be confusing when more than one station's data are read.
Feedback on `surfacewaterstationdatatypes` service:
`abbrev`.
Feedback on `climatestations`:
`latitude` when other services use `latdecdeg`
`longitude` when other services use `longdecdeg`
Feedback on `surfacewaterstations`:
`latitude` when other services use `latdecdeg`
`longitude` when other services use `longdecdeg`
`measUnit` included when that is specific to data types?
It is not possible to fully rely on web services to provide a clean list of data types for use in TSTool, and therefore some manual handling in the code is necessary. Some design decisions are as follows:
`measType` and then an interval value computed as a statistic, for example for monthly streamflow `measType=Streamflow` and a data value of `minQCfs`. In the TSTool world, this combines main data type, statistic, and data units. To make this work in TSTool, I am going to use a data type of `Streamflow-Min` and set the units as appropriate. See the example below.
`Evap` are minimum and maximum values for a month the minimum and maximum daily values in the month? I think yes.
`MaxTemp`, it is really daily maximum temperature, so a monthly time series for `minValue` would be the monthly minimum of daily maximum temperature, right?
`minQCfs` the minimum of instantaneous streamflow, or minimum of the average daily flows? This is important for peak flow analysis.
`measType` and `value` handling in web services would be helpful.
`minQCfs`. All of this is handled in TSTool but it is not consistent in the services.
After implementing the web services, automated tests, and documentation, here are the remaining questions and comments that I have for the State. Other feedback above is still relevant, such as the need for improved documentation for some of the details. The automated tests compare HydroBase database datastore and REST web services datastore results because otherwise it is difficult to know if the REST web service values are as expected.
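The data type convention from the design decisions above (for example `Streamflow-Min`) can be sketched roughly as follows. This is only an illustration in Python, not the TSTool code, which is Java. The `min` prefix comes from the `minQCfs` and `minValue` fields mentioned above; the other prefixes and the function name `tstool_data_type` are assumptions made for the example.

```python
# Statistic prefixes on value field names, mapped to the suffix used in the data type.
# "min" is taken from the minQCfs and minValue fields discussed above;
# the remaining prefixes are assumptions for illustration only.
STAT_PREFIXES = {"min": "Min", "max": "Max", "avg": "Avg", "total": "Total"}


def tstool_data_type(meas_type, value_field):
    """Combine a measType and a statistic-bearing field name into one data type."""
    meas_type = meas_type.strip()  # guard against trailing spaces such as "MaxTemp "
    for prefix, stat in STAT_PREFIXES.items():
        if value_field.startswith(prefix):
            return f"{meas_type}-{stat}"
    # No statistic recognized: use the main data type as-is.
    return meas_type


print(tstool_data_type("Streamflow", "minQCfs"))
print(tstool_data_type("MaxTemp ", "minValue"))
```

Units would still need to be set separately, since the field name (such as `minQCfs`) mixes statistic and units.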
The following is an example of a Linux command that can be used to check for distinct `measType` in the data types results:
grep measType /C/Users/sam/Downloads/download-climate-datatypes.json | sort -u
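A similar check can be sketched in Python, with the addition that surrounding whitespace is stripped so that values such as `"MaxTemp "` and `"MaxTemp"` collapse to one entry. The function name `unique_meas_types` is made up for this example, and the sample records are copied from the climate station list earlier in this issue rather than read from the downloaded file.

```python
def unique_meas_types(records):
    """Return sorted distinct measType values with surrounding whitespace removed."""
    return sorted({r["measType"].strip() for r in records if r.get("measType")})


# Sample records taken from the climate station data types listed earlier
# (note the trailing space on two of them).
sample = [
    {"measType": "MaxTemp "},
    {"measType": "MaxTemp"},
    {"measType": "MinTemp "},
    {"measType": "MinTemp"},
    {"measType": "Precip"},
]

print(unique_meas_types(sample))  # ['MaxTemp', 'MinTemp', 'Precip']
```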
`Streamflow` - is this a data issue? From Doug: [Fixed] I confirmed.
`Streamflow`, which is good.
`measType` to use in data type choice. This seems to be a bit slow for surface water stations. It would be useful to have a service for the unique list of `measType` each for climate stations and surface water stations to increase performance. From Doug: [Fixed]. Are you using GET api/v2/climatedata/climatestationsdatatypes and GET api/v2/surfacewater/surfacewaterstationdatatypes? They return a unique list of measType for each station. Or are you asking for 2 new services that just return a unique list of measType for ClimateStation / Surface Water Stations regardless of station? I have been doing a query like Doug indicated and now that the response is fast, I retract my request for a new service.
`measUnit` on frost dates, such as to `day`? Currently the units are blank. From Doug: [Done] I verified.
`dataSource` has `CoAgMet` and `COAGMET`. From Doug: [Fixed] I verified. I have not checked to see whether the same station is loaded with both but it would be good to use only one abbreviation.
`dataSource` as if an abbreviation and does not have `dataSourceAbbrev`. This is inconsistent with telemetry stations. I realize that it may be difficult to change at this point. From Doug: [No Action]
`cfs` to `af` units is different. To get the automated test to run without warnings I had to multiply the REST web service values by `1.00025216`. I recommend that the State confirm that the conversion factor is the same for HydroBase database calculations and web services. From Doug: [Doug] Found that HydroBase was using 1.984 instead of 1.9835. Have changed views in HydroBase that will take effect in the next database snapshot release. Previous releases will be slightly off. This change will likely break automated tests that I run for TSTool (when I get a new HydroBase) and will change CDSS model input slightly for larger streamflow numbers due to the number of digits used. This type of change is hard on automated testing because it requires running different tests on different versions of the database, which takes resources to configure. This is just the way it is.
`Evap`, `Precip`, and `Solar`. This means that TSTool time series identifiers that only use the data type would be ambiguous for `Day` interval. One solution is to use a location type of `climate` in the time series identifier as in `climate:StationId.DataType.DataSource.DataInterval`. However, it appears that climate stations do not have `measType` values that overlap telemetry stations. Therefore, I am going to try to avoid putting `climate:` in front of time series identifiers because TSID should be unique. This will have to change if ambiguity results in the future. From Doug: [No Action]. This is mainly a TSTool convention and the State can comment on documentation when I have that done.
`dateFormat` does not seem to work. I tried using `dateFormat=dateOnly` since frost dates have precision to day. From Doug: https://dwr.state.co.us/Rest/GET/api/v2/climatedata/climatestationfrostdates/?dateFormat=dateOnly&stationNum=1 is formatting results correctly. My bad on this. It does work. My feedback, similar to a previous comment, is that in such a case it makes sense that the default date format would just be date (no time) since time is not used.
`Total` monthly values do not seem to be computed for the following `measType`: `MinTemp`, `MaxTemp`, `MeanTemp`, `SnowDepth`, `SnowSWE`, `Solar`, `VP`, `Wind`. This makes sense for some because the values are instantaneous. However, a monthly temperature total such as the total of `MeanTemp` could be used for degree days when evaluating climate change. Total solar radiation for a month may also be useful for consumptive use modeling but I'd have to research the units more. Total wind run is useful to understand how windy a location is and can be used in consumptive use modeling. Kelley Thompson might have opinions on such things. I have the queries working and now need to write automated tests so I'll know soon how the web services compare to HydroBase database queries. From Doug: [Doug] These are new summations that previously have not been produced. As mentioned above, this potential request should be discussed with the CWCB/DWR Team to see if it is something we want to provide.
Overall, my questions have been answered and issues addressed. Remaining issues are related to improving documentation, which the State can do during normal maintenance. Inconsistencies in the API are unfortunately baked in at this point because people are using the API. Perhaps inconsistencies can be addressed in the future if a new version is released. The new time series that I have suggested can be discussed with the CDSS group. Enabling these will require changes to TSTool since I am currently filtering out the statistics that have all nulls in the results so as to not confuse users.
I will keep this issue open until a TSTool release is made because I may find some additional issues as I finalize the tests.
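As a worked check of the `cfs` to `af` discrepancy discussed above, the arithmetic can be sketched as follows. It assumes the web services use 1.9835 and HydroBase used 1.984, as Doug's reply suggests; the 1000 cfs and 31 day values are made up purely to show the size of the effect.

```python
SECONDS_PER_DAY = 86400
SQUARE_FEET_PER_ACRE = 43560

# Acre-feet produced by 1 cfs flowing for one day, before rounding.
exact_factor = SECONDS_PER_DAY / SQUARE_FEET_PER_ACRE

# Ratio between the two rounded factors mentioned above.
ratio = 1.984 / 1.9835

# Hypothetical month: average flow of 1000 cfs for 31 days.
flow_cfs = 1000.0
days = 31
volume_1984 = flow_cfs * days * 1.984
volume_19835 = flow_cfs * days * 1.9835
difference_af = volume_1984 - volume_19835

print(f"exact factor: {exact_factor:.6f}")
print(f"ratio 1.984/1.9835: {ratio:.8f}")
print(f"difference for the month: {difference_af:.2f} af")
```

The ratio is close to the multiplier I had to apply in the test, and the difference only becomes visible for larger streamflow volumes.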
The following are issues found during testing. I added many tests and believe the following are unresolved. Links are provided to the current test command file because the content below may be out of date. I will publish a software release when done putting together the tests so that the State can download and run them themselves. All testing was done with `HydroBase_20210322`. These issues may be indicative of similar issues at other stations. Although it would be possible to compare all stations in HydroBase with all stations in web services, that comparison is beyond the scope of software automated tests.
`MeanTemp-Avg.Month` Test fails
The following TSTool command file shows differences:
# Test reading a MeanTemp-Avg month interval time series from ColoradoHydroBaseRest web service using a TSID.
# - Compare the resulting time series with that retrieved from HydroBase
# - allow some missing based on database
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Avg_Month.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Avg_Month_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="2015-01",InputEnd="2018-12")
# FTC01 - FORT COLLINS
FTC01.CoAgMet.MeanTemp-Avg.Month~HydroBaseWeb
FTC01.CoAgMet.TempMean.Month~HydroBase
# Make sure that enough data are available in the test data, and some missing
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(Statistic="MissingCount",CheckCriteria=">",CheckValue1=0,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="FTC01.CoAgMet.MeanTemp-Avg.Month",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Avg_Month_out.dv",Precision=2)
CompareTimeSeries(Tolerance=".01",IfDifferent=Warn)
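For readers who do not use TSTool, the kind of tolerance-based comparison that the last command performs can be illustrated roughly as follows. This is a Python sketch under my own assumptions about handling of missing values, not TSTool's implementation, and the sample numbers are made up.

```python
import math


def _is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_differences(series1, series2, tolerance=0.01):
    """Count dates whose values differ by more than the tolerance.

    A value that is missing in only one series counts as a difference;
    a value missing in both series does not.
    """
    differences = 0
    for date in sorted(set(series1) | set(series2)):
        v1, v2 = series1.get(date), series2.get(date)
        if _is_missing(v1) and _is_missing(v2):
            continue
        if _is_missing(v1) or _is_missing(v2) or abs(v1 - v2) > tolerance:
            differences += 1
    return differences


# Made-up monthly values standing in for web service and database results.
web_service = {"2015-01": 30.12, "2015-02": 33.40, "2015-03": None}
database = {"2015-01": 30.125, "2015-02": 33.90, "2015-03": None}

print(count_differences(web_service, database, tolerance=0.01))  # 1
```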
`MeanTemp-Max.Month` Test fails
I'm not sure why the following command file fails. I may not be understanding the contents of the data. Monthly statistics computed on daily statistics are confusing.
# Test reading a MeanTemp-Max month interval time series from ColoradoHydroBaseRest web service using a TSID.
# - Compare the resulting time series with that retrieved from HydroBase
# - allow some missing based on database
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Max_Month.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Max_Month_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="2015-01",InputEnd="2018-12")
# FTC01 - FORT COLLINS
FTC01.CoAgMet.MeanTemp-Max.Month~HydroBaseWeb
FTC01.CoAgMet.TempMeanMax.Month~HydroBase
# Make sure that enough data are available in the test data, and some missing
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(Statistic="MissingCount",CheckCriteria=">",CheckValue1=0,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="FTC01.CoAgMet.MeanTemp-Max.Month",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Max_Month_out.dv",Precision=2)
CompareTimeSeries(Tolerance=".01",IfDifferent=Warn)
`MeanTemp-Min.Month` Test fails
I'm not sure why the following command file fails. I may not be understanding the contents of the data. Monthly statistics computed on daily statistics are confusing.
# Test reading a MeanTemp-Min month interval time series from ColoradoHydroBaseRest web service using a TSID.
# - Compare the resulting time series with that retrieved from HydroBase
# - allow some missing based on database
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Min_Month.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Min_Month_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="2015-01",InputEnd="2018-12")
# FTC01 - FORT COLLINS
FTC01.CoAgMet.MeanTemp-Min.Month~HydroBaseWeb
FTC01.CoAgMet.TempMeanMin.Month~HydroBase
# Make sure that enough data are available in the test data, and some missing
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(Statistic="MissingCount",CheckCriteria=">",CheckValue1=0,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="FTC01.CoAgMet.MeanTemp-Min.Month",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_MeanTemp-Min_Month_out.dv",Precision=2)
CompareTimeSeries(Tolerance=".01",IfDifferent=Warn)
`Snow.Day` Test fails
The following test fails. The HydroBase database has a much shorter period and values are slightly different, although it has more recent data and web services does not.
# Test reading a Snow day interval time series from ColoradoHydroBaseRest web service using a TSID.
# - Compare the resulting time series with that retrieved from HydroBase
# - allow a high number of missing based on database, due to winter, etc.
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Snow_Day.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Snow_Day_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="1949-04-06",InputEnd="2018-12-31")
# USC00053005 - FORT COLLINS
USC00053005.NOAA.Snow.Day~HydroBaseWeb
USC00053005.NOAA.Snow.Day~HydroBase
# Make sure that enough data are available in the test data, and some missing
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(Statistic="MissingCount",CheckCriteria=">",CheckValue1=11501,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="USC00053005.NOAA.Snow.Day",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Snow_Day_out.dv",Precision=2)
CompareTimeSeries(Tolerance=".01",IfDifferent=Warn)
`Snow-Max.Month` Test fails
The `Snow-Total.Month` test also fails.
The following test fails. The HydroBase database and web service values for the last part of the time series are different. I may just need to use a newer version of HydroBase that has updated data.
# Test reading a Snow-Max month interval time series from ColoradoHydroBaseRest web service using a TSID.
# - compare the resulting time series with that retrieved from HydroBase
# - HydroBase does not include the monthly statistic so compute from daily and then compare
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Snow-Max_Month.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Snow-Max_Month_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="2014-08",InputEnd="2018-05")
# USC00053005 - FORT COLLINS
USC00053005.NOAA.Snow-Max.Month~HydroBaseWeb
USC00053005.NOAA.Snow.Day~HydroBase
SetInputPeriod(InputStart="2014-08-01",InputEnd="2018-05-31")
NewStatisticMonthTimeSeries(TSID="USC00053005.NOAA.Snow.Day",Alias="USC00053005-HydroBase-Month",NewTSID="USC00053005..Snow-Max.Month",Statistic=Max)
# Make sure that enough data are available in the test data, and some missing.
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(TSList=AllMatchingTSID,TSID="*Month",Statistic="MissingCount",CheckCriteria=">",CheckValue1=0,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="USC00053005.NOAA.Snow-Max.Month",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Snow-Max_Month_out.dv",Precision=2)
CompareTimeSeries(TSID1="USC00053005.NOAA.Snow-Max.Month",TSID2="USC00053005-HydroBase-Month",Tolerance=".01",IfDifferent=Warn)
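Because HydroBase does not store this monthly statistic, the test above derives it from daily values before comparing. The idea can be sketched in Python as follows. This is not how `NewStatisticMonthTimeSeries` is implemented; the function name and the sample values are made up, and months with no non-missing values are simply omitted here.

```python
from collections import defaultdict


def monthly_statistic(daily, statistic=max):
    """Group {"YYYY-MM-DD": value} by month and apply a statistic to non-missing values."""
    groups = defaultdict(list)
    for day, value in daily.items():
        if value is not None:
            groups[day[:7]].append(value)  # "YYYY-MM" is the month key
    return {month: statistic(values) for month, values in sorted(groups.items())}


# Made-up daily snow values, with one missing day.
daily_snow = {
    "2014-12-01": 0.0,
    "2014-12-15": 4.5,
    "2014-12-31": None,
    "2015-01-03": 2.0,
    "2015-01-04": 1.0,
}

print(monthly_statistic(daily_snow, max))  # {'2014-12': 4.5, '2015-01': 2.0}
print(monthly_statistic(daily_snow, sum))  # {'2014-12': 4.5, '2015-01': 3.0}
```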
`SnowDepth.Day` Test fails
The following test fails. The patterns are similar but numbers are different. Do I need an updated HydroBase?
# Test reading a SnowDepth day interval time series from ColoradoHydroBaseRest web service using a TSID.
# - compare the resulting time series with that retrieved from HydroBase
# - allow a high number of missing based on database
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth_Day.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth_Day_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="1949-04-06",InputEnd="2018-12-31")
# USC00053005 - FORT COLLINS
USC00053005.NOAA.SnowDepth.Day~HydroBaseWeb
USC00053005.NOAA.SnowCourseDepth.Day~HydroBase
# Make sure that enough data are available in the test data, and some missing
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(Statistic="MissingCount",CheckCriteria=">",CheckValue1=11501,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="USC00053005.NOAA.SnowDepth.Day",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth_Day_out.dv",Precision=2)
CompareTimeSeries(Tolerance=".01",IfDifferent=Warn)
`SnowDepth-Avg.Month` Test fails
The following test fails, likely because the daily data test fails.
# Test reading a SnowDepth-Avg month interval time series from ColoradoHydroBaseRest web service using a TSID.
# - compare the resulting time series with that retrieved from HydroBase
# - HydroBase does not include the monthly statistic so compute from daily and then compare
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth-Avg_Month.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth-Avg_Month_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase.
SetInputPeriod(InputStart="2014-08",InputEnd="2018-05")
# USC00053005 - FORT COLLINS
USC00053005.NOAA.SnowDepth-Avg.Month~HydroBaseWeb
USC00053005.NOAA.SnowCourseDepth.Day~HydroBase
SetInputPeriod(InputStart="2014-08-01",InputEnd="2018-05-31")
NewStatisticMonthTimeSeries(TSID="USC00053005.NOAA.SnowCourseDepth.Day",Alias="USC00053005-HydroBase-Month",NewTSID="USC00053005..SnowCourseDepth-Avg.Month",Statistic=Mean)
# Make sure that enough data are available in the test data, and some missing.
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(TSList=AllMatchingTSID,TSID="*Month",Statistic="MissingCount",CheckCriteria=">",CheckValue1=0,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="USC00053005.NOAA.SnowDepth-Avg.Month",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth-Avg_Month_out.dv",Precision=2)
CompareTimeSeries(TSID1="USC00053005.NOAA.SnowDepth-Avg.Month",TSID2="USC00053005-HydroBase-Month",Tolerance=".01",IfDifferent=Warn)
`SnowDepth-Max.Month` Test fails
The following test fails, likely because the daily data test fails. Note that the minimum test passes but that is because all of the values are zero.
# Test reading a SnowDepth-Max month interval time series from ColoradoHydroBaseRest web service using a TSID.
# - compare the resulting time series with that retrieved from HydroBase
# - HydroBase does not include the monthly statistic so compute from daily and then compare
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth-Max_Month.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth-Max_Month_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="2014-08",InputEnd="2018-05")
# USC00053005 - FORT COLLINS
USC00053005.NOAA.SnowDepth-Max.Month~HydroBaseWeb
USC00053005.NOAA.SnowCourseDepth.Day~HydroBase
SetInputPeriod(InputStart="2014-08-01",InputEnd="2018-05-31")
NewStatisticMonthTimeSeries(TSID="USC00053005.NOAA.SnowCourseDepth.Day",Alias="USC00053005-HydroBase-Month",NewTSID="USC00053005..SnowCourseDepth-Max.Month",Statistic=Max)
# Make sure that enough data are available in the test data, and some missing.
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(TSList=AllMatchingTSID,TSID="*Month",Statistic="MissingCount",CheckCriteria=">",CheckValue1=0,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="USC00053005.NOAA.SnowDepth-Max.Month",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_SnowDepth-Max_Month_out.dv",Precision=2)
CompareTimeSeries(TSID1="USC00053005.NOAA.SnowDepth-Max.Month",TSID2="USC00053005-HydroBase-Month",Tolerance=".01",IfDifferent=Warn)
`Solar.Day` Test fails
The following test fails. The HydroBase database and web service values match during the start and end of the period but the middle is totally different. This is odd because other tests for statistics based on the daily data pass.
# Test reading a Solar day interval time series from ColoradoHydroBaseRest web service using a TSID.
# - compare the resulting time series with that retrieved from HydroBase
# - allow a high number of missing based on database
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Solar_Day.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Solar_Day_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="2015-01-01",InputEnd="2020-07-20")
# USC00053005 - FORT COLLINS
FCL01.CoAgMet.Solar.Day~HydroBaseWeb
FCL01.CoAgMet.Solar.Day~HydroBase
# Make sure that enough data are available in the test data, and some missing
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(Statistic="MissingCount",CheckCriteria=">",CheckValue1=153,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="FCL01.CoAgMet.Solar.Day",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Solar_Day_out.dv",Precision=2)
CompareTimeSeries(Tolerance=".1",IfDifferent=Warn)
`Wind.Day` Test fails
The following test fails. The HydroBase database and web service values match during the start and end of the period but the middle is totally different. This is odd because other tests for statistics based on the daily data pass.
# Test reading a Wind day interval time series from ColoradoHydroBaseRest web service using a TSID.
# - Compare the resulting time series with that retrieved from HydroBase
# - allow a high number of missing based on database, due to winter, etc.
StartLog(LogFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Wind_Day.TSTool.log")
RemoveFile(InputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Wind_Day_out.dv",IfNotFound=Ignore)
# Read the same time series from the web service and HydroBase
SetInputPeriod(InputStart="2015-01-01",InputEnd="2020-07-20")
# USC00053005 - FORT COLLINS
FCL01.CoAgMet.Wind.Day~HydroBaseWeb
FCL01.CoAgMet.Wind.Day~HydroBase
# Make sure that enough data are available in the test data, and some missing
CheckTimeSeriesStatistic(Statistic="NonmissingCount",CheckCriteria="<=",CheckValue1=10,IfCriteriaMet=Warn)
CheckTimeSeriesStatistic(Statistic="MissingCount",CheckCriteria=">",CheckValue1=153,IfCriteriaMet=Warn)
WriteDateValue(TSList=LastMatchingTSID,TSID="FCL01.CoAgMet.Wind.Day",OutputFile="Results/Test_TSID_ColoradoHydroBaseRest_CompareHydroBase_Wind_Day_out.dv",Precision=2)
CompareTimeSeries(Tolerance=".1",IfDifferent=Warn)
Doug Stenzel provided a new HydroBase snapshot dated 20220330, which includes fixes for some data load issues and other changes to correct issues identified above. Based on this, I have been fixing tests so that the software checks out. Below are things that I had to deal with.
The issues with snow-related data tests failing have been resolved. I did have to make some software changes as described below.
Previously, the database design used an integer for `vw_CDSS_SnowCrse.depth` and now uses a double. This was causing issues because in the TSTool HydroBase code, `null` (missing) database values, when treated as an integer, use the Java `Integer.MIN_VALUE`, which has a value of `-2147483648`. Floating point missing values use `Double.NaN`. Underlying code was not aware of the database type change, which ended up resulting in `-2147483648` being used in the time series. It is difficult to know when such situations arise without release notes for the database or automated tests that point out the issues, and the problem may have been present for a while. In this case, I changed the data value in the object to a double and handle casting from integer for older databases. There is potential that this will result in minor roundoff issues but probably not a big deal.
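The missing-value problem described above can be illustrated with a short sketch. The real TSTool code is Java; this Python version only shows the idea of mapping the integer sentinel to NaN when values are converted to floating point. The constant and function names are made up for the example.

```python
import math

# Same numeric value as Java's Integer.MIN_VALUE, used as the integer missing indicator.
INT_MISSING = -2147483648


def depth_to_float(raw):
    """Convert a depth that may be an int (older database), a float (newer database),
    or missing (None or the integer sentinel) to a float, using NaN for missing."""
    if raw is None or raw == INT_MISSING:
        return math.nan
    return float(raw)


print(depth_to_float(12))           # 12.0
print(depth_to_float(3.5))          # 3.5
print(depth_to_float(INT_MISSING))  # nan
```

Without the sentinel check, a plain cast would carry `-2147483648` into the time series as if it were data.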
Note that `vw_CDSS_SnowCrse.day` was also changed from text to integer at some point. Presumably the data load at some point used text for a reason? I changed to integer in the code, which makes more sense. If the day is somehow null in the database, then it may show up later as `-2147483648` and should generate an error. There may be an issue on older databases but I tested with recent versions and it works OK. People should not be using old HydroBase databases for snow data since that data can be found online.
The issues with streamflow-related data tests failing have been resolved. The conversion factor from daily average flow to monthly and yearly volume now seems to be the same for HydroBase database and web services.
The issues with wind-related data tests failing have been resolved. Data issues must have been resolved.
Updates to HydroBase have resolved a number of data issues with temperature data but it would be great to confirm the final result of that work. The main confusion seems to be about monthly mean temperatures. Doug Stenzel provided the following examples of temperature stations:
Here are 2 temperature stations
USC00051401 (CASTLE ROCK)
Min/Max only 1893 - 2022
BRL02 (BURLINGTON SOUTH (#2), 6 MI SE BURLINGTON)
Min/Max from 1992 - 2022
Mean from 2015-2022
*Monthly data would use Min/Max up to 2015 and then Mean from then on.
For BRL02 I think he meant Min/Max from 1992 - 2014, and then Mean from 1992 - 2022. I put together a TSTool command file for this station to understand daily mean data, which resulted in the following graph (note the database mean and calculated mean are slightly different on the right side):
Based on some other testing, it seems that HydroBase contains daily mean temperature only if it was provided by the data source. For monthly mean time series, such as `MeanTemp-Avg.Month`, `MeanTemp-Min.Month`, and `MeanTemp-Max.Month`, I have the following questions:
`MeanTemp.Day` is not provided from an original source, do HydroBase and web services estimate as (Min + Max)/2?
`MeanTemp.Day`?
`MeanTemp-*.Month` statistics?
`MeanTemp-Avg.Month` in TSTool but it seems that this statistic is not calculated for web services?
I have released TSTool 14.2.2 with the above cleaned-up tests and documentation and am moving on. The State can clarify if they have the time and energy to do so. I'll leave this open.
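To make the (Min + Max)/2 question above concrete, here is a sketch of one way a daily mean could be filled in when the source does not provide it, and then averaged for a month. Whether HydroBase or the web services actually do this is exactly the open question, so treat this only as an illustration. The function names and temperatures are made up.

```python
def daily_mean(t_min, t_max, t_mean=None):
    """Use the source-provided mean if present, otherwise estimate as (min + max) / 2."""
    if t_mean is not None:
        return t_mean
    if t_min is None or t_max is None:
        return None  # cannot estimate without both extremes
    return (t_min + t_max) / 2.0


def monthly_average(values):
    """Average the non-missing values, or return None if there are none."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Made-up (min, max, mean) tuples for three days.
days = [(20.0, 50.0, None), (22.0, 48.0, 36.5), (None, 45.0, None)]
means = [daily_mean(t_min, t_max, t_mean) for t_min, t_max, t_mean in days]

print(means)                   # [35.0, 36.5, None]
print(monthly_average(means))  # 35.75
```

A monthly value computed this way can differ slightly from one computed only from source-provided means, which may explain small differences like those noted for BRL02.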
The historical web services need to be implemented for surface water and climate stations.