USGS-R / regional-hydrologic-forcings-ml

Repo for machine learning models for regional prediction of hydrologic forcing functions. Includes probabilistic seasonal high flow regions for CONUS, and prediction of high flow metrics for selected regions.
Creative Commons Zero v1.0 Universal

Investigate several warnings that appear when the pipeline is built #44

Open jds485 opened 2 years ago

jds485 commented 2 years ago

unique(tar_meta(fields = 'warnings')[!is.na(tar_meta(fields = 'warnings')$warnings),]$warnings)
[1] "One or more parsing issues, see problems for details"
[2] "The following named parsers dont match the column names discharge, discharge_cd"
[3] "Removed 13 rows containing missing values geom_point."

jds485 commented 2 years ago

[1] is caused by NA dates in the peak flow timeseries. We're dropping all NA dates, so these are not an issue. I wonder if we can instead drop the row of data only if the year is NA with something like:

data_out <- readNWISpeak(site_num, startDate, endDate, convertType = FALSE)
data_out <- data_out[which(!is.na(substr(data_out$peak_dt, start = 1, stop = 4))), ]

This may not be compatible with other functions, because the rows kept this way have dates of the form 'YYYY-00-00', which are not valid calendar dates.
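A minimal sketch of the year-only filter, using a made-up data frame in place of the readNWISpeak output (the `peaks` table, its values, and the `year` and `date` columns are assumptions for illustration, not the pipeline's code). It shows that a row with a known year but unknown month and day survives the filter, and that its date is still NA once parsed, which is the compatibility concern:

```r
# Toy stand-in for readNWISpeak(..., convertType = FALSE) output.
# Values are invented; peak_dt is character because nothing was converted.
peaks <- data.frame(
  peak_dt = c("1989-06-03", "1990-00-00", NA, "1991-05-12"),
  peak_va = c("1200", "950", "700", "1430"),
  stringsAsFactors = FALSE
)

# Extract the year from the first four characters; an NA date gives an NA year.
peaks$year <- as.integer(substr(peaks$peak_dt, start = 1, stop = 4))

# Keep every row with a known year, even if month and day are unknown.
peaks_with_year <- peaks[!is.na(peaks$year), ]

# Parsing with an explicit format returns NA for "1990-00-00",
# so downstream code that needs a Date must handle that row.
peaks_with_year$date <- as.Date(peaks_with_year$peak_dt, format = "%Y-%m-%d")

print(peaks_with_year)
```

Keeping a separate integer `year` column lets annual analyses use these rows without relying on a full date.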

[2] These are caused by forgetting to suppress expected warnings in the handling of 4 sites with odd column names. This is addressed in #50
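For reference, one way to silence only an expected warning in base R is a calling handler that muffles messages matching a pattern and lets everything else through. This is a sketch under assumptions: `suppress_expected_warnings` and `noisy_read` are hypothetical names, and this is not necessarily how #50 handles it.

```r
# Hypothetical helper: muffle only warnings whose message matches `pattern`.
suppress_expected_warnings <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w))) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Hypothetical stand-in for reading a site with odd column names.
noisy_read <- function() {
  warning("The following named parsers don't match the column names: discharge, discharge_cd")
  warning("something unexpected")
  "done"
}

# Record the warnings that still get through the helper.
seen <- character(0)
result <- withCallingHandlers(
  suppress_expected_warnings(noisy_read(), pattern = "named parsers"),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(seen)
```

Unlike a blanket suppressWarnings(), this keeps unexpected warnings visible, so they would still show up in tar_meta.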

[3] This is caused by plot_trend_summary. Here are more warnings given by that function:

Warning messages:
1: Removed 2843 rows containing non-finite values (stat_smooth). 
2: Removed 2843 rows containing missing values (geom_point). 
3: Removed 13 rows containing non-finite values (stat_smooth). 
4: Removed 13 rows containing missing values (geom_point). 
5: Removed 13 rows containing non-finite values (stat_smooth). 
6: Removed 13 rows containing missing values (geom_point).
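The counts above can be checked before plotting by counting the rows with missing or non-finite values in the plotted columns. The sketch below uses a made-up `trend` table with assumed column names `year` and `value`; the actual inputs to plot_trend_summary may differ.

```r
# Toy data standing in for the table passed to the plotting function.
trend <- data.frame(
  year  = c(1990, 1991, 1992, 1993, 1994),
  value = c(10.2, NA, 11.5, NaN, 12.1)
)

# A row can be plotted only if both plotted columns are finite.
is_plottable <- is.finite(trend$year) & is.finite(trend$value)
n_dropped <- sum(!is_plottable)

# Report the count explicitly instead of relying on the plotting warning.
if (n_dropped > 0) {
  message("Dropping ", n_dropped, " rows with missing or non-finite values")
}

# Plot from the cleaned table so the layers have nothing to remove.
trend_clean <- trend[is_plottable, ]
```

If the dropped rows are expected, ggplot2 layers such as geom_point() and geom_smooth() also accept na.rm = TRUE, which removes missing values without a warning.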