nationalparkservice / EnvironmentalSetting_Toolkit

Tools supporting the NPS IMD Environmental Setting protocol

Valid rows missing from CSP3 - micro-drought metric #48

Closed llnelson closed 6 years ago

llnelson commented 6 years ago

Records with valid runs are missing from the output. The cause is the use of sapply() to request data, combined with incorrect logic in getRunCounts() when decomposing the nested column pcpn_in_run. Hiccups may also have occurred because of an erroneous check for the countMissing element in the ACIS response: run requests do not return a missing count, since by definition a run cannot contain a missing value.

Example affected uids: 67667, 77478, 79362, 69577

Debugging on 20180518: using lapply() instead of sapply() results in a list of data frames with consistent structure that can be bound into a single data frame and passed to getRunCounts(). Edits were made to getStationMetrics() and cleanNestedLists(), although the latter is not used when producing metric CSP3.

Resulting code:

    metricSource <- lapply(climateStations, function(x) {
      getWxObservations(
        climateStations = x,
        climateParameters = cParam,
        sdate = sdate,
        edate = edate,
        duration = duration,
        interval = interval,
        reduceCodes = rCode,
        maxMissing = 10,
        metric = metric
      )
    })
    metricSourceCombo <- do.call(rbind, metricSource)
    metricSourceComboCleaned <- metricSourceCombo[metricSourceCombo$uid != "no data available", ]
    metricData <- getRunCounts(rawCounts = metricSourceComboCleaned, runLength = 7, metric = metric)
    outputMetricFile(metricData, metric, filePathAndRootname)
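For anyone revisiting this, the sapply() versus lapply() difference can be reproduced in isolation. The sketch below is only illustrative: fetchStation() is a hypothetical stand-in for the per-station request and is not the toolkit's getWxObservations(); the column names are assumptions as well.

```r
# Hypothetical stand-in for a per-station request (NOT getWxObservations()).
# It returns a small data frame with an assumed, fixed set of columns.
fetchStation <- function(uid) {
  data.frame(
    uid = uid,
    date = c("2018-01-01", "2018-01-02"),
    pcpn_in = c(0, 0.2),
    stringsAsFactors = FALSE
  )
}

uids <- c(67667, 77478)

# sapply() tries to simplify the result. With data frames of equal width it
# produces a list-mode matrix (columns x stations), not a list of data frames.
viaSapply <- sapply(uids, fetchStation)

# lapply() always returns a plain list, one data frame per station.
viaLapply <- lapply(uids, fetchStation)

# A list of consistently structured data frames can be row-bound directly.
combined <- do.call(rbind, viaLapply)

print(class(viaSapply))
print(dim(viaSapply))
print(combined)
```

Because sapply() changes its return shape depending on what each call returns, downstream code that expects one row per observation can silently lose records; lapply() followed by do.call(rbind, ...) keeps the shape predictable.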

Some records are coming back as "no data available" even though the stations have data in ACIS. Example uids: 44104, 44837 (chunk 3). Others from chunk 4 (328 stations) and chunk 5 (30 stations) need additional QA checks and diagnosis. 360 stations total.
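One possible starting point for that QA pass is to keep track of which requested uid produced the "no data available" sentinel, since the sentinel overwrites the uid column. This is a rough sketch under assumptions: mockRequest() is a hypothetical stand-in for the real request, and the one-row sentinel shape is a guess at what comes back, not confirmed toolkit behavior.

```r
requestedUids <- c("67667", "44104", "44837")

# Hypothetical stand-in for the per-station request. It is ASSUMED to return
# a one-row sentinel frame when nothing comes back for a station.
mockRequest <- function(uid) {
  if (uid == "67667") {
    data.frame(uid = uid, pcpn_in_run = 7, stringsAsFactors = FALSE)
  } else {
    data.frame(uid = "no data available", pcpn_in_run = NA_real_,
               stringsAsFactors = FALSE)
  }
}

# Name each response by the uid that was requested, so the station identity
# survives even when the response itself only carries the sentinel.
responses <- lapply(requestedUids, mockRequest)
names(responses) <- requestedUids

# Flag responses containing the sentinel and list the stations to re-check.
isEmpty <- vapply(
  responses,
  function(df) any(df$uid == "no data available"),
  logical(1)
)
failedUids <- names(responses)[isEmpty]

print(failedUids)
```

The resulting failedUids vector could then be compared against ACIS directly, chunk by chunk.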