Closed djhocking closed 10 years ago
The newest versions of the ...Rows functions should work for what you are describing. Earlier versions had an error. Call if it would help to talk this over.
On Thu, Oct 30, 2014 at 11:01 PM, Daniel J. Hocking < notifications@github.com> wrote:
Right now, when there are data, the code makes predictions using the autoregressive function. However, when predicting across the entire daymet record, the code just predicts the trend as if there were no error (it predicts without the autoregressive term on the residuals because there are no residuals). For some dates and sites across that daymet record there are data, so we could use the more accurate autoregressive prediction.
To do this, however, we have to either adjust the `firstObsRows` and `evalRows` functions, adjust the data prep functions to combine the daymet data with the observed data, or use an `ifelse()` statement in the `predictTemp` function to do one thing if the date-site is in the observed record and something else if not.
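The `ifelse()` option could look roughly like the sketch below. It assumes the AR1 correction adds the coefficient times the previous day's residual to the trend, which this thread does not spell out. `predict_with_ar1` and its arguments are hypothetical names, not functions from conteStreamTemperature, and the vectors are assumed to hold one site's consecutive days.

```r
# Hypothetical helper, not part of conteStreamTemperature.
# trend: trend-only predictions for consecutive days at one site
# temp:  observed temperatures, NA where there is no observation
# ar1:   AR1 coefficient (assumed to apply to the previous day's residual)
predict_with_ar1 <- function(trend, temp, ar1) {
  n <- length(trend)
  # Residual from the previous day; NA on the first day or without an observation
  prev_resid <- c(NA, (temp - trend)[-n])
  # Fall back to the trend alone when there is no previous residual
  ifelse(is.na(prev_resid), trend, trend + ar1 * prev_resid)
}

trend <- c(10, 11, 12, 13)
temp <- c(NA, 12, NA, NA)
pred <- predict_with_ar1(trend, temp, ar1 = 0.5)
```

Here only the third day has an observed previous-day residual, so only that prediction departs from the trend.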
— Reply to this email directly or view it on GitHub https://github.com/Conte-Ecology/conteStreamTemperature/issues/16.
I have the most up-to-date version of the functions. The problem is that there is no `temp` column in the daymet data because there are no observations. So when I run
    # Requires dplyr (%>%, group_by, filter, select)
    createEvalRows <- function(data) {
      # data$rowNum <- 1:dim(data)[1]
      evalRows <- data %>%
        group_by(deployID) %>%
        filter(date != min(date) & !is.na(temp)) %>%
        select(rowNum)
      # this can be a list or 1 dataframe with different columns. can't be df - diff # of rows
      return(evalRows$rowNum)
    }
dplyr can't look in the `temp` column because it doesn't exist. I could add a `temp` column and fill it with `NA`. That would make every row a `firstObsRow`. That would be okay if we wanted to predict the trend without accounting for the correlation in the residuals with the AR1 coefficient. This makes sense when there are no observations, because then there are no residuals. I see two problems with this approach:
I'm not yet sure of the best approach: either calculate both and then join them, keeping the observed predictions when available, or merge the observed and daymet data before calculating the `firstObsRows` and `evalRows` and doing the predictions. I think the latter is probably best, but I'm not sure of the best way to do it. It will also likely have to be done in small chunks, because I ran out of memory and crashed my laptop when trying to do the daily predictions over the daymet range for just the observed sites in MA.
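One way to work in small chunks is to split by site, so that only a few sites' daily records are being predicted at any one time. The sketch below is an assumption about how that might look: `predict_in_chunks`, `predict_chunk`, and `chunk_size` are hypothetical names, and `predict_chunk` stands in for whatever function actually does the prediction.

```r
# Hypothetical helper, not part of conteStreamTemperature.
# data:          data frame with a `site` column
# predict_chunk: function taking a data frame and returning it with predictions
# chunk_size:    number of sites to process at once
predict_in_chunks <- function(data, predict_chunk, chunk_size = 50) {
  sites <- unique(data$site)
  # Assign sites to consecutive groups of at most chunk_size
  groups <- split(sites, ceiling(seq_along(sites) / chunk_size))
  results <- lapply(groups, function(s) {
    predict_chunk(data[data$site %in% s, , drop = FALSE])
  })
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Toy example: the "prediction" just doubles a made-up trend column
toy <- data.frame(site = rep(c("s1", "s2", "s3"), each = 2), trend = 1:6)
out <- predict_in_chunks(toy, function(d) transform(d, pred = trend * 2),
                         chunk_size = 2)
```

This still holds all results in memory at the end. If that is too much, each chunk's result could be written to disk inside the loop instead of being collected.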
I may have just found an easy solution using options in Kyle's `readStreamTempData` function:
    # climateData is assumed to be loaded already; it is not created in this snippet
    covariateData <- readStreamTempData(timeSeries=FALSE, covariates=TRUE, dataSourceList=dataSource,
                                        fieldListTS=fields, fieldListCD='ALL', directory=dataInDir)
    observedData <- readStreamTempData(timeSeries=TRUE, covariates=FALSE, dataSourceList=dataSource,
                                       fieldListTS=fields, fieldListCD='ALL', directory=dataInDir)
    climateData$site <- as.character(climateData$site)
    # Site covariates vary by site only, so join on site
    tempData <- left_join(climateData, select(covariateData, -Latitude, -Longitude), by=c('site'))
    # date is selected here because the join below is by site and date
    tempData <- left_join(tempData, select(observedData, agency, date, AgencyID, site, temp), by = c("site", "date"))
    tempDataBP <- left_join(tempData, springFallBPs, by=c('site', 'year'))
The idea is that I can get the site covariate (landscape) data separately from the observed temperature data, then join them independently to the climate data. Observed `temp` is `NA` for most of the records, which should work appropriately with the `firstObsRows` and `evalRows` functions.
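A toy version of that join shows the effect on the eval rows. The data frames and the `airTemp` column are made up for illustration, and the grouping is by `site` rather than `deployID` because the toy data has no deployments. The filter is the one from `createEvalRows`.

```r
library(dplyr)

# Made-up climate record: five consecutive days at one site
climate <- data.frame(
  site = "s1",
  date = as.Date("2014-01-01") + 0:4,
  airTemp = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Made-up observations: only days 3 and 4 have a stream temperature
observed <- data.frame(
  site = "s1",
  date = as.Date("2014-01-01") + 2:3,
  temp = c(2.5, 3.1),
  stringsAsFactors = FALSE
)

# Every climate row is kept; temp is NA where nothing was observed
joined <- left_join(climate, observed, by = c("site", "date"))
joined$rowNum <- seq_len(nrow(joined))

# Same filter as createEvalRows, grouped by site for this toy data
evalRows <- joined %>%
  group_by(site) %>%
  filter(date != min(date) & !is.na(temp)) %>%
  ungroup() %>%
  pull(rowNum)
```

With this filter, day 3 counts as an eval row even though day 2 has no observation and therefore no residual. Whether that is the intended behavior depends on how `firstObsRows` is defined, which is not shown here.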