Closed. sklesse closed this issue 4 years ago.
Are there any intentions to include variance stabilization following Frank et al. (2007, GRL)? I only have a running version that relies on the ts() class, but I guess that could be easily rewritten to fit the "rwl" class objects.
Cheers, Stefan
Stefan, I was meaning to tackle this but the timing was bad! Love to include this in dplR. And I've got some time to putter on the package now as I clean things up for doing substantial work. Do you want to add your code as a contributor here? Or send it to me and I'll wrangle it in and add you directly.
Hi Andy, I think it would be great to add it to dplR. So far, it is written for ts() formatted objects, but I should be able to rearrange most of it to be compatible with dplR-native data.frames in the next few days. I will upload the code here.
Cool. Go for it. If it works on class ts, then wrangling it into rwl should be trivial, as rwl is just a data.frame with the years as the rownames. You can get the years out of a rwl object using time(rwl) as needed.
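For example, with the co021 dataset that ships with dplR (nothing fancy, just showing the structure):

library(dplR)
data(co021)            # example rwl object shipped with dplR
class(co021)           # "rwl" "data.frame" -- an rwl is just a data.frame
head(rownames(co021))  # years are stored as the rownames
head(time(co021))      # the same years as a numeric vector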
Ok, here it is. I just had to rewrite some lines to get rid of the ts classes.
The function was originally written by David Frank. It follows Frank et al. (2006, TRACE proceedings) or Frank et al. (2007, GRL) and returns uncorrected chronologies equivalent to chron(x, biweight=FALSE or TRUE), together with their simple rbar-corrected (Osborn et al., 1997, Figure 2d) and running-rbar-adjusted counterparts. I only modified a couple of lines for speed.
The actual desired output is column 6, the biweight running-rbar-adjusted chronology. One could delete the lines referring to the simple rbar-adjusted chronologies, or keep them. Analogous to the chron function, one could also add flags to the function arguments, e.g. biweight=TRUE/FALSE and running.rbar=TRUE/FALSE, and return only two of the chronologies (uncorrected and corrected) instead of all six variants; see the sketch after the code below.
require(zoo)  # for rollapply() and na.locf()
require(dplR) # for tbrm(), Tukey's biweight robust mean

# x  = detrended data.frame of ring-width indices, as returned by detrend()
# WL = window length (years) for the running rbar;
#      default is 51 years, Frank et al. (2007) use 100 years
stabilizevariance <- function(x, WL = 51) {

  # count, for every pair of series, how many years they overlap
  observationoverlap <- function(x) {
    presencematrix <- 0 * as.matrix(x) + 1
    presencematrix[is.na(presencematrix)] <- 0
    t(presencematrix) %*% presencematrix
  }

  # mean interseries correlation (rbar) within one window, ignoring pairs
  # that overlap over less than a third of the window
  runningrforSTABILIZEVARIANCE <- function(data, win.length = WL) {
    cor.mat <- cor(data, use = "pairwise.complete.obs")
    diag(cor.mat) <- NA
    obs.overlap.mat <- observationoverlap(data)
    cor.mat[obs.overlap.mat < (win.length / 3)] <- NA
    mean(cor.mat, na.rm = TRUE)
  }

  mean.x <- mean(rowMeans(x, na.rm = TRUE))  # pre-process the data to have a mean of zero
  x <- x - mean.x
  samplesize <- rowSums(!is.na(x))
  chronology <- rowMeans(x, na.rm = TRUE)
  biwgt.chronology <- apply(x, 1, tbrm)
  correlmatrix <- cor(x, use = "pairwise.complete.obs")
  diag(correlmatrix) <- NA
  MEAN.R <- mean(correlmatrix, na.rm = TRUE)  # overall rbar

  rbar.running <- rollapply(x, WL, runningrforSTABILIZEVARIANCE,
                            by = 1, by.column = FALSE, fill = NA)
  rbar.running <- na.locf(rbar.running, na.rm = FALSE)             # fill last WL/2 values
  rbar.running <- rev(na.locf(rev(rbar.running), na.rm = FALSE))   # fill first WL/2 values
  rbar.running[samplesize == 0] <- NA

  # effective sample size, Osborn et al. (1997): n' = n / (1 + (n - 1) * rbar)
  effsamplesize <- samplesize / (1 + (samplesize - 1) * rbar.running)
  # pmin() sets effsamplesize to 1 when the sample depth is 1, and if rbar goes
  # negative (so that effsamplesize exceeds samplesize) it brings it back down
  effsamplesize <- pmin(effsamplesize, samplesize, na.rm = TRUE)
  simpleeffsamplesize <- samplesize / (1 + (samplesize - 1) * MEAN.R)

  # rescale the mean-value functions by sqrt(effective sample size * rbar)
  RUNNINGadjustedchronology <- chronology * (effsamplesize * MEAN.R)^0.5
  RUNNINGadjusted.biwgt.chronology <- biwgt.chronology * (effsamplesize * MEAN.R)^0.5
  SIMPLEadjustedchronology <- chronology * (simpleeffsamplesize * MEAN.R)^0.5
  SIMPLEadjusted.biwgt.chronology <- biwgt.chronology * (simpleeffsamplesize * MEAN.R)^0.5

  everything <- cbind(chronology, biwgt.chronology,
                      SIMPLEadjustedchronology, SIMPLEadjusted.biwgt.chronology,
                      RUNNINGadjustedchronology, RUNNINGadjusted.biwgt.chronology,
                      samplesize, effsamplesize, rbar.running)
  rownames(everything) <- rownames(x)
  # add the mean back onto the six chronology columns
  everything <- scale(everything, center = c(rep(-mean.x, 6), rep(0, 3)), scale = FALSE)
  return(everything)
}

ts.plot(stabilizevariance(x, WL = 51)[, c(2, 4, 6)], col = 1:3)
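To sketch what the flag version mentioned above could look like, here is a thin wrapper around stabilizevariance(); the function and argument names are only placeholders, and the column numbers refer to the output as defined above.

chron.stabilized.sketch <- function(x, WL = 51, biweight = TRUE, running.rbar = TRUE) {
  out <- stabilizevariance(x, WL = WL)
  raw.col <- if (biweight) 2 else 1          # uncorrected chronology
  adj.col <- if (running.rbar) {             # variance-corrected chronology
    if (biweight) 6 else 5
  } else {
    if (biweight) 4 else 3
  }
  data.frame(std = out[, raw.col],
             adj = out[, adj.col],
             samp.depth = out[, 7],
             row.names = rownames(out))
}

# e.g. the biweight, running-rbar-adjusted variant (column 6 above):
# crn <- chron.stabilized.sketch(x, WL = 51)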
Great. I'll port this over. Thanks.
@sklesse -- I haven't forgotten about this, but I haven't gotten around to it yet. Should have some time soon.
Ok. Check out the chron.stabilize function in the repo. Just committed it. Pretty rough now. Let's discuss.
Hi Andy, looks great and is very fast! :-)
One little "error" is in line 34: the loop should read for(i in 1:(nrow(x0)-winLength+1)){, but your initial code has for(i in 1:(nrow(x0)-winLength-1)){. This results in two running rbar values not being evaluated. In the default setting (winLength=51) there should be 25 NAs at the end of movingRbarVec; with your code there are 27. Changing -1 to +1 in the for-loop does the trick.
Cool. Thanks for that @sklesse.
Questions.
I think there should be some bounds, or at least warnings, on window length. E.g., what should the min/max length be? A minimum of 10% and a maximum of 50% of the chronology length? And should the function allow even numbers to be passed in and just increment them by 1? The reason for the number needing to be odd won't be obvious to users.
Next, should the corMat[overlapMatrix < (WL/3)] <- NA on line 15 be hard-coded as 1/3, or can that be an argument to the function?
And the help file will need work. Can you send some verbiage?
I think that's it for now...
Those are good questions.
I think a hard minimum of 31 years wouldn't harm, or at least a warning flag if one chooses a shorter window. Computing correlations over shorter periods is not very powerful and might even produce artifacts (this is just a gut feeling; I am not aware of a formal test on this matter). What are the most commonly used rbar/EPS windows in ARSTAN/dplR? I'd say between 30 and 50 years, so the 51-year default is a good starting point. In David's GRL paper they use a 100-year window for a 1000-year-long chronology. In the supplement they show how the chronology differs when using a 50- or 200-year window: the former keeps, and the latter loses, more low frequency in the early, poorly replicated part.
So, yes, a warning at 50% of the chronology length, as you suggested, also wouldn't harm.
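Something along these lines, maybe? Just a sketch of the checks we are talking about; the thresholds and wording are placeholders, and winLength and x are named as in your repo version.

checkWinLength <- function(winLength, nYrs) {
  # possible sanity checks on the running rbar window length
  if (winLength > nYrs)
    stop("winLength is longer than the chronology")
  if (winLength < 31)
    warning("winLength < 31 years: running rbar over such short windows can be unstable")
  if (winLength > 0.5 * nYrs)
    warning("winLength is more than half the chronology length")
  invisible(winLength)
}
# e.g. call checkWinLength(winLength, nrow(x)) near the top of the function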
odd/even numbers: Why not? We would just have to add:

movingRbarVec <- rep(NA, nrow(x0))
if (winLength %% 2 == 1) {
  # odd window: the window can be centred exactly on a year
  for (i in 1:(nrow(x0) - winLength + 1)) {
    movingRbarVec[i + (winLength - 1) / 2] <- rbarWinLength(x0[i:(i + winLength - 1), ])
  }
} else {
  # even window: use winLength + 1 rows so the window stays centred
  for (i in 1:(nrow(x0) - winLength)) {
    movingRbarVec[i + winLength / 2] <- rbarWinLength(x0[i:(i + winLength), ])
  }
}
corMat: I would leave it hard-coded. There should be a certain overlap with the chosen window for the correlation pair to be considered.
I will have a look at the help file later today. Cheers, Stefan
Ok. I added warnings for short (<30) or long (>50% of the chronology) winLengths and an error if winLength is greater than nrow(x). I changed the odd/even winLength handling as above and simplified the padding of the NAs. Can you double-check the indexing on the odd/even split? E.g.,
data(co021)
co021.rwi <- detrend(co021, method = "Spline")
# create a little subset for tests
foo <- co021.rwi[301:400, ]
# drop columns that are empty or incomplete over this period
foo <- foo[, !is.na(colSums(foo))]
# check with a silly short winLength -- note warning
foo2 <- chron.stabilized(foo, winLength = 10, running.rbar = TRUE)
head(foo2, 10) # first four values get padded.
tail(foo2, 10) # last five values get padded.
foo2 <- chron.stabilized(foo, winLength = 11, running.rbar = TRUE)
head(foo2, 10) # first five values get padded.
tail(foo2, 10) # last six values get padded. Should be last five?
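If I have the centring right, a window of odd length W should leave (W - 1) / 2 padded values at each end, i.e. five at the head and five at the tail for winLength = 11, which is what makes the six at the tail look like an off-by-one. A tiny throwaway check (the helper name is made up):

expectedPadding <- function(winLength) {
  # expected NA padding at each end of a centred running window of odd length
  stopifnot(winLength %% 2 == 1)
  (winLength - 1) / 2
}
expectedPadding(11)  # 5
expectedPadding(51)  # 25, matching the 25 NAs mentioned further up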
I'm closing this as an issue -- it's in the dev version and almost tidied up.