marchtaylor / sinkr

A collection of functions with emphasis on multivariate methods and handling of geographic datasets
GNU General Public License v3.0
36 stars 12 forks source link

Issue with fieldAnonaly applied on daily data #1

Closed pascaloettli closed 8 years ago

pascaloettli commented 8 years ago

I think the fieldAnomaly function is not correct for daily dataset.

# Generating daily dates between 1983-01-01 and 1984-12-31
# with a leap year (1984)

x <- as.POSIXlt(seq.Date(as.Date("1983-01-01"), as.Date("1984-12-31"), by="day"))

level="daily"
if(level=="daily"){levs=unique(x$yday)}

levs_lookup=vector("list", length(levs))
names(levs_lookup) <- levs
for(i in 1:length(levs)){
  if(level=="daily"){levs_lookup[[i]] <- which(x$yday == names(levs_lookup[i]))}
}

Here is the problem. For the same index (60), we get different dates:

j <- 60
x[levs_lookup[[j]]]
# [1] "1983-03-01 UTC" "1984-02-29 UTC"

And here, only December 31st, 1984

j <- 366
x[levs_lookup[[j]]]
# [1] "1984-12-31 UTC"
marchtaylor commented 8 years ago

Thanks for your comment. This could be solved by using text strings for days:

x <- as.POSIXlt(seq.Date(as.Date("1983-01-01"), as.Date("1984-12-31"), by="day"))

level="daily"
if(level=="daily"){levs=levs=unique(format(x, format="%m%d"))}

levs_lookup=vector("list", length(levs))
names(levs_lookup) <- levs
for(i in 1:length(levs)){
  if(level=="daily"){levs_lookup[[i]] <- which(format(x, format="%m%d") == names(levs_lookup[i]))}
}

j <- 60
x[levs_lookup[[j]]]
# [1] "1983-03-01 UTC" "1984-03-01 UTC"

j <- 366
x[levs_lookup[[j]]]
# [1] "1984-02-29 UTC"

But, I think I may still leave the function as julian day and specify this more clearly in the function description.

Cheers

marchtaylor commented 8 years ago

I've updated the function and added some additional examples. There does indeed seem to be some issues with using julian day (or else I have made a mistake with my example). I would have thought that subtracting the julian day means would bring the anomalies closer to zero, but in fact the month-date combination proved better (see last example).

marchtaylor commented 8 years ago

My example was a bit incorrect. I've provided a more straightforward example containing a yearly signal and a monthly one based on julian day. The function can now correctly remove means by month, date, or julian day