joshuaulrich / xts

Extensible time series class that provides uniform handling of many R time series classes by extending zoo.
http://joshuaulrich.github.io/xts/
GNU General Public License v2.0

apply.yearly starting at different month than January #347

Closed HalfEatenPie closed 3 years ago

HalfEatenPie commented 3 years ago

Description

I have a function that uses endpoints() to run period.apply() over years that start in a month other than January. The version pasted below has been working for me, but I've recently run into problems in some edge cases, and I'd like to understand it better.

I want an apply.yearly() that starts in a different month. For example, my current application needs years that run from October to September, but I also recognize that my code does not work for every starting month (February is one that fails).

Expected behavior

The function should work for a year starting in any month (February to January, October to September, etc.).

Minimal, reproducible example

# Custom Function
apply.wateryear <- function (x, FUN, wateryearmon = 10, on = "days", ...) 
{
  wateryearmon <- as.numeric(wateryearmon)
  yearlyendpoints <- endpoints(x, "years")
  FirstYear <- format(index(x[1]), "%Y")
  FirstDayWY <- as.numeric(format(as.Date(paste(FirstYear, 
    wateryearmon, "01", sep = "-")), "%j"))
  LastDay <- as.numeric(format(as.Date(paste(FirstYear, "12", 
    "31", sep = "-")), "%j"))
  if (on == "days" | on == "day") {
    WYAdjust <- LastDay - FirstDayWY + 1
  }
  else if (on == "months" | on == "month") {
    WYAdjust <- 12 - wateryearmon + 1
  }
  else stop("Unknown periods string (on)")
  yearlyendpoints <- yearlyendpoints - WYAdjust
  yearlyendpoints <- yearlyendpoints[yearlyendpoints > 0]
  if (yearlyendpoints[1] < 0) 
    yearlyendpoints <- c(0, yearlyendpoints)
  if (as.numeric(format(as.Date(index(x[tail(yearlyendpoints, 
    n = 1)])), "%j")) - FirstDayWY < 0) 
    yearlyendpoints <- c(yearlyendpoints[-length(yearlyendpoints)], 
      length(x))
  return(period.apply(x, yearlyendpoints, FUN, ...))
}

Sample Data

# Example Data
sampleData <- structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0.27, 0.02, 0, 0, 0, 0, 0, 0.53, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.78, 0.5, 0, 0, 
0.01, 0, 0.92, 0, 0, 0, 0.9, 0.34, 1.41, 0.57, 0, 0, 0, 0, 0, 
0, 0, 0.01, 0, 0, 0, 0, 0, 0, 0.11, 0.19, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.29, 0, 0, 
0, 0, 0, 0.32, 1.82, 0.01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0.95, 0.63, 2.18, 3.57, 1.49, 0, 1.32, 1.03, 1.24, 
1.45, 0.31, 0, 0.28, 0, 0.26, 0, 0, 0.26, 0.11, 0, 0.12, 0, 0, 
0.24, 0.28, 0, 0, 0.99, 0.75, 0, 0.21, 0.04, 0.56, 0, 0, 0, 0, 
0.07, 0.06, 0, 0, 0.13, 0.03, 0.19, 0.81, 0, 0, 0, 0, 0, 0, 0.01, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.44, 
0, 0, 0, 0, 0, 0, 0, 0.45, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0.11, 0.13, 0.14, 0.49, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.05, 
0.19, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0.2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.24, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0.74, 0, 0, 0, 0, 0, 0, 0, 0, 0.03, 0, 0, 0, 0, 0, 0, 0, 0.05, 
2.15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.03, 0.5, 0, 0, 0.03, 0, 0, 
0, 0, 0, 0, 0, 0, 0.45, 0, 0.03, 0, 0, 0, 0.2, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0.98, 0.01, 0.43, 0.04, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.02, 2.1, 4.39, 5.45, 
0.28, 0, 0, 0, 0, 0, 0, 0, 0, 0.19, 0, 0.93, 0.51, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.07, 0.02, 0.22, 
0, 0, 0, 0, 0, 0.82, 0.01, 1.09, 0.02, 0, 0, 0, 0, 0.5, 0.43, 
0, 0, 0.14, 2.75, 0, 0, 0, 0.45, 0.19, 0, 0, 0, 0.57, 0.93, 0.56, 
0.37, 0.45, 0.29, 0, 0, 0.36, 1.31, 0, 0, 0.17, 0.45, 0.62, 1.13, 
0, 0, 0, 0, 0.2, 0.08, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.06, 1.16, 
0.16, 1.25, 0.07, 0, 0, 0.25, 0, 0, 0, 0, 0, 0.06, 0, 0, 0, 0, 
0, 0.08, 0.03, 0.04, 0, 0, 0, 0, 0, 0, 0, 0.11, 0.24, 0.01, 0.01, 
0, 1.83, 0, 0.06, 0.19, 0.32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0.11, 0, 0, 0, 0, 0.05, 0, 0, 0, 0, 0, 0, 0, 0.44, 0, 0, 
0, 0, 0.01, 0.01, 0.12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), class = c("xts", 
"zoo"), index = structure(c(-260409600, -260323200, -260236800, 
-260150400, -260064000, -259977600, -259891200, -259804800, -259718400, 
-259632000, -259545600, -259459200, -259372800, -259286400, -259200000, 
-259113600, -259027200, -258940800, -258854400, -258768000, -258681600, 
-258595200, -258508800, -258422400, -258336000, -258249600, -258163200, 
-258076800, -257990400, -257904000, -257817600, -257731200, -257644800, 
-257558400, -257472000, -257385600, -257299200, -257212800, -257126400, 
-257040000, -256953600, -256867200, -256780800, -256694400, -256608000, 
-256521600, -256435200, -256348800, -256262400, -256176000, -256089600, 
-256003200, -255916800, -255830400, -255744000, -255657600, -255571200, 
-255484800, -255398400, -255312000, -255225600, -255139200, -255052800, 
-254966400, -254880000, -254793600, -254707200, -254620800, -254534400, 
-254448000, -254361600, -254275200, -254188800, -254102400, -254016000, 
-253929600, -253843200, -253756800, -253670400, -253584000, -253497600, 
-253411200, -253324800, -253238400, -253152000, -253065600, -252979200, 
-252892800, -252806400, -252720000, -252633600, -252547200, -252460800, 
-252374400, -252288000, -252201600, -252115200, -252028800, -251942400, 
-251856000, -251769600, -251683200, -251596800, -251510400, -251424000, 
-251337600, -251251200, -251164800, -251078400, -250992000, -250905600, 
-250819200, -250732800, -250646400, -250560000, -250473600, -250387200, 
-250300800, -250214400, -250128000, -250041600, -249955200, -249868800, 
-249782400, -249696000, -249609600, -249523200, -249436800, -249350400, 
-249264000, -249177600, -249091200, -249004800, -248918400, -248832000, 
-248745600, -248659200, -248572800, -248486400, -248400000, -248313600, 
-248227200, -248140800, -248054400, -247968000, -247881600, -247795200, 
-247708800, -247622400, -247536000, -247449600, -247363200, -247276800, 
-247190400, -247104000, -247017600, -246931200, -246844800, -246758400, 
-246672000, -246585600, -246499200, -246412800, -246326400, -246240000, 
-246153600, -246067200, -245980800, -245894400, -245808000, -245721600, 
-245635200, -245548800, -245462400, -245376000, -245289600, -245203200, 
-245116800, -245030400, -244944000, -244857600, -244771200, -244684800, 
-244598400, -244512000, -244425600, -244339200, -244252800, -244166400, 
-244080000, -243993600, -243907200, -243820800, -243734400, -243648000, 
-243561600, -243475200, -243388800, -243302400, -243216000, -243129600, 
-243043200, -242956800, -242870400, -242784000, -242697600, -242611200, 
-242524800, -242438400, -242352000, -242265600, -242179200, -242092800, 
-242006400, -241920000, -241833600, -241747200, -241660800, -241574400, 
-241488000, -241401600, -241315200, -241228800, -241142400, -241056000, 
-240969600, -240883200, -240796800, -240710400, -240624000, -240537600, 
-240451200, -240364800, -240278400, -240192000, -240105600, -240019200, 
-239932800, -239846400, -239760000, -239673600, -239587200, -239500800, 
-239414400, -239328000, -239241600, -239155200, -239068800, -238982400, 
-238896000, -238809600, -238723200, -238636800, -238550400, -238464000, 
-238377600, -238291200, -238204800, -238118400, -238032000, -237945600, 
-237859200, -237772800, -237686400, -237600000, -237513600, -237427200, 
-237340800, -237254400, -237168000, -237081600, -236995200, -236908800, 
-236822400, -236736000, -236649600, -236563200, -236476800, -236390400, 
-236304000, -236217600, -236131200, -236044800, -235958400, -235872000, 
-235785600, -235699200, -235612800, -235526400, -235440000, -235353600, 
-235267200, -235180800, -235094400, -235008000, -234921600, -234835200, 
-234748800, -234662400, -234576000, -234489600, -234403200, -234316800, 
-234230400, -234144000, -234057600, -233971200, -233884800, -233798400, 
-233712000, -233625600, -233539200, -233452800, -233366400, -233280000, 
-233193600, -233107200, -233020800, -232934400, -232848000, -232761600, 
-232675200, -232588800, -232502400, -232416000, -232329600, -232243200, 
-232156800, -232070400, -231984000, -231897600, -231811200, -231724800, 
-231638400, -231552000, -231465600, -231379200, -231292800, -231206400, 
-231120000, -231033600, -230947200, -230860800, -230774400, -230688000, 
-230601600, -230515200, -230428800, -230342400, -230256000, -230169600, 
-230083200, -229996800, -229910400, -229824000, -229737600, -229651200, 
-229564800, -229478400, -229392000, -229305600, -229219200, -229132800, 
-229046400, -228960000, -228873600, -228787200, -228700800, -228614400, 
-228528000, -228441600, -228355200, -228268800, -228182400, -228096000, 
-228009600, -227923200, -227836800, -227750400, -227664000, -227577600, 
-227491200, -227404800, -227318400, -227232000, -227145600, -227059200, 
-226972800, -226886400, -226800000, -226713600, -226627200, -226540800, 
-226454400, -226368000, -226281600, -226195200, -226108800, -226022400, 
-225936000, -225849600, -225763200, -225676800, -225590400, -225504000, 
-225417600, -225331200, -225244800, -225158400, -225072000, -224985600, 
-224899200, -224812800, -224726400, -224640000, -224553600, -224467200, 
-224380800, -224294400, -224208000, -224121600, -224035200, -223948800, 
-223862400, -223776000, -223689600, -223603200, -223516800, -223430400, 
-223344000, -223257600, -223171200, -223084800, -222998400, -222912000, 
-222825600, -222739200, -222652800, -222566400, -222480000, -222393600, 
-222307200, -222220800, -222134400, -222048000, -221961600, -221875200, 
-221788800, -221702400, -221616000, -221529600, -221443200, -221356800, 
-221270400, -221184000, -221097600, -221011200, -220924800, -220838400, 
-220752000, -220665600, -220579200, -220492800, -220406400, -220320000, 
-220233600, -220147200, -220060800, -219974400, -219888000, -219801600, 
-219715200, -219628800, -219542400, -219456000, -219369600, -219283200, 
-219196800, -219110400, -219024000, -218937600, -218851200, -218764800, 
-218678400, -218592000, -218505600, -218419200, -218332800, -218246400, 
-218160000, -218073600, -217987200, -217900800, -217814400, -217728000, 
-217641600, -217555200, -217468800, -217382400, -217296000, -217209600, 
-217123200, -217036800, -216950400, -216864000, -216777600, -216691200, 
-216604800, -216518400, -216432000, -216345600, -216259200, -216172800, 
-216086400, -2.16e+08, -215913600, -215827200, -215740800, -215654400, 
-215568000, -215481600, -215395200, -215308800, -215222400, -215136000, 
-215049600, -214963200, -214876800, -214790400, -214704000, -214617600, 
-214531200, -214444800, -214358400, -214272000, -214185600, -214099200, 
-214012800, -213926400, -213840000, -213753600, -213667200, -213580800, 
-213494400, -213408000, -213321600, -213235200, -213148800, -213062400, 
-212976000, -212889600, -212803200, -212716800, -212630400, -212544000, 
-212457600, -212371200, -212284800, -212198400, -212112000, -212025600, 
-211939200, -211852800, -211766400, -211680000, -211593600, -211507200, 
-211420800, -211334400, -211248000, -211161600, -211075200, -210988800, 
-210902400, -210816000, -210729600, -210643200, -210556800, -210470400, 
-210384000, -210297600, -210211200, -210124800, -210038400, -209952000, 
-209865600, -209779200, -209692800, -209606400, -209520000, -209433600, 
-209347200, -209260800, -209174400, -209088000, -209001600, -208915200, 
-208828800, -208742400, -208656000, -208569600, -208483200, -208396800, 
-208310400, -208224000, -208137600, -208051200, -207964800, -207878400, 
-207792000, -207705600, -207619200, -207532800, -207446400, -207360000, 
-207273600, -207187200, -207100800, -207014400, -206928000, -206841600, 
-206755200, -206668800, -206582400, -206496000, -206409600, -206323200, 
-206236800, -206150400, -206064000, -205977600, -205891200, -205804800, 
-205718400, -205632000, -205545600, -205459200, -205372800, -205286400, 
-205200000, -205113600, -205027200, -204940800, -204854400, -204768000, 
-204681600, -204595200, -204508800, -204422400, -204336000, -204249600, 
-204163200, -204076800, -203990400, -203904000, -203817600, -203731200, 
-203644800, -203558400, -203472000, -203385600, -203299200, -203212800, 
-203126400, -203040000, -202953600, -202867200, -202780800, -202694400, 
-202608000, -202521600, -202435200, -202348800, -202262400, -202176000, 
-202089600, -202003200, -201916800, -201830400, -201744000, -201657600, 
-201571200, -201484800, -201398400, -201312000, -201225600, -201139200, 
-201052800, -200966400, -200880000, -200793600, -200707200, -200620800, 
-200534400, -200448000, -200361600, -200275200, -200188800, -200102400, 
-200016000, -199929600, -199843200, -199756800, -199670400, -199584000, 
-199497600, -199411200, -199324800, -199238400, -199152000, -199065600, 
-198979200, -198892800, -198806400, -198720000, -198633600, -198547200, 
-198460800, -198374400, -198288000, -198201600, -198115200, -198028800, 
-197942400, -197856000, -197769600, -197683200, -197596800, -197510400, 
-197424000), tzone = "UTC", tclass = "Date"), .Dim = c(730L, 
1L))
apply.wateryear(sampleData, sum, 2, 'days')

Am I using endpoints() correctly? Am I using period.apply() correctly? What is the easiest or best way to generate the endpoints? I'd like to understand this better, and I'd also like the function to behave consistently with apply.monthly() and the other apply.*() functions. Can I get some guidance, please?

Session Info

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] readxl_1.3.1            VC2copula_0.1.1         VineCopula_2.4.1        copula_1.0-1            SpatialExtremes_2.0-9  
 [6] lmomco_2.3.6            PearsonDS_1.1           fitdistrplus_1.1-3      survival_3.1-12         MASS_7.3-51.6          
[11] RColorBrewer_1.1-2      ggsci_2.9               reshape2_1.4.4          lubridate_1.7.9           
[16] directlabels_2020.12.29              doParallel_1.0.16       iterators_1.0.13        foreach_1.5.1          
[21] gridExtra_2.3           xts_0.12.1              zoo_1.8-8               forcats_0.5.0           stringr_1.4.0          
[26] dplyr_1.0.2             purrr_0.3.4             readr_1.3.1             tidyr_1.1.0             tibble_3.0.3           
[31] ggplot2_3.3.2           tidyverse_1.3.0        

loaded via a namespace (and not attached):
 [1] colorspace_1.4-1    ellipsis_0.3.1      class_7.3-17        fs_1.4.1            rstudioapi_0.13     farver_2.0.3       
 [7] gsl_2.1-6           fansi_0.4.1         mvtnorm_1.1-1       xml2_1.3.2          codetools_0.2-16    splines_4.0.2      
[13] Lmoments_1.3-1      spam_2.6-0          jsonlite_1.6.1      broom_0.7.2         dbplyr_2.0.0        stabledist_0.7-1   
[19] compiler_4.0.2      httr_1.4.2          backports_1.1.8     assertthat_0.2.1    Matrix_1.2-18       cli_2.2.0          
[25] tools_4.0.2         dotCall64_1.0-0     gtable_0.3.0        glue_1.4.1          maps_3.3.0          tinytex_0.24       
[31] Rcpp_1.0.4.6        cellranger_1.1.0    vctrs_0.3.2         nlme_3.1-148        xfun_0.15           metR_0.9.0         
[37] rvest_0.3.6         lifecycle_0.2.0     goftest_1.2-2       scales_1.1.1        gstat_2.0-6         hms_0.5.3          
[43] fields_11.6         curl_4.3            quantmod_0.4.18     memoise_1.1.0       reshape_0.8.8       stringi_1.4.6      
[49] maptools_1.0-2      pcaPP_1.9-73        e1071_1.7-4         checkmate_2.0.0     TTR_0.24.2          intervals_0.15.2   
[55] rlang_0.4.7         pkgconfig_2.0.3     lattice_0.20-41     hydroGOF_0.4-0      labeling_0.4.2      tidyselect_1.1.0   
[61] hydroTSM_0.6-0      plyr_1.8.6          magrittr_2.0.1      R6_2.5.0            generics_0.1.0      ADGofTest_0.3      
[67] automap_1.0-14      DBI_1.1.0           pillar_1.4.7        haven_2.3.1         foreign_0.8-80      withr_2.3.0        
[73] mgcv_1.8-31         sp_1.4-4            spacetime_1.2-3     pspline_1.0-18      modelr_0.1.8        crayon_1.3.4       
[79] utf8_1.1.4          isoband_0.2.3       data.table_1.13.4   FNN_1.1.3           reprex_0.3.0        digest_0.6.25      
[85] numDeriv_2016.8-1.1 stats4_4.0.2        munsell_0.5.0       quadprog_1.5-8     

joshuaulrich commented 3 years ago

I'm closing this because it's not an issue with the package, so there's nothing for me to fix. This would also be better to ask on R-SIG-Finance or StackOverflow. That said, I will still try to help.

It's not clear to me what you're trying to do. Each endpoint you pass to period.apply() should be the position of the last observation in an aggregation period. The vector should always start with 0 and end with nrow(x). All the other points are up to you to calculate correctly.
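
To make that convention concrete, here is a minimal sketch on toy data (the series `x` is made up for illustration and is not the data from the report). endpoints() returns a vector that starts with 0 and ends with nrow(x), and period.apply() aggregates the rows between consecutive endpoints.

```r
library(xts)

# Toy series: 5 days at the end of 2020 followed by 5 days at the start of 2021.
x <- xts(1:10, as.Date("2020-12-27") + 0:9)

# Calendar-year endpoints: 0, the last row of 2020, and nrow(x).
ep <- endpoints(x, on = "years")
ep
# 0 5 10

# Rows 1-5 are summed into one value and rows 6-10 into another.
res <- period.apply(x, ep, sum)
as.numeric(res)
# 15 40
```

Any custom endpoint vector passed to period.apply() should keep the same shape: a leading 0, increasing row positions, and nrow(x) as the final element.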

Also, you can use .indexyear(), .indexyday(), and the other .index*() functions to make some of your calculations easier.
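
As a rough sketch of how the .index*() functions could be used for this, the helper below labels each observation with the year in which its 12-month period starts and places an endpoint wherever that label changes. The name `shifted_year_endpoints` and its `start_month` argument are hypothetical and not part of xts, and the toy series only mimics the date range of the sample data (730 daily values starting 1961-10-01).

```r
library(xts)

# Hypothetical helper (not part of xts): endpoints for 12-month periods
# that begin in `start_month` instead of January.
shifted_year_endpoints <- function(x, start_month = 10) {
  mon  <- .indexmon(x) + 1       # .indexmon() is 0-based (0 = January)
  year <- .indexyear(x) + 1900   # .indexyear() counts years since 1900
  # Label each row with the calendar year in which its period starts.
  period <- ifelse(mon >= start_month, year, year - 1)
  # A period ends at every row where the label changes, and at the last row.
  c(0, which(diff(period) != 0), nrow(x))
}

# Toy data: 730 daily observations of 1, from 1961-10-01 to 1963-09-30.
x <- xts(rep(1, 730), seq(as.Date("1961-10-01"), by = "day", length.out = 730))

ep_oct <- shifted_year_endpoints(x, start_month = 10)  # October to September
ep_feb <- shifted_year_endpoints(x, start_month = 2)   # February to January

# Each result row counts the days in one period.
oct_sums <- period.apply(x, ep_oct, sum)
feb_sums <- period.apply(x, ep_feb, sum)
```

Because the periods are derived from the month and year of every observation rather than from day-of-year offsets, this approach should not depend on leap years or on where the series starts. Partial periods at the beginning or end of the data are still aggregated, so they may need to be dropped afterwards.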