Open ceholden opened 8 years ago
One assumption I'm making in this is that calculating recursive residuals and then the MOSUM calculation is the most expensive bit of running BFAST. Does anyone have a sense of if this is true from benchmarks?
Thanks Chris, and Eliakim (@hamun001 on github) and Achim Zeileis (not on github - but will pas on this conversation) can comment here. As for bfastmonitor() we can start with just the MOSUM implementation a go from there (see also Eliakim's implementation of bfastmonitor() on the google Earth Engine, javascript). bfastmonitor() is more simple/efficient when compared to bfast() as there no iterative decomposition, and break dating per component done. bfastmonitor() fits as seasonal trend model at once and uses the monitor() function for the strucchange package. Once MOSUM works, we should certainly try to implement the CUSUM based tests.
Feedback from Achim: " As for the costs: Everything based on standard OLS residuals is fairly cheap. Recursive or moving/rolling sums cost a little bit but I presume that there are already canned solutions for this in Python - like cumsum() in R. Recursive residuals and recursive or moving/rolling parameter estimates are more costly and take more time. And if you move to full estimation of breakpoints, this is of order O(n^2)."
Thanks a lot Jan & Achim! I guess I'm mostly interested in implementing these in Python to forward the ensemble ideas we discussed yesterday and I've seen certain kinds of data work better with other approaches (L-HV timeseries, for example, work pretty well just using CUSUM).
Speed is a secondary concern and at the end of the day most computation is done via LAPACK or BLAS or whatever in R and Python, so maybe there won't be such a speed improvement. Achim's comments about the computation complexity certainly helps as I'm not so great at thinking on that. I'm very interested to mine as many approaches from real statisticians as possible so a connection to him through you is great!
This has to be a lower priority for me, but let's use this ticket to keep track of things. I have recresid
implemented already with a test, but I'll switch the test data out to something remotely sensed instead of the airquality dataset. As for a plan,
recresid
test to use NDVI timeseries or somethingstrucchange
OLS-MOSUM is easy after this is done.strucchange
bfastmonitor
historic period finding methods like ROC testbfastmonitor
Would be nice to implement CUSUM or MOSUM test statistics in
YATSM
seeing as we already have the recursive residual calculation implemented and validated against thestrucchange
package.We could test and implement the CUSUM or MOSUM on top of recursive residuals or simple OLS residuals to further mirror the
strucchange
package. From there it shouldn't (?) be too hard to go all the way to a full BFAST / BFAST monitor implementation.I figure we should at least get to the MOSUM calculation so we can benchmark computation performance in R versus Python. Depending on the results, maybe there'd be some more interest in implementing all the bells and whistles in BFAST.
Thoughts? @verbe039, @dutri001, @bendv, et al