behinger closed this issue 5 years ago
Thank you for the feedback, @behinger! I am working on the docs just now and your comments really help to figure out what to bring to the fore. I should definitely do a vignette introducing the "segments list" and what happens behind the scenes (see #35). A website is in the works and the "articles" menu may contain a few of the things you seek: https://lindeloev.github.io/mcp. I'll let you know when I've written it up.
Yes, change points can occur at "times" with no data because they are modeled as continuous on par_x. This means that if you have e.g. x = 0, 1, 2, 10, 11, 12 and the associated y = 0, 0, 0, 1, 2, 3, and you model it as a plateau followed by a joined slope (y ~ 1, 1 ~ x), the optimal guess would be that the change happened at x = 9, even though that is not observed. Do you think this is desirable, or is there a use case for assigning probability-of-change-point only to observed x-values? TBH, this was just the easiest solution to implement.
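To make the x = 9 example concrete, here is a minimal sketch in plain Python. It is not mcp code and not how mcp fits the model; it only assumes a least-squares fit of "plateau, then joined slope" over a hypothetical grid of candidate change points, and reports the candidate with the smallest error. The function name sse_at and the candidate grid are made up for illustration.

```python
# Data from the example above.
x = [0, 1, 2, 10, 11, 12]
y = [0, 0, 0, 1, 2, 3]


def sse_at(cp):
    """Sum of squared errors of the best plateau + joined slope for change point cp."""
    # Hinge predictor: 0 before the change point, (x - cp) after it.
    h = [max(0.0, xi - cp) for xi in x]
    n = len(x)
    mean_h = sum(h) / n
    mean_y = sum(y) / n
    var_h = sum((hi - mean_h) ** 2 for hi in h)
    cov_hy = sum((hi - mean_h) * (yi - mean_y) for hi, yi in zip(h, y))
    # Simple linear regression of y on the hinge predictor.
    slope = cov_hy / var_h if var_h > 0 else 0.0
    intercept = mean_y - slope * mean_h
    return sum((yi - (intercept + slope * hi)) ** 2 for hi, yi in zip(h, y))


# Hypothetical candidate grid: 0.0, 0.5, ..., 11.5 (includes unobserved x-values).
candidates = [i * 0.5 for i in range(24)]
best_cp = min(candidates, key=sse_at)
print(best_cp)  # prints 9.0
```

The best candidate lands at x = 9 even though no observation exists there, because the slope after the change point extrapolates back to the plateau at exactly that position.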
For return-to-previous-plateau, simply do prior = list(int_3 = "int_1"). That is, you have a 100% prior belief that they are identical. If you want it to just be in the vicinity, do prior = list(int_3 = "dnorm(int_1, 0.001)"); i.e. heavy shrinkage towards the mean of int_1. In both cases, int_1 will be affected by what happens in segment 3 just as much as the other way around. Is this something like what you asked?
Yes, change point analysis is a big thing for (autocorrelated) time-series such as stocks, the stability of critical systems, etc. This is certainly possible to implement, though I have to get past a forest of low-hanging fruit before I get there ;-)
Thanks again!
@behinger Do you recall which packages you had to update? I should update the DESCRIPTION to require up-to-date packages.
If in RStudio, you can type install.packages in the console, press CTRL + UP, and see your recent command history.
Re package versions: No problem, I might just add the current CRAN version of all packages as dependencies, even if some older ones may work. Thanks for catching this.
Could you say a bit more about how the change points appear discrete? I'm sure many others will have the same thoughts, so it would be good to address it in advance. A few comments:
sigma). This should increase the width of the posterior.
I have a hard time wrapping my head around what a "smooth change point" would be. Could you say more? Or does it pertain to the above?
I must admit that the use of priors to do these things feels like a divine revelation :-)
Sure! An example: Here you can see that the changepoints are discrete. Maybe to rephrase my question: What determines how far apart the changepoints are from each other (i.e. the distance between the vertical lines)?
Smooth changepoint:
This timeseries has no simulated autocorrelation. The changepoint is a sigmoid with a width of 4 samples or so. I used the sigmoid because I can differentiate it and thus estimate it using Stan/NUTS.
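A minimal sketch of such a sigmoid transition, in plain Python. The function name, parameter names, and values are illustrative assumptions, not the ones used for the simulation above.

```python
import math


def smooth_step(x, cp, before, after, width):
    """Sigmoid transition from `before` to `after`, centered at change point cp."""
    return before + (after - before) / (1.0 + math.exp(-(x - cp) / width))


# Hypothetical values: change point at sample 50, plateau 0 -> plateau 1.
values = [smooth_step(x, cp=50, before=0.0, after=1.0, width=1.0)
          for x in range(44, 57)]
for x, v in zip(range(44, 57), values):
    print(x, round(v, 3))
```

With width = 1, the curve moves from roughly 12% to 88% of the way between the two plateaus over about 4 samples (cp - 2 to cp + 2), and it is differentiable everywhere, which is what gradient-based samplers need.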
Great catch, that is indeed confusing! This is just plot.mcpfit defaulting to evaluating 100 positions along x for computational reasons. Increasing to 1000 goes from this:
to this:
However, it comes at a computational cost (it's slow) and I don't like putting in an extra argument to plot. I will try and make a solution where it selectively increases the resolution around change points.
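As a rough illustration of why a coarse evaluation grid makes change points look discrete, here is a minimal sketch in plain Python. It is a simplified stand-in, not the plotting code: it assumes a hypothetical x-range of 0 to 100, two made-up posterior draws of the change point, and that a plotted step appears at the first evaluated position at or after the draw.

```python
def grid(lo, hi, n):
    """n evenly spaced evaluation positions from lo to hi (inclusive)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def apparent_jump(cp, points):
    """First evaluated position at or after cp, where a plotted step seems to occur."""
    return next(p for p in points if p >= cp)


coarse = grid(0.0, 100.0, 100)   # spacing of about 1.01
fine = grid(0.0, 100.0, 1000)    # spacing of about 0.10

draws = [40.2, 40.3]  # hypothetical posterior draws of the change point
coarse_jumps = [apparent_jump(cp, coarse) for cp in draws]
fine_jumps = [apparent_jump(cp, fine) for cp in draws]

print(coarse_jumps)  # both draws land on the same grid position
print(fine_jumps)    # the draws land on different grid positions
```

On the coarse grid, nearby draws collapse onto the same position, so the lines bunch up at grid spacing; on the finer grid they separate again.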
OK, all documentation has now been updated. Based on your comments, I added an article about the formula syntax: https://lindeloev.github.io/mcp/articles/formulas.html.
Updated README, now that a lot has been separated out into vignettes/articles: https://github.com/lindeloev/mcp Updated site (frontpage is just the README): https://lindeloev.github.io/mcp/
I also increased the general resolution of plot(fit) four-fold as a temporary fix. I also added some demo datasets, so people can get up and running quicker.
Getting close to release of 0.1!
Wow very nice!
I found this syntax a bit confusing:
Segments:
response ~ 1
response ~ 1 ~ 0 + time
response ~ 1 ~ 1 + time
simply because it differs from the list you put in.
But besides this, it's a very cool package!! Congratulations and thanks a lot.
Thanks! OK, yes. If others raise this as well I would not oppose changing it since you could always derive one representation from the other. It is to enable multivariate change points and variance-change change points in the future (https://github.com/lindeloev/mcp/issues/23) and many other unforeseen things.
Cool! Now I am looking forward to getting some data with a changepoint ;-)
Hey! Great package! Initially I had trouble installing it because some of my packages were outdated. But after updating everything it ran pretty smoothly.
It took me a bit to understand the logic of the list. I think a simple comment in the quick-start would fix this (e.g. "between each entry (a "segment") of the list, a changepoint is modelled"). After understanding this, the toolbox was intuitive to use.
README, sampling the prior: empty = mcp(segments, sample=FALSE). Here it is implicitly assumed that segments defines "x" somehow.
I think I'm a bit confused about what the underlying model is. I get discrete changepoints, but at points where there are no samples. This is still confusing me tbh.
The rel(1) command lets you parameterize the parameter relative to the last segment. Is it also possible to parameterize relative to any other segment? I am thinking of a situation where two changepoints define a plateau that is different, and the signal then goes back to the initial value. I.e. in this example, I might want to assume that the first and last segment / plateau have identical parameters (or at least put a prior that the difference is quite small).
Pretty good job! It worked fine for me so far. I only ran it on simulated data; I still have to check it on real data :-) My problem with real data is that I usually have strong autocorrelation, i.e. changes are not really discrete hinges, but smoothed over time. I guess one could fit plateaus & slopes, but there would still be no smoothness in the fit.