beanumber / tidychangepoint

Changepoint detection with a tidy interface
https://beanumber.github.io/tidychangepoint/

discrepancy between regions returned by changepoints and regions returned by tidychangepoints #108

Open beanumber opened 1 week ago

beanumber commented 1 week ago

Note that the parameter estimates for the first region agree, but those for the other regions are slightly off.

  library(tidychangepoint)
  y <- segment(DataCPSim, method = "pelt", penalty = "BIC")
  y$segmenter@param.est
#> $mean
#> [1]  35.28356  58.19948  96.76671 156.51950
#> 
#> $variance
#> [1]  126.8758  370.5227  920.9762 2405.9745
  y$model$region_params
#> # A tibble: 4 × 3
#>   region        param_mu param_sigma_hatsq
#>   <chr>            <dbl>             <dbl>
#> 1 [0,547)           35.3              127.
#> 2 [547,822)         58.1              372.
#> 3 [822,972)         96.7              924.
#> 4 [972,1.1e+03]    156.              2442.

  library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
  DataCPSim |>
    as_tibble() |>
    mutate(
      region = rep(1:4, times = diff(c(0, changepoints(y), nobs(y$segmenter)))),
      id = row_number()
    ) |>
    group_by(region) |>
    summarize(
      N = n(), first = min(id), last = max(id), mean = mean(value), var = var(value)
    )
#> # A tibble: 4 × 6
#>   region     N first  last  mean   var
#>    <int> <int> <int> <int> <dbl> <dbl>
#> 1      1   547     1   547  35.3  127.
#> 2      2   275   548   822  58.2  372.
#> 3      3   150   823   972  96.8  927.
#> 4      4   124   973  1096 157.  2426.

Created on 2024-06-17 with reprex v2.1.0

The changepoint package appears to treat each changepoint as the closed right end of its interval, whereas we are treating it as the closed left end.
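To make the two conventions concrete, here is a minimal base R sketch on a made-up toy series. The vector `x` and the changepoint `tau` are hypothetical, not taken from DataCPSim. It also only assumes that the two packages differ in the way described above; it does not reproduce their internals.

```r
# Hypothetical toy series: 10 observations with a level shift after index 5.
x <- c(1, 1, 1, 1, 1, 9, 9, 9, 9, 9)
tau <- 5  # single changepoint (illustrative value)
n <- length(x)

# Convention A: the changepoint is the closed RIGHT end of its interval,
# giving the regions (0, tau] and (tau, n].
right_closed <- cut(seq_len(n), breaks = c(0, tau, n), right = TRUE)

# Convention B: the changepoint is the closed LEFT end of its interval,
# giving the regions [0, tau) and [tau, n].
left_closed <- cut(
  seq_len(n), breaks = c(0, tau, n), right = FALSE, include.lowest = TRUE
)

# Region sizes differ by one observation at the boundary.
table(right_closed)  # 5 and 5 observations
table(left_closed)   # 4 and 6 observations

# The per-region means differ accordingly.
means_right <- tapply(x, right_closed, mean)  # 1 and 9
means_left  <- tapply(x, left_closed, mean)   # 1 and about 7.67
```

Under convention B, observation 5 moves into the second region and pulls its mean down. The mismatch in the reprex above is much smaller than in this toy case, as you would expect if only one boundary observation out of a few hundred were assigned differently.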

beanumber commented 1 week ago

This may reduce to #60