tidyverts / feasts

Feature Extraction And Statistics for Time Series
https://feasts.tidyverts.org/

arch_lm perfect fit #85

Closed mitchelloharawild closed 4 years ago

mitchelloharawild commented 4 years ago

@robjhyndman What would you expect from this feature's output?

The issue with this series is that it drops to zero after 12 observations.


library(tsibbledata)
library(feasts)
#> Loading required package: fabletools
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
PBS %>% 
  filter(ATC2 == "D04", Concession == "General", Type == "Co-payments") %>% 
  features(Cost, stat_arch_lm)
#> Warning in summary.lm(fit): essentially perfect fit: summary may be unreliable
#> # A tibble: 1 x 5
#>   Concession Type        ATC1  ATC2  stat_arch_lm
#>   <chr>      <chr>       <chr> <chr>        <dbl>
#> 1 General    Co-payments D     D04            NaN

Created on 2020-01-06 by the reprex package (v0.3.0)

robjhyndman commented 4 years ago

It's a perfect fit, so R^2 should be 1. Therefore return 1.
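
A minimal sketch of how this suggestion could be handled, assuming the statistic is the R^2 from regressing the squared (demeaned) series on its own lags. The name `arch_lm_r2` and the tolerance check are illustrative assumptions, not the feasts source, and the sketch assumes the series has no missing values:

```r
# Hypothetical sketch, not the feasts implementation.
arch_lm_r2 <- function(x, lags = 12, demean = TRUE) {
  # Assumes no missing values and enough observations for the regression.
  stopifnot(length(x) > lags + 1)
  if (demean) {
    x <- x - mean(x)
  }
  # Column 1 holds x_t^2; the remaining columns hold its first `lags` lags.
  mat <- embed(x^2, lags + 1)
  y <- mat[, 1]
  fit <- lm(y ~ mat[, -1])
  rss <- sum(residuals(fit)^2)
  # Assumption: residuals that are negligible relative to the response
  # count as a perfect fit, so report R^2 = 1 rather than an undefined value.
  if (rss <= .Machine$double.eps * sum(y^2)) {
    return(c(stat_arch_lm = 1))
  }
  c(stat_arch_lm = summary(fit)$r.squared)
}

# A series that drops to zero after 12 observations, like the one reported.
x <- c(1:12, rep(0, 30))
arch_lm_r2(x)
```

Checking the residual sum of squares before calling `summary()` also avoids the "essentially perfect fit" warning, since `summary.lm()` is never reached in that case.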

mitchelloharawild commented 4 years ago

Thought as much.

I also noticed that the condition is length(x) <= 13. Shouldn't this be lags+1? It was copied over from tsfeatures. Further, do you think better handling of the default lags would be useful for small sample sizes?

https://github.com/tidyverts/feasts/blob/8dd8103834b3aed34add35f747a8e226cd1eeeaa/R/features.R#L18-L22

robjhyndman commented 4 years ago

Yes, it should be lags+1. I'm not familiar with this test (and I didn't write the code) so I'm not sure what is recommended for small sample sizes. I'm also not sure where the default of 12 came from. The original paper is Engle (1982) [https://www.jstor.org/stable/1912773].
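
To illustrate why the threshold depends on the number of lags, here is a small sketch. The helper `too_short_for_arch_lm` is a hypothetical name, and the example assumes the lagged regression matrix is built with `embed()`, which leaves `length(x) - lags` rows:

```r
# Hypothetical guard that generalises the hard-coded `length(x) <= 13`
# (which matches lags + 1 only for the default of 12 lags).
too_short_for_arch_lm <- function(x, lags = 12) {
  length(x) <= lags + 1
}

# embed() with dimension lags + 1 keeps length(x) - lags rows.
lags <- 4
x <- c(3, 1, 4, 1, 5, 9)            # 6 observations
rows <- nrow(embed(x^2, lags + 1))  # 6 - 4 = 2 rows

too_short_for_arch_lm(x, lags)       # FALSE: two regression rows remain
too_short_for_arch_lm(x[1:5], lags)  # TRUE: only one row would remain
```

With a fixed threshold of 13, a call with more than 12 lags could pass the check and still leave too few rows for the regression, while a call with fewer lags would reject series that are long enough.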