steve-the-bayesian / BOOM

A C++ library for Bayesian modeling, mainly through Markov chain Monte Carlo, but with a few other methods supported. BOOM = "Bayesian Object Oriented Modeling". It is also the sound your computer makes when it crashes.
GNU Lesser General Public License v2.1

bsts hangs with MonthlyAnnualCycle and short data #46

Closed steve-the-bayesian closed 4 years ago

steve-the-bayesian commented 4 years ago

I've found a couple of series that can be modeled with only a monthly-annual cycle component and as few as 34 days of data, but a series with too few days seems to get stuck in a loop. The following example hangs once it gets down to modeling 60 days.

library(bsts)

data <- zoo(runif(100, 0, 10), seq.Date(from = as.Date("2020-01-01"), by = 1, length.out = 100))

for (i in length(data):1) {
  print(i)
  state <- AddMonthlyAnnualCycle(state.specification = list(), y = data[1:i])
  bsts(data[1:i], state.specification = state, niter = 10, ping = 0)
}

It could be as simple as a test for the length of y in the AddMonthlyAnnualCycle() function, but because I've found spots where it didn't hang until there were even fewer observations, I'm unsure. I assume it might be a matter of needing to have the entirety of at least one full calendar month, with a day on either side? If so, the test would need to be for that.


After testing, it looks like it does require a full calendar month plus one day before and one day after that calendar month. The following runs:

library(bsts)

# 33 days, 2019-12-31 through 2020-02-01: all of January plus one day on each side
data <- zoo(runif(33, 0, 10), seq.Date(from = as.Date("2019-12-31"), by = 1, length.out = 33))
state <- AddMonthlyAnnualCycle(state.specification = list(), y = data)
bsts(data, state.specification = state, niter = 10, ping = 0)

Shortening the data by one observation on either end will cause it to hang. The following logical statement tests this on a tibble of data (it evaluates to TRUE when the data are too short):

!tbl_data %>%
  dplyr::count(month = lubridate::floor_date(index, unit = "month")) %>%
  dplyr::mutate(
    days_in_month = lubridate::days_in_month(month),
    test = ifelse(n == days_in_month & dplyr::lag(n) > 0 & dplyr::lead(n) > 0, TRUE, FALSE)
  ) %>%
  dplyr::pull(test) %>%
  any(na.rm = TRUE)

(I realize that it relies on dplyr, lubridate, and tibble, but since those are all already loaded in fable.bsts it isn't additional overhead. I simply haven't thought through a method using base R or just lubridate.)
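For reference, a base R version of this check might look something like the sketch below. The function name `has_full_month_with_neighbors` is made up for illustration and is not part of bsts or BOOM. It tests the stricter condition found by experiment above (a complete calendar month plus the day immediately before and after it), which is an assumption about what the sampler needs, not something confirmed from the BOOM source. Note that it returns TRUE when the data are long enough, the opposite sense of the negated dplyr expression.

```r
# Hypothetical helper (not part of bsts): TRUE when `dates` covers at least one
# complete calendar month plus the day immediately before and after it.
has_full_month_with_neighbors <- function(dates) {
  dates <- sort(unique(as.Date(dates)))
  if (length(dates) == 0) return(FALSE)

  # First day of every calendar month touched by the data.
  month_starts <- unique(as.Date(format(dates, "%Y-%m-01")))

  for (k in seq_along(month_starts)) {
    start <- month_starts[k]
    # First day of the following month; seq() handles varying month lengths.
    next_start <- seq(start, by = "month", length.out = 2)[2]
    # Required days: last day of the previous month through the first day
    # of the next month, inclusive.
    needed <- seq(start - 1, next_start, by = "day")
    if (all(needed %in% dates)) return(TRUE)
  }
  FALSE
}

# The 33-day window from the example above, then the same window with the
# first day dropped.
ok <- seq(as.Date("2019-12-31"), by = 1, length.out = 33)
has_full_month_with_neighbors(ok)      # TRUE
has_full_month_with_neighbors(ok[-1])  # FALSE
```

For a zoo series the dates can be passed as `index(data)`. A guard of this kind could be run before calling AddMonthlyAnnualCycle() so that short series fail fast instead of hanging.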