loelschlaeger / fHMM

Hidden Markov models for finance
https://loelschlaeger.de/fHMM/
GNU General Public License v3.0

Error when specifying `states = 5` #83

Closed. zoushucai closed this issue 11 months ago

zoushucai commented 1 year ago

Here is a standard example from the fHMM package:

library(fHMM)
controls <- list(
  states = 3,
  sdds   = "t",
  data   = list(
    file        = dax,
    date_column = "Date",
    data_column = "Close",
    logreturns  = TRUE,
    from        = "2000-01-03",
    to          = "2022-12-31"
  ),
  fit    = list(runs = 200)
)
controls <- set_controls(controls)
dax_data <- prepare_data(controls)
dax_model_3t <- fit_model(dax_data)

This runs without problems, but if I set the number of hidden states to 5, an error is reported:

library(fHMM)
controls <- list(
  states = 5,
  sdds   = "t",
  data   = list(
    file        = dax,
    date_column = "Date",
    data_column = "Close",
    logreturns  = TRUE,
    from        = "2000-01-03",
    to          = "2022-12-31"
  ),
  fit    = list(runs = 200)
)
controls <- set_controls(controls)
dax_data <- prepare_data(controls)
dax_model_3t <- fit_model(dax_data)

The error message is as follows

Checking start values                        
Error: 'Gamma' must be a tpm of dimension 'controls$states[1]'.

loelschlaeger commented 1 year ago

Hi @zoushucai !

Thanks for the report. Unfortunately, I was unable to reproduce the error.

Did you define the `dax` object as in the README via `dax <- download_data(symbol = "^GDAXI", file = NULL, verbose = FALSE)`?

Can you please try `fit_model(dax_data, seed = 1)` and see if the error still occurs?

zoushucai commented 1 year ago

Hello @loelschlaeger, thank you for your reply. The `dax` data I used comes from the fHMM package. If I instead download the data with `dax <- download_data(symbol = "^GDAXI", file = NULL, verbose = FALSE)`, the same error is reported. I also tried other values: the error seems to occur whenever `states` is greater than 3.

A complete example is as follows:

rm(list = ls())
library(fHMM)
dax <- download_data(symbol = "^GDAXI", file = NULL, verbose = FALSE)
# dax = na.omit(dax) # Deleting the NA value seems unrelated to the error
str(dax)
controls <- list(
  states = 5,
  sdds   = "t",
  data   = list(
    file        = dax,
    date_column = "Date",
    data_column = "Close",
    logreturns  = TRUE,
    from        = "2000-01-03",
    to          = "2022-12-31"
  ),
  fit    = list(runs = 200)
)
controls <- set_controls(controls)
dax_data <- prepare_data(controls)
model  <- fit_model(dax_data, seed = 1)

Output:

> rm(list = ls())
> library(fHMM)
Thanks for using {fHMM} version 1.1.0!
With {fHMM}, you can fit (H)HMMs to financial data.
See https://loelschlaeger.de/fHMM for help.
Type 'citation("fHMM")' for citing this R package.
> dax <- download_data(symbol = "^GDAXI", file = NULL, verbose = FALSE)
> # dax = na.omit(dax) # Deleting the NA value seems unrelated to the error
> str(dax)
'data.frame':   9079 obs. of  7 variables:
 $ Date     : chr  "1987-12-30" "1987-12-31" "1988-01-01" "1988-01-04" ...
 $ Open     : num  1005 NA NA 956 996 ...
 $ High     : num  1005 NA NA 956 996 ...
 $ Low      : num  1005 NA NA 956 996 ...
 $ Close    : num  1005 NA NA 956 996 ...
 $ Adj.Close: num  1005 NA NA 956 996 ...
 $ Volume   : int  0 NA NA 0 0 0 0 0 0 0 ...
> controls <- list(
+   states = 5,
+   sdds   = "t",
+   data   = list(
+     file        = dax,
+     date_column = "Date",
+     data_column = "Close",
+     logreturns  = TRUE,
+     from        = "2000-01-03",
+     to          = "2022-12-31"
+   ),
+   fit    = list(runs = 200)
+ )
> controls <- set_controls(controls)
> dax_data <- prepare_data(controls)
> model  <- fit_model(dax_data, seed = 1)
Checking start values                        
Error: 'Gamma' must be a tpm of dimension 'controls$states[1]'.

My environment is as follows (macOS Ventura 13.2.1, Apple M1):

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] fHMM_1.1.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10       codetools_0.2-19  prettyunits_1.1.1 foreach_1.5.2    
 [5] crayon_1.5.2      R6_2.5.1          lifecycle_1.0.3   rlang_1.1.0      
 [9] progress_1.2.2    cli_3.6.1         rstudioapi_0.14   vctrs_0.6.1      
[13] ellipsis_0.3.2    iterators_1.0.14  tools_4.2.2       hms_1.1.2        
[17] compiler_4.2.2    pkgconfig_2.0.3  

With `states = 3`, the fit succeeds but takes a long time, so I do not show the full run here. The code and the progress output are below.

rm(list = ls())
library(fHMM)
dax <- download_data(symbol = "^GDAXI", file = NULL, verbose = FALSE)
# dax = na.omit(dax) # Deleting the NA value seems unrelated to the error
str(dax)
controls <- list(
  states = 3,
  sdds   = "t",
  data   = list(
    file        = dax,
    date_column = "Date",
    data_column = "Close",
    logreturns  = TRUE,
    from        = "2000-01-03",
    to          = "2022-12-31"
  ),
  fit    = list(runs = 200)
)
controls <- set_controls(controls)
dax_data <- prepare_data(controls)
model  <- fit_model(dax_data, seed = 1)

output:

> model  <- fit_model(dax_data, seed = 1)
Checking start values                        
Maximizing likelihood                        
[=>---------------------------]   6%, 18m ETA

loelschlaeger commented 1 year ago

The problem seems to arise when the starting values for the numerical likelihood optimization are generated. On Windows, the example runs without error.
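
For context, the failing check requires `Gamma` to be a transition probability matrix (tpm) whose dimension equals the number of states. The sketch below is not fHMM's internal code. `is_tpm` and `random_tpm` are hypothetical helpers, written in base R, that only illustrate what a valid start value for `states = 5` would have to look like.

```r
# Hypothetical helper (not part of fHMM): checks that x is a square
# numeric matrix of dimension `states` with entries in [0, 1] and
# rows that sum to 1 (up to a small tolerance).
is_tpm <- function(x, states, tol = 1e-8) {
  is.matrix(x) && is.numeric(x) &&
    all(dim(x) == states) &&
    all(x >= 0 & x <= 1) &&
    all(abs(rowSums(x) - 1) < tol)
}

# Hypothetical helper: draws a random tpm by normalising the rows of a
# matrix of uniform draws.
random_tpm <- function(states) {
  x <- matrix(runif(states^2), nrow = states, ncol = states)
  x / rowSums(x)
}

set.seed(1)
Gamma <- random_tpm(5)
is_tpm(Gamma, states = 5)  # TRUE
is_tpm(Gamma, states = 3)  # FALSE: dimension does not match
```

A start value that fails a check of this kind (wrong dimension, or rows that do not sum to 1) would be consistent with the reported error message, although the thread does not state the actual cause.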

@timoadam can you please try to reproduce the error on macOS?

loelschlaeger commented 11 months ago

Fixed with package version 1.2.0.
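
To confirm that an installation already contains the fix, the installed version can be compared against 1.2.0. This is a minimal sketch using base R only. `has_fix` is a hypothetical helper, and the update step assumes the fixed release is available from the configured CRAN mirror.

```r
# Hypothetical helper: TRUE if `installed` is at least the version
# that is reported above as containing the fix.
has_fix <- function(installed, fixed = "1.2.0") {
  package_version(installed) >= package_version(fixed)
}

has_fix("1.1.0")  # FALSE: the version from the report above
has_fix("1.2.0")  # TRUE

# On a machine with fHMM installed:
# has_fix(as.character(utils::packageVersion("fHMM")))
# If this returns FALSE, update with install.packages("fHMM").
```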