MatthieuStigler / tsDyn

selectLSTAR question. #55

Open AlexGron opened 7 months ago

AlexGron commented 7 months ago

Hi again Matthieu! Thanks for answering me so quickly on the mailing list! Here is the reproducible code:

I'm working on an out-of-sample, rolling window & adaptive fit predictive LSTAR model which uses monthly index logged returns.

--------------Reproducible error and changes: After modifying the data processing, I noticed that using continuously compounded returns helped a bit, but not quite enough. In my code (apologies if this is not the right place to paste it, first time using GitHub), I first do some handling of the index prices, then run one rolling window without error handling and another with error handling. My problem now seems more likely to be data related than actually function (selectLSTAR) related, so my questions are the following:

Is there something I'm doing wrong in the code?

If it's my data that is not suitable for the selectLSTAR function, is there any way to "increase" or expand the parameter allowances (if that is understandable) so that selectLSTAR manages to do its automatic selection for the failed windows?

---------The code, uploaded as a .txt file since .R wasn't supported. I also attached my data. The error-handling version might initially fail for the first window on row 65, but running it again from that row lets it run through the working windows and return 0 for the failed ones.
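For readers without the attachment, the rolling-window-with-error-handling idea described above could be sketched roughly as follows. This is not the attached script: `roll_apply_safe`, `fit_fun`, `toy_fit`, and the window width are hypothetical names chosen for illustration, and a toy function stands in for the model selection call so the sketch runs in base R without tsDyn.

```r
# Hypothetical helper: apply fit_fun to each rolling window and record
# NA (plus the error message) instead of stopping when a window fails.
roll_apply_safe <- function(x, width, fit_fun) {
  stopifnot(width >= 1, width <= length(x))
  n_win <- length(x) - width + 1
  out <- vector("list", n_win)
  errs <- rep(NA_character_, n_win)
  for (i in seq_len(n_win)) {
    window <- x[i:(i + width - 1)]
    out[[i]] <- tryCatch(
      fit_fun(window),
      error = function(e) {
        errs[i] <<- conditionMessage(e)  # remember why this window failed
        NA
      }
    )
  }
  list(results = out, errors = errs)
}

# Toy stand-in for the model fit: fails on purpose when the window mean is negative.
toy_fit <- function(w) if (mean(w) < 0) stop("toy failure") else mean(w)

x <- c(1, 2, 3, -10, 1, 2, 3, 4)
res <- roll_apply_safe(x, width = 3, fit_fun = toy_fit)
failed <- which(!is.na(res$errors))  # indices of the windows that failed
```

With tsDyn loaded, one would presumably pass something like `function(w) selectLSTAR(w, m = 2)` as `fit_fun`. Storing the error message per window, rather than only a 0, makes it easier to see afterwards whether all failed windows fail for the same reason.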

Thank you so much in advance! Best regards, Alex G

PriceMonthly.xlsx

ForMatthieu.txt

MatthieuStigler commented 7 months ago

thanks for the report!

Could you please provide a minimum self-reproducible code, ideally using reprex::reprex()?

see also: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

AlexGron commented 7 months ago

ForMatthieuCorrected.txt

Hi again!

Apologies for the earlier attempt. I noticed dplyr was missing from that one, so running it was not straightforward. In this version I've retained the error handling, to show how it fails, along with the plotting for clarity. I kept the data the same, as it might provide insight into the potential issue. Regarding reprex: unfortunately, due to either the error handling in my code or something else unknown, I could not make it work perfectly. I therefore tested the file on a fresh install of R and RStudio, where it worked, so provided that you have tsDyn & dplyr installed it should run :)

Thanks again for the patience! Best regards, Alex G

--------------Some version info

R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2.1

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] tsDyn_11.0.4 dplyr_1.1.4

loaded via a namespace (and not attached):
[1] sandwich_3.0-2 utf8_1.2.4 generics_0.1.3
[4] lattice_0.21-9 magrittr_2.0.3 grid_4.3.2
[7] iterators_1.0.14 foreach_1.5.2 Matrix_1.6-1.1
[10] nnet_7.3-19 deSolve_1.38 forecast_8.21.1
[13] mgcv_1.9-0 fansi_1.0.5 tseriesChaos_0.1-13.1
[16] scales_1.2.1 codetools_0.2-19 mnormt_2.1.1
[19] cli_3.6.1 rlang_1.1.2 munsell_0.5.0
[22] splines_4.3.2 tools_4.3.2 parallel_4.3.2
[25] colorspace_2.1-0 ggplot2_3.4.4 curl_5.1.0
[28] strucchange_1.5-3 vctrs_0.6.4 R6_2.5.1
[31] zoo_1.8-12 lifecycle_1.0.4 tseries_0.10-54
[34] MASS_7.3-60 pkgconfig_2.0.3 urca_1.3-3
[37] pillar_1.9.0 gtable_0.3.4 glue_1.6.2
[40] vars_1.5-9 quantmod_0.4.25 Rcpp_1.0.11
[43] tibble_3.2.1 lmtest_0.9-40 tidyselect_1.2.0
[46] rstudioapi_0.15.0 nlme_3.1-163 xts_0.13.1
[49] timeDate_4022.108 fracdiff_1.5-2 compiler_4.3.2
[52] quadprog_1.5-8 TTR_0.24.3

MatthieuStigler commented 7 months ago

thanks! But I ran it once, and didn't see an error?

Could you make sure to make it minimal, and self-reproducible?

Also, please use reprex::reprex() and paste the output, such as:

log(-2)
#> Warning in log(-2): NaNs produced
#> [1] NaN
log("a")
#> Error in log("a"): non-numeric argument to mathematical function

Created on 2024-02-29 with reprex v2.1.0

Thanks!

AlexGron commented 7 months ago

reprexselectLSTARexample.txt

Hi again! Sorry for the delay, here is my best attempt at this!

What I'm trying to show here is that the selectLSTAR function fails to find a valid combination of hyper-parameter values, whereas choosing similar values manually with the lstar function works. I am simply confused as to why selectLSTAR does not suggest the values I used in the lstar function, i.e. (m = 2, d = 1, steps = 1, mL = 2, mH = 2, thDelay = 1).

This is as minimally reproducible as I can deliver, hope it's enough 👍 Thanks for the help once again! -Alex

# The simplified code snippet that reproduces the error
# Simplified dataset
PriceMonthly <- c(100, 102, 105, 103, 108, 110, 107, 111, 115)

# Calculate log returns
returns_ts <- log(PriceMonthly / lag(PriceMonthly))
returns_ts[1] <- 0 # Handle missing value

# Demonstrate the error
selectLSTAR(returns_ts, m = 2)
> Error in seq.default(start.con$gammaInt[1], start.con$gammaInt[2], length.out = start.con$nGamma): 'to' must be a finite number
lstar(returns_ts, m = 2, d = 1, steps = 1, mL = 2, mH = 2, thDelay = 1)
> Performing grid search for starting values...
> Starting values fixed: gamma = 100 , th = 0.03635342 ; SSE = 0.00014281
> Grid search selected lower/upper bound gamma (was: 1 100 ]).
> Might try to widen bound with arg: 'starting.control=list(gammaInt=c(1,200))'
> Optimization algorithm converged
> Optimized values fixed for regime 2 : gamma = 100 , th = 0.0581123 ; SSE = 0.0001092132
>
> Non linear autoregressive model
>
> LSTAR model
> Coefficients:
> Low regime:
> const.L phiL.1 phiL.2
> 0.03758536 -0.34375719 -0.28942433
>
> High regime:
> const.H phiH.1 phiH.2
> -1.921109 -34.335225 50.089471
>
> Smoothing parameter: gamma = 100
>
> Threshold
> Variable: Z(t) = + (0) X(t) + (1) X(t-1)
>
> Value: 0.05811
# selectLSTAR is unable to find a fit with the m = 2 specification, but manually fitting with lstar() provides a result.

MatthieuStigler commented 7 months ago

almost there! Not sure why but copy/pasting the output did not preserve the format, see below with output.

So ok, it's a bug indeed, confirmed, thanks for the report. I don't know when I'll be able to look at it though, I'm on sabbatical till April.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tsDyn)
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo
# The simplified code snippet that reproduces the error
# Simplified dataset
PriceMonthly <- c(100, 102, 105, 103, 108, 110, 107, 111, 115)

# Calculate log returns
returns_ts <- log(PriceMonthly / lag(PriceMonthly))
returns_ts[1] <- 0  # Handle missing value

# Demonstrate the error
selectLSTAR(returns_ts, m = 2)
#> Error in seq.default(start.con$gammaInt[1], start.con$gammaInt[2], length.out = start.con$nGamma): 'to' must be a finite number
lstar(returns_ts, m = 2, d = 1, steps = 1, mL = 2, mH = 2, thDelay = 1)
#> Performing grid search for starting values...
#> Starting values fixed: gamma =  100 , th =  0.03635342 ; SSE =  0.00014281 
#> Grid search selected lower/upper bound gamma (was:  1 100 ]). 
#>                    Might try to widen bound with arg: 'starting.control=list(gammaInt=c(1,200))'
#> Optimization algorithm converged
#> Optimized values fixed for regime 2  : gamma =  100 , th =  0.0581123 ; SSE =  0.0001092132
#> 
#> Non linear autoregressive model
#> 
#> LSTAR model
#> Coefficients:
#> Low regime:
#>     const.L      phiL.1      phiL.2 
#>  0.03758536 -0.34375719 -0.28942433 
#> 
#> High regime:
#>    const.H     phiH.1     phiH.2 
#>  -1.921109 -34.335225  50.089471 
#> 
#> Smoothing parameter: gamma = 100 
#> 
#> Threshold
#> Variable: Z(t) = + (0) X(t) + (1) X(t-1)
#> 
#> Value: 0.05811
#selectLSTAR is unable to find a fit with the m = 2 specification, but manually fitting with lstar() provides a result.

Created on 2024-03-03 with reprex v2.1.0
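A side note on the return calculation in the reprex: it relies on `lag()` resolving to dplyr's version (the startup messages show `stats::lag` being masked), since `stats::lag` shifts the time base of a series rather than its values. A base-R way to get the same log returns that does not depend on which `lag()` is attached might look like this; it is only a sketch of the data step and does not touch the selectLSTAR error itself.

```r
PriceMonthly <- c(100, 102, 105, 103, 108, 110, 107, 111, 115)

# Continuously compounded (log) returns. The first observation has no
# predecessor, so it is set to 0 as in the snippet above.
returns_ts <- c(0, diff(log(PriceMonthly)))

# Optional cross-check against the original formula, calling dplyr's lag
# by name so the result does not depend on package load order.
if (requireNamespace("dplyr", quietly = TRUE)) {
  alt <- log(PriceMonthly / dplyr::lag(PriceMonthly))
  alt[1] <- 0
  stopifnot(isTRUE(all.equal(returns_ts, alt)))
}
```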

AlexGron commented 7 months ago

Hi again,

My follow-up question is then, is there any way for me to circumvent this issue? For example by downloading an older version of the package? Or has this bug always existed in the selectLSTAR function as long as it has been available? For the windows where the function worked, can I assume it returned the optimal hyper-parameters, and for the windows that failed I would need to manually test with the lstar function? I'm merely trying to figure out how to finish this myself as my thesis has to coincidentally be submitted before April 🙂

Thank you once again for the help and taking the time off your sabbatical for clarifying this bug!

Best regards -Alex

MatthieuStigler commented 7 months ago

I don't think the bug was there before, though you could test that.

The best is to try some debugging:

  1. read about debugging tools
  2. find one example where it works and one where it doesn't
  3. use traceback() after the failing call

Step 3 shows that the offending line is (read bottom to top)

5: stop("'to' must be a finite number")
4: seq.default(start.con$gammaInt[1], start.con$gammaInt[2], length.out = start.con$nGamma)
3: seq(start.con$gammaInt[1], start.con$gammaInt[2], length.out = start.con$nGamma)
  4. Look at the source code (on GitHub) to understand what start.con$gammaInt is. Why/when does this fail, and when doesn't it? Which assumptions were we making?
  5. If you find the issue and a solution, I can easily add this into the development version on GitHub (though pushing to CRAN will take longer)
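As a starting point for the source-code step, the error message itself can be reproduced in base R without tsDyn. The sketch below only shows which kinds of upper bound make `seq()` stop with this message; the values assigned to `gammaInt_bad` are assumptions for illustration, not what tsDyn actually computes, and `err_msg` is a hypothetical helper.

```r
# Hypothetical helper: return the error message raised by an expression,
# or NA if it evaluates without error.
err_msg <- function(expr) {
  tryCatch({
    force(expr)
    NA_character_
  }, error = function(e) conditionMessage(e))
}

gammaInt_ok  <- c(1, 100)  # the default bounds reported in the lstar() output
gammaInt_bad <- c(1)       # assumption: second bound missing, so [2] is NA

ok_grid <- seq(gammaInt_ok[1], gammaInt_ok[2], length.out = 40)

# A non-finite 'to' (NA, NaN or Inf) triggers the error seen in the traceback.
msg_na  <- err_msg(seq(gammaInt_bad[1], gammaInt_bad[2], length.out = 40))
msg_inf <- err_msg(seq(1, Inf, length.out = 40))
```

So the question to answer in the source is how the second element of start.con$gammaInt ends up non-finite on the selectLSTAR path but not when lstar() is called directly. Running `debugonce(selectLSTAR)` before the failing call and stepping through may help locate where that value is set.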

thanks!

AlexGron commented 7 months ago

Hi,

I'll start looking into it the best I can. Thanks for the advice! -Alex