joshuaulrich / TTR

Technical analysis and other functions to construct technical trading rules with R
GNU General Public License v2.0

"Series contains non-leading NAs" on stoch() #113

Open DataStrategist opened 3 years ago

DataStrategist commented 3 years ago

Description

I'm trying to run stoch() on some data, but I get the error quoted in the title.

Expected behavior

I expect the function to return its usual three-column output.

Minimal, reproducible example

library(TTR)

hlc <- rep(22646.53, 40)
df <-
  structure(
    list(High =
           c(22643.3, 22615.8, 22620.1, 22594.0, 22561.5, 22562.4,
             22609.4, 22619.0, 22654.6, 22659.0, 22661.5, hlc),
         Low =
           c(22582.5, 22538.7, 22550.0, 22540.1, 22520.1, 22515.7,
             22524.2, 22589.5, 22578.5, 22585.0, 22633.3, hlc),
         Close =
           c(22610.4, 22561.4, 22592.9, 22553.8, 22545.3, 22524.8,
             22603.8, 22613.9, 22629.7, 22644.1, 22646.5, hlc)),
      row.names = 42550:42600, class = "data.frame")

stoch(df)
#> Error in runSum(x, n): Series contains non-leading NAs

Session Info

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] quantmod_0.4.18 xts_0.12.1      TTR_0.24.2      zoo_1.8-8       binancer_1.1.2 
 [6] forcats_0.5.0   stringr_1.4.0   dplyr_1.0.2     purrr_0.3.4     readr_1.3.1    
[11] tidyr_1.1.1     tibble_3.1.0    ggplot2_3.3.2   tidyverse_1.3.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5        lubridate_1.7.9   lattice_0.20-41   listenv_0.8.0    
 [5] ps_1.6.0          assertthat_0.2.1  digest_0.6.27     utf8_1.1.4       
 [9] R6_2.5.0          cellranger_1.1.0  backports_1.1.10  reprex_0.3.0     
[13] evaluate_0.14     httr_1.4.2        pillar_1.5.0      rlang_0.4.10     
[17] curl_4.3          readxl_1.3.1      rstudioapi_0.13   data.table_1.13.0
[21] callr_3.5.1       whisker_0.4       furrr_0.1.0       blob_1.2.1       
[25] rmarkdown_2.5     munsell_0.5.0     broom_0.7.0       xfun_0.18        
[29] compiler_4.0.2    modelr_0.1.8      pkgconfig_2.0.3   clipr_0.7.1      
[33] htmltools_0.5.0   globals_0.13.0    tidyselect_1.1.0  codetools_0.2-16 
[37] fansi_0.4.2       future_1.19.1     crayon_1.4.1      dbplyr_1.4.4     
[41] withr_2.4.1       grid_4.0.2        jsonlite_1.7.2    gtable_0.3.0     
[45] lifecycle_1.0.0   DBI_1.1.0         magrittr_2.0.1    scales_1.1.1     
[49] cli_2.3.1         stringi_1.5.3     fs_1.5.0          snakecase_0.11.0 
[53] xml2_1.3.2        logger_0.1        ellipsis_0.3.1    generics_0.1.0   
[57] vctrs_0.3.6       tools_4.0.2       glue_1.4.2        hms_0.5.3        
[61] processx_3.4.5    parallel_4.0.2    yaml_2.2.1        colorspace_2.0-0 
[65] rvest_0.3.6       knitr_1.29        haven_2.3.1

DataStrategist commented 3 years ago

Found it. It freaks out if High == Low == Close. Is that expected behavior?

braverock commented 3 years ago

Yes, that is expected behavior. There is no volatility, so the math doesn't work. You get div by zero.
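The arithmetic can be seen from the usual fast %K formula, (Close - lowest Low) / (highest High - lowest Low). The following is a minimal base R sketch using the constant value from the report; the 14-bar window and the variable names are assumptions for illustration, not TTR internals.

```r
# A window of identical bars, as in the tail of the reported data.
high  <- rep(22646.53, 14)
low   <- rep(22646.53, 14)
close <- rep(22646.53, 14)

num <- close[14] - min(low)   # 0: the close sits exactly at the window low
den <- max(high) - min(low)   # 0: the window has no range at all

fast_k <- num / den           # 0 / 0
is.nan(fast_k)                # TRUE
is.na(fast_k)                 # TRUE: R also treats NaN as missing
```

Strictly speaking this is the 0/0 case, which yields NaN, rather than a nonzero value divided by zero, which would yield Inf.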

DataStrategist commented 3 years ago

Fair enough. May I make a gentle recommendation, as a package builder, to "capture that error", perhaps using try or purrr::safely, and then output a more human-readable error using stop? But otherwise, it's cool... anyway, these strategies don't really work on 1-minute candlesticks (which is where you can get exactly the same values). :)

You may close if you wish.
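The "capture and re-raise" idea suggested above could look roughly like the sketch below. The wrapper name call_with_readable_error and the stand-in fake_indicator are hypothetical and not part of TTR; the stand-in only reproduces the error message so the example runs without TTR installed.

```r
# Hypothetical wrapper (not part of TTR): run an indicator function and
# translate the low-level runSum() message into a more descriptive error.
call_with_readable_error <- function(indicator, ...) {
  tryCatch(
    indicator(...),
    error = function(e) {
      msg <- conditionMessage(e)
      if (grepl("non-leading NAs", msg, fixed = TRUE)) {
        stop("Indicator produced NA/NaN values mid-series; ",
             "check for bars where High == Low == Close. ",
             "Original error: ", msg, call. = FALSE)
      }
      stop(e)  # re-raise any other error unchanged
    }
  )
}

# Stand-in that fails with the same message as in the report.
fake_indicator <- function(x) stop("Series contains non-leading NAs")

result <- tryCatch(
  call_with_readable_error(fake_indicator, 1:10),
  error = function(e) conditionMessage(e)
)
result
```

With TTR loaded, the equivalent call would be call_with_readable_error(stoch, df).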

braverock commented 3 years ago

These technical indicators were developed on daily bars.

I think NA is the appropriate result. How you choose to handle those NAs is up to you. There are several arguably reasonable responses to an indicator returning an NA. You, as the analyst, need to decide what the proper handling is for the analysis or strategy you are trying to create.

Thanks for the report, closing.
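To make "multiple reasonable responses" concrete, here is a small base R sketch of three common ways to treat NAs in an indicator series. The values are made up for illustration and are not stoch() output, and carry_forward is a hypothetical helper.

```r
# Illustrative indicator values with gaps (made-up numbers).
k <- c(NA, NA, 0.42, 0.57, NA, NA, 0.63)

# Option 1: drop the missing observations.
dropped <- k[!is.na(k)]

# Option 2: replace them with a neutral midpoint.
neutral <- ifelse(is.na(k), 0.5, k)

# Option 3: carry the last observed value forward; leading NAs stay NA.
carry_forward <- function(x) {
  idx <- cumsum(!is.na(x))        # position of the latest non-NA seen so far
  c(NA, x[!is.na(x)])[idx + 1]    # idx 0 maps to the leading NA slot
}
carried <- carry_forward(k)
```

Which of these is right depends on the strategy: dropping changes the alignment with the price series, while the other two keep the length but invent values.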

joshuaulrich commented 3 years ago

> There is no volatility, so the math doesn't work. You get div by zero.

While that's true, stoch() tries to account for it. It sets fastK to 0.5 if it's not NA and not finite. I did that to fix #52, but it only works for Inf, not NaN. This is a bug that should be fixed.

At minimum, I agree with @DataStrategist that the error should be more informative.
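Why a "not NA and not finite" check catches Inf but misses NaN can be shown in a few lines of base R. The guard expressions below are an assumption about the general shape of such a check, based on the description above, and not a copy of TTR's source.

```r
x <- c(0.25, Inf, -Inf, NaN, NA)

# "Not NA and not finite" matches only Inf and -Inf,
# because is.na() is TRUE for NaN as well as for NA.
guard_inf_only <- !is.na(x) & !is.finite(x)
guard_inf_only   # FALSE TRUE TRUE FALSE FALSE

# Adding is.nan() picks up the 0/0 case while leaving genuine NAs alone.
guard_inf_nan <- is.nan(x) | (!is.na(x) & !is.finite(x))
guard_inf_nan    # FALSE TRUE TRUE TRUE FALSE

x[guard_inf_nan] <- 0.5
x                # 0.25 0.50 0.50 0.50 NA
```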

DataStrategist commented 3 years ago

Yeah, I was gonna let it rest because it doesn't really matter, but the reason the message "Error in runSum(x, n): Series contains non-leading NAs" is technically incorrect is that there are no NAs in my original reprex.

Anyway, thanks for a great package, and good luck trading!

joshuaulrich commented 3 years ago

The error occurs because the fastK calculation creates NAs, not because they're in your original series. That's why the error needs to be more informative... or the case should be handled automatically by setting Inf and NaN to 0.5.

I may add a warning to let the user know that their series isn't well-suited to using stoch() because of its invariant characteristics. If I do, I would also add a warning = TRUE arg you could set to FALSE to suppress it.
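A rough sketch of what that combination (automatic 0.5 plus a suppressible warning) might look like is below. fast_k_flat_safe is a hypothetical base R helper written for this illustration. It is not TTR's implementation, and the real stoch() signature may end up different.

```r
# Hypothetical helper, not TTR's implementation: fast %K over the last n bars,
# with NaN/Inf results replaced by 0.5 and an optional warning.
fast_k_flat_safe <- function(high, low, close, n = 14, warning = TRUE) {
  len <- length(close)
  out <- rep(NA_real_, len)
  for (i in seq_len(len)) {
    if (i < n) next                        # leading NAs until a full window exists
    win <- (i - n + 1):i
    lo <- min(low[win])
    hi <- max(high[win])
    out[i] <- (close[i] - lo) / (hi - lo)
  }
  flat <- is.nan(out) | is.infinite(out)   # 0/0 or x/0 from a zero-range window
  if (any(flat)) {
    if (warning) {
      # base:: prefix only for readability, since the argument is also named 'warning'
      base::warning(sum(flat), " window(s) had zero range; fastK set to 0.5 there")
    }
    out[flat] <- 0.5
  }
  out
}

# Toy series whose last three bars are identical (made-up values).
high  <- c(3, 4, 5, 5, 5, 5)
low   <- c(1, 2, 3, 5, 5, 5)
close <- c(2, 3, 4, 5, 5, 5)

quiet <- fast_k_flat_safe(high, low, close, n = 3, warning = FALSE)
quiet   # NA NA 0.75 1.00 1.00 0.50
```

Calling it with the default warning = TRUE returns the same values and additionally warns that one window had zero range.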

GitHubGeniusOverlord commented 1 year ago

Hello, I think the same error also appears with the EMV() function, so everything said above may apply to that function too. Let me add one observation: the equality of high, low, close, and close_adj appears on days when the asset simply was not traded at all.
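One way to spot such untraded bars before calling an indicator is to flag rows where the price fields coincide. This is a minimal base R sketch with made-up values; whether to drop, keep, or fill those rows remains a judgment call for the analysis at hand.

```r
# Illustrative bars (made-up values); rows 3 and 4 are flat, as on a day
# with no trades where the previous price is carried into every field.
bars <- data.frame(
  High  = c(101.0, 102.5, 101.8, 101.8, 103.0),
  Low   = c( 99.5, 100.9, 101.8, 101.8, 101.2),
  Close = c(100.7, 101.8, 101.8, 101.8, 102.4)
)

flat <- bars$High == bars$Low & bars$Low == bars$Close
which(flat)              # 3 4

traded <- bars[!flat, ]  # one option: drop flat bars before computing indicators
nrow(traded)             # 3
```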