joshuaulrich / quantmod

Quantitative Financial Modelling Framework
http://www.quantmod.com/
GNU General Public License v3.0

getSymbols date limit problem #354

Closed espher1987 closed 1 year ago

espher1987 commented 2 years ago

Description

I'm trying to replicate an example from Ang, C. S. (2015), Analyzing Financial Data and Implementing Financial Models Using R, Springer ("Retrieving Yahoo Finance data directly using getSymbols", page 11), where I need to use the from = and to = arguments, with from = "2010-12-31" and to = "2013-12-31".

Expected behavior

I expect data starting on "2010-12-31" and ending on "2013-12-31", using the same code as Ang (2015).

Minimal, reproducible example

library(quantmod)
#> Loading required package: xts
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
#> Loading required package: TTR
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo

df <- getSymbols(Symbols = "AMZN",
                 from = "2010-12-31",
                 to = "2013-12-31",
                 auto.assign = FALSE)
#> 'getSymbols' currently uses auto.assign=TRUE by default, but will
#> use auto.assign=FALSE in 0.5-0. You will still be able to use
#> 'loadSymbols' to automatically load data. getOption("getSymbols.env")
#> and getOption("getSymbols.auto.assign") will still be checked for
#> alternate defaults.
#> 
#> This message is shown once per session and may be disabled by setting 
#> options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
summary(df)
#>      Index              AMZN.Open       AMZN.High        AMZN.Low    
#>  Min.   :2010-12-31   Min.   :161.2   Min.   :163.5   Min.   :160.6  
#>  1st Qu.:2011-09-29   1st Qu.:192.8   1st Qu.:195.3   1st Qu.:190.2  
#>  Median :2012-06-28   Median :226.5   Median :230.6   Median :224.6  
#>  Mean   :2012-06-30   Mean   :238.0   Mean   :240.8   Mean   :235.0  
#>  3rd Qu.:2013-04-02   3rd Qu.:266.6   3rd Qu.:269.3   3rd Qu.:263.7  
#>  Max.   :2013-12-30   Max.   :404.6   Max.   :405.6   Max.   :399.2  
#>    AMZN.Close     AMZN.Volume       AMZN.Adjusted  
#>  Min.   :161.0   Min.   :  984400   Min.   :161.0  
#>  1st Qu.:193.3   1st Qu.: 2662775   1st Qu.:193.3  
#>  Median :227.2   Median : 3707050   Median :227.2  
#>  Mean   :238.1   Mean   : 4322605   Mean   :238.1  
#>  3rd Qu.:266.4   3rd Qu.: 5162025   3rd Qu.:266.4  
#>  Max.   :404.4   Max.   :24134200   Max.   :404.4

Created on 2022-01-29 by the reprex package (v1.0.0)

Session Info

sessionInfo()
#> R version 4.0.4 (2021-02-15)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 11 (bullseye)
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#> 
#> locale:
#>  [1] LC_CTYPE=es_NI.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=es_NI.UTF-8        LC_COLLATE=es_NI.UTF-8    
#>  [5] LC_MONETARY=es_NI.UTF-8    LC_MESSAGES=es_NI.UTF-8   
#>  [7] LC_PAPER=es_NI.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=es_NI.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] quantmod_0.4.18 TTR_0.24.3      xts_0.12.1      zoo_1.8-9      
#> 
#> loaded via a namespace (and not attached):
#>  [1] lattice_0.20-41   digest_0.6.27     assertthat_0.2.1  grid_4.0.4       
#>  [5] magrittr_2.0.1    reprex_1.0.0      evaluate_0.14     highr_0.8        
#>  [9] stringi_1.5.3     rlang_0.4.10      cli_2.3.0         curl_4.3.2       
#> [13] rstudioapi_0.13   fs_1.5.0          rmarkdown_2.6     tools_4.0.4      
#> [17] stringr_1.4.0     glue_1.4.2        xfun_0.20         yaml_2.2.1       
#> [21] compiler_4.0.4    htmltools_0.5.1.1 knitr_1.31


pverspeelt commented 2 years ago

Related to #258.

@espher1987, adding 1 day to your "to" date will give you the data you want:

amzn <- getSymbols(Symbols = "AMZN",
                   from = "2010-12-31",
                   to = "2014-01-01",
                   auto.assign = FALSE)

billelev commented 1 year ago

I have noticed a similar issue. I typically do not specify a "to" date, assuming the most recent close date will be returned. This does not seem to be the case.

e.g.

> p <- getSymbols(Symbols = "^GSPC", auto.assign = FALSE)
> tail(p)
           GSPC.Open GSPC.High GSPC.Low GSPC.Close GSPC.Volume GSPC.Adjusted
2022-10-20   3689.05   3736.00  3656.44    3665.78  4496620000       3665.78
2022-10-21   3657.10   3757.89  3647.42    3752.75  5078020000       3752.75
2022-10-24   3762.01   3810.74  3741.65    3797.34  4747930000       3797.34
2022-10-25   3799.44   3862.85  3799.44    3859.11  4843120000       3859.11
2022-10-26   3825.97   3886.15  3824.07    3830.60  4817310000       3830.60
2022-10-27   3834.69   3859.95  3803.79    3807.30  4687320000       3807.30
>
> p <- getSymbols(Symbols = "^GSPC", auto.assign = FALSE, to = Sys.Date())
> tail(p)
           GSPC.Open GSPC.High GSPC.Low GSPC.Close GSPC.Volume GSPC.Adjusted
2022-10-20   3689.05   3736.00  3656.44    3665.78  4496620000       3665.78
2022-10-21   3657.10   3757.89  3647.42    3752.75  5078020000       3752.75
2022-10-24   3762.01   3810.74  3741.65    3797.34  4747930000       3797.34
2022-10-25   3799.44   3862.85  3799.44    3859.11  4843120000       3859.11
2022-10-26   3825.97   3886.15  3824.07    3830.60  4817310000       3830.60
2022-10-27   3834.69   3859.95  3803.79    3807.30  4687320000       3807.30
>
> p <- getSymbols(Symbols = "^GSPC", auto.assign = FALSE, to = Sys.Date() + 1)
> tail(p)
           GSPC.Open GSPC.High GSPC.Low GSPC.Close GSPC.Volume GSPC.Adjusted
2022-10-21   3657.10   3757.89  3647.42    3752.75  5078020000       3752.75
2022-10-24   3762.01   3810.74  3741.65    3797.34  4747930000       3797.34
2022-10-25   3799.44   3862.85  3799.44    3859.11  4843120000       3859.11
2022-10-26   3825.97   3886.15  3824.07    3830.60  4817310000       3830.60
2022-10-27   3834.69   3859.95  3803.79    3807.30  4687320000       3807.30
2022-10-28   3808.26   3905.42  3808.26    3901.06  4459410000       3901.06
>
> Sys.Date()
[1] "2022-10-28"

I see a few related open issues. Is there a fix, beyond always adding to = Sys.Date() + 1 for safety?
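
One way to avoid repeating the workaround by hand is to wrap it in a small helper. The sketch below assumes, based only on the output shown in this thread, that the "to" date is excluded from the result. Both exclusive_end() and get_symbols_inclusive() are hypothetical names, not part of quantmod. The helper requests one extra day and then trims the result back to the requested window, so it should still return the right rows if the end date is ever treated as inclusive.

```r
# Hypothetical helpers, not part of quantmod.

# Shift an end date forward by one day, so that a request which excludes
# its end date still covers the day the caller asked for.
exclusive_end <- function(to) as.Date(to) + 1

# Fetch data with `to` treated as inclusive (assumes the quantmod package
# is installed; it is only needed when this function is called).
get_symbols_inclusive <- function(symbol, from, to = Sys.Date(), ...) {
  from <- as.Date(from)
  to <- as.Date(to)
  x <- quantmod::getSymbols(Symbols = symbol, from = from,
                            to = exclusive_end(to), auto.assign = FALSE, ...)
  # Trim to the requested window with xts "from/to" range subsetting,
  # which drops the extra day if it was returned.
  x[paste(from, to, sep = "/")]
}
```

For the original example this would be called as get_symbols_inclusive("AMZN", "2010-12-31", "2013-12-31"). The trimming step is a design choice: without it, the helper would return one day too many whenever the end date is actually included.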

joshuaulrich commented 1 year ago

What's your sessionInfo()?