joshuaulrich / xts

Extensible time series class that provides uniform handling of many R time series classes by extending zoo.
http://joshuaulrich.github.io/xts/
GNU General Public License v2.0

as.vector performance #406

Closed ethanbsmith closed 9 months ago

ethanbsmith commented 9 months ago

Description

I noticed some odd timings when calling as.vector() on an xts object.

Expected behavior

Same performance as as.vector() on a matrix.

Base R will dispatch to a method like this if xts defines it:

as.vector.xts <- function(x, mode = "any") {
  return(as.vector(coredata(x), mode = mode))
}

Minimal, reproducible example

library(quantmod)  # provides getSymbols(); also attaches xts and zoo
z <- getSymbols("SPY", auto.assign = FALSE)
microbenchmark::microbenchmark(as.vector(z), as.vector(coredata(z)), as.numeric(z), as.numeric(coredata(z)))
#Unit: microseconds
#                    expr   min    lq    mean median     uq   max neval
#            as.vector(z) 353.8 359.9 377.669 363.95 373.35 645.4   100
#  as.vector(coredata(z))   9.9  11.1  12.300  11.70  13.55  21.0   100
#           as.numeric(z)   7.9   8.7  10.574   9.15  11.65  77.1   100
# as.numeric(coredata(z))   7.1   7.7   8.670   8.00   9.10  15.3   100
identical(as.vector(z[,4]), as.vector(coredata(z[,4])))
#[1] TRUE

Session Info

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

time zone: America/Denver
tzcode source: internal

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] kableExtra_1.3.4  jsonlite_1.8.7    readxl_1.4.3      xml2_1.3.5        curl_5.0.2        Rcpp_1.0.11       matrixStats_1.0.0 data.table_1.14.8 doFuture_1.0.0    future_1.33.0     doParallel_1.0.17 iterators_1.0.14 
[13] foreach_1.5.2     quantmod_0.4.25   TTR_0.24.3.1      xts_0.13.1.2      zoo_1.8-12        plotrix_3.8-2

joshuaulrich commented 9 months ago

This is due to the as.matrix() call in as.vector.zoo():

as.vector.zoo <- function(x, mode = "any") {
    as.vector(as.matrix(x), mode = mode)
}

This function hasn't been altered since it was added in r18 "first version that passes R CMD check" on 2004-10-08. This was before coredata() was introduced.

@zeileis what do you think about changing the as.matrix() call to coredata() in the above function? Then it would be:

as.vector.zoo <- function(x, mode = "any") {
    as.vector(coredata(x), mode = mode)
}

After looking at as.matrix.zoo(), I don't think this would be a breaking change. The first line of that function is y <- as.matrix(coredata(x), ...). The rest of as.matrix.zoo() handles setting dims, colnames, and rownames on the result. But we're just going to call as.vector() on the result, which will drop those attributes anyway. Thoughts?
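
The attribute-dropping argument above can be checked with a small base R sketch. It deliberately avoids zoo: `core` is a hypothetical stand-in for what coredata(x) returns, and `decorated` is a hypothetical stand-in for the as.matrix.zoo() result with row names added. It only illustrates that as.vector() discards dims and dimnames. It is not a test of zoo itself.

```r
# Hypothetical stand-in for coredata(x): a plain matrix with column names.
core <- matrix(c(1.5, 2.5, 3.5, 4.5), nrow = 2,
               dimnames = list(NULL, c("Open", "Close")))

# Hypothetical stand-in for the as.matrix.zoo() result: same values,
# plus row names of the kind that would be derived from the index.
decorated <- core
rownames(decorated) <- c("2024-01-02", "2024-01-03")

v_core <- as.vector(core)
v_decorated <- as.vector(decorated)

# as.vector() strips dim and dimnames, so no attributes remain
# and the two results cannot be told apart.
print(attributes(v_core))
print(identical(v_core, v_decorated))
```

If the two vectors are identical here, the extra work of setting dims, colnames, and rownames before calling as.vector() has no effect on the result, which is the basis for treating the change as non-breaking.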

zeileis commented 9 months ago

Good idea! Thanks for pointing this out, Ethan @ethanbsmith, and for doing the additional digging, Josh @joshuaulrich!

I agree and have just committed the fix to the R-Forge version of zoo, revision 1192.

joshuaulrich commented 9 months ago

Fixed upstream in zoo.