Open ckatsulis opened 5 years ago
period.apply()
is essentially a convenience wrapper around sapply()
, so it's not particularly performant. There's a faster unexported C-based version that needs some testing before I'm comfortable exporting it and/or replacing period.apply()
with it (see https://github.com/joshuaulrich/xts/commit/7c29359e8f6a2e86af73bdec7c52bc02b6127323).
n <- 1e6L
ep <- c(0L, seq(1L, n, 3L))
x <- .xts(runif(n, 1, n), 1:n)
system.time(test1 <- xts:::period_apply(x, ep, sum))
# user system elapsed
# 2.104 0.044 2.149
system.time(test2 <- period.apply(coredata(x), ep, sum))
# user system elapsed
# 0.705 0.019 0.725
identical(coredata(test1), test2)
# [1] TRUE
I may be able to use some ideas from simplify2array()
to make it even faster. But testing... it needs much testing.
Cool. Thx Josh!
Chris Sent from my iPhone
On Nov 3, 2018, at 07:12, Joshua Ulrich notifications@github.com wrote:
period.apply() is essentially a convenience wrapper around sapply(), so it's not particularly performant. There's a faster unexported C-based version that needs some testing before I'm comfortable exporting it and/or replacing period.apply() with it (see 7c29359).
n <- 1e6L ep <- c(0L, seq(1L, n, 3L)) x <- .xts(runif(n, 1, n), 1:n) system.time(test1 <- xts:::period_apply(x, ep, sum))
user system elapsed
2.104 0.044 2.149
system.time(test2 <- period.apply(coredata(x), ep, sum))
user system elapsed
0.705 0.019 0.725
identical(coredata(test1), test2)
[1] TRUE
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.
I've been able to go faster using data.table and converting the results back to an xts:
library(xts)
n <- 1e6L
ep <- c(0L, seq(1L, n, 3L))
x <- .xts(runif(n, 1, n), 1:n)
system.time(test1 <- xts:::period_apply(x, ep, sum))
user system elapsed
1.25 0.00 1.25
system.time(test2 <- period.apply(coredata(x), ep, sum))
user system elapsed
0.47 0.00 0.46
identical(coredata(test1), test2)
[1] TRUE
period.sumDT <- function(x, INDEX)
{
idx <- findInterval(index(x), index(x)[INDEX], left.open = T)
dt <- data.table::data.table(idx=idx, data=coredata(x)[,1])
dt <- as.matrix(dt[, sum(data, na.rm=T), by=idx])
idx <- dt[,1] < length(INDEX)
setNames( xts(dt[idx,2], index(x)[INDEX][dt[idx,1]+1]), colnames(x))
}
system.time(test3 <- period.sumDT(x, ep ))
user system elapsed
0.07 0.00 0.08
all.equal(test1, test3)
[1] TRUE
> period_apply <- xts:::period_apply # not exported
> set.seed(21)
> x <- .xts(rnorm(1e7), 1:1e7)
> e <- endpoints(x, "seconds", 5)
> system.time(y <- period.apply(x, e, sum)) # current version
user system elapsed
49.17 0.03 49.20
> system.time(z <- period_apply(x, e, sum)) # new C version
user system elapsed
8.14 0.09 8.33
> all.equal(y, z)
[1] TRUE
period.sumDT <- function(x, INDEX)
{
idx <- findInterval(index(x), index(x)[INDEX], left.open = T)
dt <- data.table::data.table(idx=idx, data=coredata(x)[,1])
dt <- as.matrix(dt[, sum(data, na.rm=T), by=idx])
idx <- dt[,1] < length(INDEX)
setNames( xts(dt[idx,2], index(x)[INDEX][dt[idx,1]+1]), colnames(x))
}
> system.time(test3 <- period.sumDT(x, e ))
user system elapsed
0.31 0.11 0.42
> all.equal(y, test3)
[1] TRUE
I appreciate your comment and desire to make xts even faster. However, your code is over-optimized to this specific example.
Here are the issues that are immediately clear to me:
period.apply()
needs to accept any function, not only sum()
I'm also very unlikely to change the existing period.apply()
function until we have adequate unit test coverage.
I also just checked my suspicion that data.table optimizes the call to sum()
by avoiding any R evaluations and instead doing all the summing at the C level. So the performance improvement your example shows would not generalize to most other functions.
I see show improvement for other functions as well. It seems to be faster in any instance where FUN operates on an argument x that is the coredata of each interval. It could be useful as an alternative implementation of period.sum, period.min/max. I also included an example comparing to period.sum, which does not allow x to have more than one column.
This example would not work if the user needed the x argument to FUN to be an xts object, with an index attribute. It is not documented in period.apply if FUN can assume it gets an xts object, or even that x is an xts. You have an example using letters where x is not an xts object .
> library(xts)
> n <- 1e6L
> x <- .xts(runif(n, 1, n), 1:n)
> e <- endpoints(x, "seconds", 5)
> # not exported
> period_apply <- xts:::period_apply
> # period.apply using datatable
> period.apply.DT <- function(x, INDEX, FUN, ...)
+ {
+ dt <- data.table::as.data.table(x)
+ dt$period <- as.numeric(dt$index[INDEX][findInterval(dt$index, dt$index[INDEX], left.open = T)+1])
+ dt <- as.matrix(dt[, lapply(.SD, FUN, ...), by='period', .SDcols = !'index'])
+ setNames(xts(dt[,-1], structure(dt[,1], class = c("POSIXct", "POSIXt"), tz='')), colnames(x))
+ }
> # period.sum using datatable
> period.sum.DT <- function(x, INDEX, na.rm=F)
+ {
+ dt <- data.table::as.data.table(x)
+ dt$period <- as.numeric(dt$index[INDEX][findInterval(dt$index, dt$index[INDEX], left.open = T)+1])
+ dt <- as.matrix(dt[, lapply(.SD, sum, na.rm=na.rm), by='period', .SDcols = !'index'])
+ setNames(xts(dt[,-1], structure(dt[,1], class = c("POSIXct", "POSIXt"), tz='')), colnames(x))
+ }
> #
> my_check <- function(values) { all(sapply(values[-1], function(x) all.equal(values[[1]], x))) }
> # period.apply with FUN=sum
> microbenchmark::microbenchmark(period.apply(x, e, sum),
+ period_apply(x, e, sum),
+ period.apply.DT(x, e, sum),
+ times=3, check=my_check, unit = 's')
Unit: seconds
expr min lq mean median uq max neval
period.apply(x, e, sum) 4.2666698 4.2768328 4.2851441 4.2869959 4.2943812 4.3017665 3
period_apply(x, e, sum) 0.7217397 0.7342956 0.7661842 0.7468515 0.7884065 0.8299614 3
period.apply.DT(x, e, sum) 0.1849304 0.1883792 0.1928879 0.1918279 0.1968667 0.2019055 3
> # period.sum
> microbenchmark::microbenchmark(period.sum(x, e),
+ period.sum.DT(x, e),
+ times=3, check=my_check, unit = 's')
Unit: seconds
expr min lq mean median uq max neval
period.sum(x, e) 4.2866287 4.4034498 4.4782497 4.520271 4.5740602 4.6278495 3
period.sum.DT(x, e) 0.1248247 0.1333578 0.1644498 0.141891 0.1842624 0.2266339 3
> # period.apply with FUN=min
> microbenchmark::microbenchmark(period.apply(x, e, min),
+ period_apply(x, e, min),
+ period.apply.DT(x, e, min),
+ times=3, check=my_check, unit = 's')
Unit: seconds
expr min lq mean median uq max neval
period.apply(x, e, min) 4.2764956 4.3273872 4.3512772 4.3782788 4.3886679 4.3990571 3
period_apply(x, e, min) 0.5536805 0.5877314 0.6010733 0.6217822 0.6247697 0.6277571 3
period.apply.DT(x, e, min) 0.1632318 0.1833229 0.1949875 0.2034139 0.2108654 0.2183168 3
> # period.apply with arbitrary function
> microbenchmark::microbenchmark(period.apply(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+ period_apply(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+ period.apply.DT(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+ times=3, check=my_check, unit = 's')
Unit: seconds
expr min lq mean median uq max neval
period.apply(x, e, function(x) { sqrt(prod(x)) + min(x) }) 4.5865064 4.6597227 4.7162952 4.7329391 4.7811896 4.8294402 3
period_apply(x, e, function(x) { sqrt(prod(x)) + min(x) }) 1.1638887 1.1998162 1.2164511 1.2357438 1.2427324 1.2497210 3
period.apply.DT(x, e, function(x) { sqrt(prod(x)) + min(x) }) 0.2706527 0.2719362 0.2981492 0.2732197 0.3118975 0.3505753 3
Thanks for the notes about period.sum()
and others. That slowdown was due to an unnecessary as.matrix()
call. The timings are more in xts' favor once those are removed:
> microbenchmark::microbenchmark(period.sum(x, e),
+ period.sum.DT(x, e),
+ times=3, check=my_check)
Unit: milliseconds
expr min lq mean median uq max neval
period.sum(x, e) 11.41976 11.47986 17.24279 11.53996 20.15431 28.76866 3
period.sum.DT(x, e) 59.08953 61.27357 64.23368 63.45761 66.80575 70.15389 3
This example would not work if the user needed the x argument to FUN to be an xts object, with an index attribute. It is not documented in period.apply if FUN can assume it gets an xts object, or even that x is an xts. You have an example using letters where x is not an xts object.
You're correct that it isn't documented that FUN
may expect x
to be an xts object. But period.apply()
calls try.xts()
which attempts to convert x
to xts, if possible. Practically, it is very likely that FUN
expects an object similar to the object passed to period.apply()
. Unfortunately, in your faster function that is only the case when the object passed to period.apply()
is a data.table.
I'm very surprised at the data.table performance for the arbitrary function. It is faster than simply calling the function 200,000 times (the length of endpoints). I don't feel good about that.
d <- x[1:4]
f <- function(x) { sqrt(prod(x)) + min(x) }
system.time(for(i in seq_len(length(e))) f(d))
# user system elapsed
# 0.576 0.000 0.577
system.time(period.apply.DT(x, e, f))
# user system elapsed
# 0.34 0.00 0.34
Will the new version have that as.matrix call removed?
On Wed, Dec 19, 2018 at 7:22 AM Joshua Ulrich notifications@github.com wrote:
Thanks for the notes about period.sum() and others. That slowdown was due to an unnecessary as.matrix() call. The timings are more in xts' favor once those are removed:
microbenchmark::microbenchmark(period.sum(x, e),+ period.sum.DT(x, e),+ times=3, check=my_check)Unit: milliseconds expr min lq mean median uq max neval period.sum(x, e) 11.41976 11.47986 17.24279 11.53996 20.15431 28.76866 3 period.sum.DT(x, e) 59.08953 61.27357 64.23368 63.45761 66.80575 70.15389 3
This example would not work if the user needed the x argument to FUN to be an xts object, with an index attribute. It is not documented in period.apply if FUN can assume it gets an xts object, or even that x is an xts. You have an example using letters where x is not an xts object.
You're correct that it isn't documented that FUN may expect x to be an xts object. But period.apply() calls try.xts() which attempts to convert x to xts, if possible. Practically, it is very likely that FUN expects an object similar to the object passed to period.apply(). Unfortunately, in your faster function that is only the case when the object passed to period.apply() is a data.table.
I'm very surprised at the data.table performance for the arbitrary function. It is faster than simply calling the function 200,000 times (the length of endpoints). I don't feel good about that.
f <- function(x) { sqrt(prod(x)) + min(x) }R> system.time(for(i in seq_len(length(e))) f(d)) user system elapsed 0.576 0.000 0.577 R> system.time(period.apply.DT(x, e, f)) user system elapsed 0.34 0.00 0.34
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/joshuaulrich/xts/issues/278#issuecomment-448594975, or mute the thread https://github.com/notifications/unsubscribe-auth/AHKwqvDDTcdtL9s7Gw6sfAT_6mn8ZMYFks5u6j2PgaJpZM4YMQ5d .
I can not repeat this test. What is 'd?'
system.time(for(i in seq_len(length(e))) f(d))
@AndreMikulec d
is the first 4 rows of x
. Sorry for the confusion.
d <- x[1:4]
system.time(for(i in seq_len(length(e))) f(d))
# user system elapsed
# 0.543 0.000 0.544
system.time(period.apply.DT(x, e, f))
# user system elapsed
# 0.31 0.00 0.31
I do not know.I am wild guessing. Possibly multithreading is occuring.
From https://github.com/Rdatatable/data.table/blob/d3ccd8139d3d592201e0f085c5f37a4fc5a426e3/NEWS.0.md
Contents
Following gsum and gmean, now gmin and gmax from GForce are also implemented.
Closes part of #523. Benchmarks are also provided.
R DT[, list(sum(x), min(y), max(z), .N), by=...] # runs by default using GForce
From https://github.com/Rdatatable/data.table/blob/master/src/gsumm.c
Contents
. . .
SEXP gforce(SEXP env, SEXP jsub, SEXP o, SEXP f, SEXP l, SEXP irowsArg)
. . .
#pragma omp parallel for num_threads(getDTthreads())
. . .
SEXP ans = PROTECT( eval(jsub, env) );
Benchmarks here are interesting. From https://github.com/Rdatatable/data.table/issues/523
Description
on large xts objects +1MM lines, period.apply is much faster using core data and recasting
Expected behavior
about equal runtimes with and without a numeric index
Minimal, reproducible example
Session Info