JuliaStats / TimeSeries.jl

Time series toolkit for Julia

moving is really, really slow #56

Closed milktrader closed 10 years ago

milktrader commented 10 years ago
julia> using TimeSeries, MarketData

julia> @time moving(Cl, mean, 10)[1]
elapsed time: 5.308236902 seconds (2078988368 bytes allocated)
1x1 TimeArray{Float64,2} 1950-01-16 to 1950-01-16

             Close
1950-01-16 | 16.88

Here is the moving algorithm for quick reference:

function moving{T,N}(ta::TimeArray{T,N}, f::Function, window::Int)
    tstamps = ta.timestamp[window:end]
#    nanarray = fill(NaN, window-1) # design choice here
    vals = T[]
    for i=1:length(ta)-(window-1)
      push!(vals, f([ta.values[t] for t in 1:length(ta)][i:i+(window-1)]))
    end
    TimeArray(tstamps, vals, ta.colnames)
end
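(Editor's note: this thread predates Julia 1.0, so `function moving{T,N}(...)` is 0.3-era syntax. For readers on current Julia, here is a minimal runnable sketch of the same naive rolling apply over a plain vector, with the costly comprehension flagged; `moving_naive` is an illustrative name, not part of TimeSeries.jl.)

```julia
# Hypothetical port of the naive rolling apply to current Julia syntax;
# it works on a plain Vector rather than a TimeArray.
function moving_naive(xs::Vector{T}, f::Function, window::Int) where {T}
    vals = T[]
    for i in 1:length(xs) - (window - 1)
        # The comprehension rebuilds the whole input array on EVERY
        # iteration, and the range index then copies it again.
        push!(vals, f([xs[t] for t in 1:length(xs)][i:i+(window-1)]))
    end
    vals
end
```

For example, `moving_naive([1.0, 2.0, 3.0, 4.0], sum, 2)` returns `[3.0, 5.0, 7.0]`.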
milktrader commented 10 years ago

Here are some interesting results using the Iterators package and the @carljv algorithm:

julia> using TimeSeries, MarketData, Iterators

julia> function fastmoving{T}(ta::TimeArray{T,1}, f::Function, window::Int)
         tstamps = timestamp(ta)[window:length(ta)]
         vals     = float(collect(imap(f, partition(values(ta), window, 1))))
         TimeArray(tstamps, vals, ["fastma"])
       end
fastmoving (generic function with 1 method)

julia> @time fastmoving(Cl, mean, 10)[1]
elapsed time: 0.068698868 seconds (20356984 bytes allocated)
1x1 TimeArray{Float64,2} 1950-01-16 to 1950-01-16

            fastma
1950-01-16 | 16.88
milktrader commented 10 years ago

There is unfortunately a dearth of methods that work with this approach, since partition hands f each window as a tuple rather than an array.

julia> @time moving(Cl, var, 10)[16093]
elapsed time: 4.670996873 seconds (2078988400 bytes allocated)
1x1 TimeArray{Float64,2} 2013-12-30 to 2013-12-30

             Close
2013-12-30 | 495.61

julia> @time fastmoving(Cl, var, 10)[16093]
ERROR: no method var((Float64,Float64,Float64,Float64,Float64,Float64,Float64,Float64,Float64,Float64))
 in next at /Users/Administrator/.julia/v0.3/Iterators/src/Iterators.jl:447
 in collect at array.jl:263
 in collect at array.jl:270
 in fastmoving at none:3
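(Editor's note: the `no method var(...)` error above is the tuple problem: `partition` yields each window as a tuple, and `var` at the time only accepted arrays. A hedged workaround, sketched in current Julia with a hypothetical `tuplewindows` helper standing in for `Iterators.partition(xs, n, 1)`, is to `collect` each tuple before applying `f`.)

```julia
using Statistics  # mean and var live here in current Julia

# Hypothetical stand-in for Iterators.partition(xs, n, 1): each window
# comes out as a tuple, just as the package produced.
tuplewindows(xs, n) = (ntuple(j -> xs[i + j - 1], n) for i in 1:length(xs) - n + 1)

# Collecting each tuple into an array lets array-only methods such as
# var work again, at the cost of one extra copy per window.
rolling(f, xs, n) = [f(collect(w)) for w in tuplewindows(xs, n)]
```

For example, `rolling(var, [1.0, 2.0, 3.0, 4.0], 2)` returns `[0.5, 0.5, 0.5]`, where the plain tuple-fed version failed.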
quinnj commented 10 years ago

How about the following?

dt1 = date(1980,1,1)
dt2 = date(2015,1,1)
d = [dt1:dt2];
t = TimeArray(d,rand(length(d)),["test"])
@time moving(t, mean, 10)[1]

function moving1{T,N}(ta::TimeArray{T,N}, f::Function, window::Int)
    tstamps = ta.timestamp[window:end]
#    nanarray = fill(NaN, window-1) # design choice here
    vals = zeros(length(ta)-(window-1))
    for i=1:length(vals)
      vals[i] = f(ta.values[i:i+(window-1)])
    end
    TimeArray(tstamps, vals, ta.colnames)
end
@time moving1(t, mean, 10)[1]

Results:

In  [13]: @time moving(t, mean, 10)[1]
elapsed time: 1.565445724 seconds (1311521696 bytes allocated)
In  [17]: @time moving1(t, mean, 10)[1]
elapsed time: 0.004992856 seconds (4112408 bytes allocated)
milktrader commented 10 years ago

Whoa, that's awesome. I've been showing the second and later runs with @time to account for the initial compile, so I'm comparing moving to your In [17] result, which is impressive and approaches Pandas speed.

And this was all done by initializing an array of zeros versus push!-ing into an empty array?

milktrader commented 10 years ago

@karbarcca that's really awesome, thanks! Nearly a 400x performance boost and no need to import another package. I'm not sure why push! is so much more expensive than filling a pre-allocated array of zeros.

milktrader commented 10 years ago

I'll just take this code and replace the existing moving with it, saving you some time if you don't mind.

carljv commented 10 years ago

There's a nice explanation of growing objects vs. allocating them in the second ring of The R Inferno (http://www.burns-stat.com/pages/Tutor/R_inferno.pdf). Depending on the structure, growth may be cheaper because extra space is allocated to account for it (so it's expensive occasionally, but on average cheap). But pre-allocating should always be preferred when feasible.
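(Editor's note: Carl's point can be sketched in current Julia syntax. Growing a Vector with `push!` is amortized O(1) because Julia over-allocates geometrically, but pre-allocating skips the occasional reallocate-and-copy entirely. The function names below are illustrative only.)

```julia
# Growing: push! is cheap on average, but still pays for occasional
# reallocation plus copying when capacity runs out.
function build_grow(n::Int)
    v = Float64[]
    for i in 1:n
        push!(v, i)
    end
    v
end

# Pre-allocating: one allocation up front, then plain stores.
function build_prealloc(n::Int)
    v = zeros(n)
    for i in 1:n
        v[i] = i
    end
    v
end
```

Both build the same vector; timing each with @time shows the pre-allocated version doing a single allocation.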

milktrader commented 10 years ago

I remember reading that tome when Patrick recommended it to me. Thanks for the reference; I guess it didn't sink in, since I'm still a bit surprised by the performance boost.

Commit a171f0adade8e753337b2b2ec07ea7b10f87aa61 ("added karbarcca algorithm for moving for 400x improvement") has passing tests and closes the issue. Thanks for the input!

milktrader commented 10 years ago

For posterity: this result cannot be duplicated in R (AFAIK) and probably at least matches Pandas.

julia> using StatsBase

julia> @time moving(Cl, kurtosis, 10)[16090]
elapsed time: 0.016054877 seconds (4908760 bytes allocated)
1x1 TimeArray{Float64,2} 2013-12-24 to 2013-12-24

             Close
2013-12-24 | -1.53

At this rate, nobody living on the moon will be using R.

quinnj commented 10 years ago

The real reason for the slowness in this case (though pre-allocation is always better than push!) was this line:

push!(vals, f([ta.values[t] for t in 1:length(ta)][i:i+(window-1)]))

And it's not the push!; the culprit is [ta.values[t] for t in 1:length(ta)]. I'm not sure how that snuck into the code, but on every loop iteration you're manually copying the entire values array, then doing another copy with the indexing code [i:i+(window-1)] (since indexing currently returns a copy). This turns your O(N·window) algorithm into O(N²·window).
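(Editor's note: the "indexing returns a copy" behavior Jacob mentions still holds for range indexing in current Julia, where `view` is the non-copying alternative. A tiny demonstration:)

```julia
xs = collect(1.0:5.0)

s = xs[2:4]          # range indexing copies into a new array
s[1] = 99.0
# xs is untouched: the slice was independent of the parent
@assert xs[2] == 2.0

v = view(xs, 2:4)    # a view aliases the parent array instead
v[1] = 99.0
@assert xs[2] == 99.0
```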

quinnj commented 10 years ago

The other big factor here is the copying on indexing, which I mentioned. While changing this in Base has been all but decided, it's taking a while for things to actually get moving. In the meantime, @lindahua has the excellent ArrayViews package that could help us out here. For example, check out this new implementation of upto using array views instead of Base indexing:

In  [45]: using ArrayViews

function upto1{T,N}(ta::TimeArray{T,N}, f::Function)
    len = length(ta)
    vals = zeros(len)
    for i=1:len
        vals[i] = f(view(ta.values, 1:i))
    end
    TimeArray(ta.timestamp, vals, ta.colnames)
end

Out [45]: upto1 (generic function with 1 method)

In  [46]: @time upto(t, mean)[1]
elapsed time: 0.770675349 seconds (656945376 bytes allocated)
1x1 TimeArray{Float64,2} 1980-01-01 to 1980-01-01

Out [46]:
             test
1980-01-01 | 0.75

In  [47]: @time upto(t, mean)[1]
elapsed time: 0.717450052 seconds (656945376 bytes allocated)
1x1 TimeArray{Float64,2} 1980-01-01 to 1980-01-01

Out [47]:
             test
1980-01-01 | 0.75

In  [48]: @time upto1(t, mean)[1]
elapsed time: 0.096905759 seconds (3787048 bytes allocated)
1x1 TimeArray{Float64,2} 1980-01-01 to 1980-01-01

Out [48]:
             test
1980-01-01 | 0.75

In  [49]: @time upto1(t, mean)[1]
elapsed time: 0.061944166 seconds (3697296 bytes allocated)
1x1 TimeArray{Float64,2} 1980-01-01 to 1980-01-01

Out [49]:
             test
1980-01-01 | 0.75
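(Editor's note: `view` and the `@views` macro later landed in Base Julia, so the ArrayViews dependency is no longer needed today. A hedged sketch of the same upto-style cumulative apply over a plain vector, using the Base `view`; `upto_views` is an illustrative name.)

```julia
using Statistics  # for mean in the usage example

# Cumulative apply: f over xs[1:i] for each i, with no per-step copy.
function upto_views(xs::Vector{Float64}, f::Function)
    vals = zeros(length(xs))
    for i in eachindex(xs)
        vals[i] = f(view(xs, 1:i))   # SubArray over xs[1:i], no copying
    end
    vals
end
```

For example, `upto_views([2.0, 4.0, 6.0], mean)` returns `[2.0, 3.0, 4.0]`.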
milktrader commented 10 years ago

Very nice.

We could (1) create an arrayviews branch that adds ArrayViews and implements the view method while waiting for Base to implement the changes, or (2) just make master do this now.

I'm open to either option. If we go with making the changes to master, we could always remove the ArrayViews dependency once Base catches up.

quinnj commented 10 years ago

I would vote for making ArrayViews a dependency in master. It may or may not take a while for Base to catch up, but it's pretty easy to take out once it does. In the meantime, I'm sure @lindahua would appreciate some use of the package out in the wild.

milktrader commented 10 years ago

Okay, sounds like we go with option 2 then. I'll hack around in a branch and then PR it for merge since you mentioned you're busy lately.