Open ValentinKaisermayer opened 2 years ago
Right now one can do: ts = ts.coredata |> Impute.locf() |> TS
to carry forward last observations for missing data. Ideally, it would be good to have something like: impute(ts, :colname, LOCF).
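For readers unfamiliar with LOCF, here is a minimal sketch of what "last observation carried forward" does to a single column. It is written in plain Julia; `locf_sketch` is a hypothetical helper for illustration and is not the Impute.jl or TSFrames implementation.

```julia
# Hypothetical helper, for illustration only (not Impute.locf).
function locf_sketch(values::AbstractVector)
    out = collect(values)            # work on a copy; the input stays untouched
    for i in 2:length(out)
        if ismissing(out[i])
            out[i] = out[i - 1]      # stays missing if the series starts with a gap
        end
    end
    return out
end

x = [1.0, missing, missing, 4.0, missing]
filled = locf_sketch(x)              # 1.0, 1.0, 1.0, 4.0, 4.0
```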
Nice package!
However, that is not really the functionality I was referring to. I meant changing the time base of the data. Impute.jl seems to only calculate missing values.
A use case would be measurement data that is not regularly sampled. For many methods, e.g. time series forecasting via AR, ARIMA, ..., the data needs to be sampled at regular intervals.
If you want I'll make a PR with such a method.
retime(ts, timestamps; upsample=:previous, downsample=:mean)
Notice that this will, in most cases, change the length of the object and hence cannot be done in place.
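To make the proposal concrete, here is a rough sketch of the behaviour such a method could have, on plain vectors rather than a TS object. Everything in it is an assumption: `retime_sketch` is a hypothetical name, the two strategies (mean for downsampling, previous value for upsampling) are hard-coded instead of being keyword arguments, and each target timestamp is taken to cover the samples after the previous target up to and including itself.

```julia
using Dates, Statistics

# Hypothetical sketch of the proposed retime(); not an existing TSFrames function.
# `times` and `targets` are both assumed to be sorted.
function retime_sketch(times::AbstractVector{DateTime}, values::AbstractVector{<:Real},
                       targets::AbstractVector{DateTime})
    out = Vector{Union{Missing, Float64}}(missing, length(targets))
    for (k, t) in enumerate(targets)
        # samples after the previous target, up to and including this one
        inbucket = findall(s -> (k == 1 || targets[k - 1] < s) && s <= t, times)
        if !isempty(inbucket)
            out[k] = mean(values[inbucket])            # downsample: mean
        else
            prev = findlast(s -> s <= t, times)        # upsample: previous observation
            prev === nothing || (out[k] = values[prev])
        end
    end
    return out                                         # new vector, new length
end

t0 = DateTime(2024, 1, 1)
times = t0 .+ Minute.([0, 5, 10, 40])
values = [1.0, 2.0, 3.0, 10.0]
targets = t0 .+ Minute.([15, 30, 45])
resampled = retime_sketch(times, values, targets)      # 2.0, 3.0, 10.0
```

The result has one entry per target timestamp, which is why the operation returns a new object instead of mutating its input.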
So, the apply() method is built for doing frequency conversions as well. We earlier had a specific function for frequency conversions, similar to to.freq() from zoo/xts in the R world, but then we realised we don't need it: the way Julia works makes it possible to do this with apply() itself. See: apply(ts, Dates.Minute(15), x -> mean(skipmissing(x)), last)
I do think it could be a good option to provide a frequency conversion method just for end-user convenience.
Though, apply() doesn't currently provide a way to upsample; only downsampling works. For upsampling, do you think using functionality from Impute.jl and integrating it with apply() would solve your use case?
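As a rough illustration of what the apply() call quoted above computes, the following plain-Julia sketch groups samples into 15-minute buckets, averages each bucket while skipping missing values, and labels the bucket with its last timestamp. `downsample_mean` is a hypothetical helper and does not call TSFrames; how apply() is implemented internally is not shown here.

```julia
using Dates, Statistics

# Hypothetical helper illustrating period-based downsampling.
# `times` is assumed to be sorted; buckets containing only missing values are not handled.
function downsample_mean(times::AbstractVector{DateTime}, values::AbstractVector,
                         period::Period)
    buckets = [floor(t, period) for t in times]    # start of the period each sample falls in
    out_times = DateTime[]
    out_values = Float64[]
    for b in unique(buckets)
        idx = findall(==(b), buckets)
        push!(out_times, times[last(idx)])         # label the bucket with its last timestamp
        push!(out_values, mean(skipmissing(values[idx])))
    end
    return out_times, out_values
end

t0 = DateTime(2024, 1, 1)
times = t0 .+ Minute.([1, 6, 14, 16, 29])
values = [1.0, missing, 3.0, 4.0, 6.0]
out_times, out_values = downsample_mean(times, values, Minute(15))
# out_values is 2.0, 5.0; out_times is 00:14 and 00:29
```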
In fact, I think a good implementation of frequency conversion would be to have a function to compute endpoints, see: https://rdrr.io/cran/xts/man/endpoints.html. The function outputs a vector which can then be used in a frequency conversion function as well as in apply, as you mention in #43.
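A minimal sketch of the endpoints idea, assuming sorted timestamps: return the index of the last observation in each period. `period_endpoints` is a hypothetical helper for illustration; it is not the xts or TSFrames implementation and may differ from both in its conventions.

```julia
using Dates

# Hypothetical helper modelled on the idea of endpoints(); `times` must be sorted.
function period_endpoints(times::AbstractVector{DateTime}, period::Period)
    ends = Int[]
    for i in eachindex(times)
        # i is an endpoint if it is the last sample,
        # or if the next sample falls into a later period
        if i == lastindex(times) || floor(times[i], period) != floor(times[i + 1], period)
            push!(ends, i)
        end
    end
    return ends
end

t0 = DateTime(2024, 1, 1)
times = t0 .+ Minute.([1, 6, 14, 16, 29])
ends = period_endpoints(times, Minute(15))   # indices 3 and 5
```

A frequency conversion or an apply-style aggregation can then slice the data between consecutive endpoints.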
I would suggest two options:
retime(ts, timestamps,...), which takes the target timestamps directly.
retime(ts, Dates.Minute(15),...), which takes a Dates.Period that can be used to generate the endpoints for the first method over the time range of the data.
As a general note: apply is so general that it can essentially be used to compute any function over the intervals of the data. However, I would propose to make an extra interface for changing the time base. Tasks that occur often (see #48) should be a single method call, with few options.
A method for upsample() was pushed as part of #38. Currently, it only supports adding missing for the in-between timestamps that have no data. The code is here: src/upsample.jl.
You may be interested by this https://github.com/femtotrader/TimeSeriesResampler.jl
You may be interested by this https://github.com/femtotrader/TimeSeriesResampler.jl
It seems to be unmaintained, as is the TimeSeries.jl package.
yep it's just to give an API idea
If you want I'll make a PR with such a method.
@ValentinKaisermayer I had missed this sentence earlier. Please do submit a PR if you can! :)
retime(ts, timestamps; upsample=:previous, downsample=:mean)
retime(ts, Dates.Minute(15),...)
I do like both these methods, though I prefer only having the second one (Dates.Period as the second argument). Do we think users would want to supply their own timestamp (:Index) values to do the sampling? I would assume most users would just tell the period they are looking to resample the object to without caring how the package computes the timestamps.
Also, I would prefer the name resample() over retime(), only because more people might end up googling for "how to resample timeseries in Julia".
I would like to have both, so that the user has full control over whether they get a regular or irregular TS back.
I think you're looking for MessyTimeSeries.jl.
I wonder if offsets such as MonthEnd, YearEnd, BusinessMonthBegin... are implemented for resampling timeseries.
The resample example looks odd to me.
The resampling example should be done in 2 steps:
Close price -> weekly resample with OHLC (i.e. taking first, max, min, last) -> Open, High, Low, Close
Volume -> weekly resample with sum as aggregate function -> Volume
I wonder if offsets such as MonthEnd, YearEnd, BusinessMonthBegin... are implemented for resampling timeseries.
You can do this by providing a function to endpoints():
julia> endpoints(ts, i -> lastdayofmonth.(i), 1)
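To show what such a mapping function contributes, here is a small plain-Julia sketch using only the Dates standard library. It maps each date to its month end and then picks the last row of each month. How endpoints() consumes the function internally is not shown; the grouping step below is an assumption for illustration.

```julia
using Dates

dates = [Date(2024, 1, 30), Date(2024, 1, 31), Date(2024, 2, 1), Date(2024, 2, 28)]

# Every date in the same month maps to the same key: that month's last day.
month_keys = lastdayofmonth.(dates)      # 2024-01-31, 2024-01-31, 2024-02-29, 2024-02-29

# Index of the last row for each distinct key (dates assumed sorted).
month_ends = [findlast(==(k), month_keys) for k in unique(month_keys)]   # 2 and 4
```

Other offsets could be sketched the same way by swapping in a different Dates function, e.g. lastdayofyear for a YearEnd-style grouping.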
The resample example looks odd to me.
The resampling example should be done in 2 steps:
Close price -> weekly resample with OHLC (i.e. taking first, max, min, last) -> Open, High, Low, Close
Volume -> weekly resample with sum as aggregate function -> Volume
Where is this example you are referring to?
The resample example looks odd to me.
The resampling example should be done in 2 steps:
Close price -> weekly resample with OHLC (i.e. taking first, max, min, last) -> Open, High, Low, Close
Volume -> weekly resample with sum as aggregate function -> Volume
I think this is what xts to.period() also does. The OHLC parameter is set to TRUE by default. Yes, this is something missing from the to_period et al. functions in TSFrames and should be incorporated. PRs are welcome. :)
Till then, apply() allows one to provide a function to aggregate values over a period.
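As a sketch of the two-step weekly resample described above (first/max/min/last for the close price, sum for the volume), here is a plain-Julia version that does not depend on TSFrames. `weekly_ohlcv` is a hypothetical helper, and weeks are assumed to start on Monday, which is what Dates.firstdayofweek uses.

```julia
using Dates

# Hypothetical helper; `dates` is assumed to be sorted.
function weekly_ohlcv(dates::AbstractVector{Date}, close::AbstractVector{<:Real},
                      volume::AbstractVector{<:Real})
    weeks = [firstdayofweek(d) for d in dates]       # Monday of each date's week
    rows = NamedTuple[]
    for w in unique(weeks)
        idx = findall(==(w), weeks)
        c = close[idx]
        push!(rows, (week = w,
                     open = first(c), high = maximum(c),
                     low = minimum(c), close = last(c),
                     volume = sum(volume[idx])))     # volume is summed, not OHLC'd
    end
    return rows
end

dates = [Date(2024, 1, 1), Date(2024, 1, 3), Date(2024, 1, 5), Date(2024, 1, 8), Date(2024, 1, 9)]
close = [10.0, 12.0, 11.0, 13.0, 9.0]
volume = [100, 200, 300, 400, 500]
rows = weekly_ohlcv(dates, close, volume)
# week of 2024-01-01: open 10, high 12, low 10, close 11, volume 600
# week of 2024-01-08: open 13, high 13, low 9,  close 9,  volume 900
```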
I think this is one of the most important methods for time series data: being able to interpolate and aggregate.
I like the interface of Grafana, i.e. being able to specify not only the interpolation or aggregation method but both at the same time.
This is useful if you have measurement data at e.g. about 5min intervals but with some holes in it and want to get a clean vector with an equidistant sample time of 15min. Where there is good data it has to be aggregated, and where there are holes it has to be interpolated.
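The use case above could be sketched roughly as follows, in plain Julia. The choices here are assumptions for illustration: `regularize` is a hypothetical name, aggregation is a mean, interpolation is linear between the nearest buckets that have data, and each output value is labelled with the start of its bucket.

```julia
using Dates, Statistics

# Hypothetical helper: mean where a bucket has data, linear interpolation where it has none.
# `times` is assumed to be sorted and non-empty.
function regularize(times::AbstractVector{DateTime}, values::AbstractVector{<:Real},
                    period::Period)
    grid = collect(floor(first(times), period):period:floor(last(times), period))
    agg = Vector{Union{Missing, Float64}}(missing, length(grid))
    for (k, g) in enumerate(grid)
        idx = findall(t -> g <= t < g + period, times)
        isempty(idx) || (agg[k] = mean(values[idx]))    # aggregate where data exists
    end
    # The first and last buckets always contain data, so every hole is interior.
    filled = findall(!ismissing, agg)
    for k in eachindex(agg)
        ismissing(agg[k]) || continue
        lo = filled[findlast(<(k), filled)]
        hi = filled[findfirst(>(k), filled)]
        w = (k - lo) / (hi - lo)
        agg[k] = (1 - w) * agg[lo] + w * agg[hi]        # interpolate across the hole
    end
    return grid, Float64.(agg)
end

t0 = DateTime(2024, 1, 1)
times = t0 .+ Minute.([0, 5, 10, 50, 55])               # hole between 00:10 and 00:50
values = [1.0, 2.0, 3.0, 8.0, 10.0]
grid, clean = regularize(times, values, Minute(15))
```

With this input the 00:00 and 00:45 buckets are aggregated to 2.0 and 9.0, and the two empty buckets in between are interpolated to roughly 4.33 and 6.67.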
For interpolation, common methods would be
And for aggregation