Closed MikaelUmaN closed 4 years ago
Thanks a lot for the contribution. I noticed that my `lambda` (or `alpha`) setting was not consistent with pandas, and I've fixed it. My `ewmMean` also didn't handle missing values well. I've incorporated ideas and test cases from your code. Check the latest change here:
https://github.com/fslaborg/Deedle/pull/475/commits/6821c203a6525ee0dc223b20d3b413bb35cfc770
I also need to fix `ewmStdDev` and `ewmVariance`, as they are a special-case approximation for financial return series that assumes the mean of returns is zero. My implementation is similar to the mftsr package in R, but these are not generalized functions for other series that have a nonzero mean.
https://github.com/rforge/mftsr/blob/master/pkg/R/ewmaVol.R
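For readers following along, here is a minimal Python sketch of the zero-mean approximation described above, using the standard RiskMetrics-style recursion on squared returns. The function name `ewm_vol_zero_mean`, the default `lam=0.94`, and seeding with the first squared return are assumptions made for illustration; this is not the Deedle or mftsr code.

```python
import math

def ewm_vol_zero_mean(returns, lam=0.94):
    """Exponentially weighted volatility assuming the mean return is zero.

    Hypothetical helper for illustration only.
    Recursion: var_t = lam * var_{t-1} + (1 - lam) * r_t**2
    """
    var = None
    out = []
    for r in returns:
        if var is None:
            # Assumption: seed the recursion with the first squared return.
            var = r * r
        else:
            var = lam * var + (1.0 - lam) * r * r
        out.append(math.sqrt(var))
    return out

# A constant series has zero true variance, but because the mean is never
# subtracted the estimator reports the level of the series instead.
constant = ewm_vol_zero_mean([0.02, 0.02, 0.02])
```

This is why the approximation is reasonable for return series whose mean is close to zero but misleading for series with a clearly nonzero mean.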
BTW, I've renamed `ewmStdDev` to `ewmVol` and moved the other RiskMetrics-type functions to a separate `finance` module. I think a separate namespace with related functions benefits users in the finance domain.
https://github.com/fslaborg/Deedle/pull/475/commits/a2abc29673a947c0defbf555e8b7108991cd26eb
I will try to wrap up matrix dot operations and release a new version soon.
Nice, thanks for incorporating it :).
Yes, dealing with NaNs is probably good. Often you want to look at time series of related markets that may not have the same opening hours (or, if looking at daily prices, some markets may be closed due to a holiday etc.), so inevitably there will be NaNs.
In pandas there seem to be two options for how missing values affect the weighting of past values. The simplest one is ignore_na=True (my code). If it is False, which is also the default setting, they seem to do (1-alpha)**n where n is the number of missing periods. But I couldn't quite get it to work in reproducing their numbers.
Should have to do with this code:
    if is_observation or (not ignore_na):
        old_wt *= old_wt_factor
        if is_observation:
            # avoid numerical errors on constant series
            if weighted_avg != cur:
                weighted_avg = ((old_wt * weighted_avg) +
                                (new_wt * cur)) / (old_wt + new_wt)
            if adjust:
                old_wt += new_wt
            else:
                old_wt = 1.
Have you looked at this setting? I think my preferred default would be for ignore_na to be True, but they have reasoned differently starting from v0.15.
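To make the two behaviours easier to compare, here is a self-contained pure-Python sketch that wraps the excerpt above in a full loop. The function name `ewm_mean_sketch` and the surrounding scaffolding (initialisation, carrying the last average forward over NaNs) are assumptions based on my reading of the excerpt, not the pandas API itself.

```python
import math

def ewm_mean_sketch(values, alpha, adjust=False, ignore_na=False):
    """Hypothetical re-implementation of the weighting logic quoted above."""
    old_wt_factor = 1.0 - alpha
    new_wt = 1.0 if adjust else alpha
    weighted_avg = math.nan
    old_wt = 1.0
    out = []
    for cur in values:
        is_observation = not math.isnan(cur)
        if not math.isnan(weighted_avg):
            if is_observation or (not ignore_na):
                # With ignore_na=False the old weight also decays on NaN rows.
                old_wt *= old_wt_factor
                if is_observation:
                    # avoid numerical errors on constant series
                    if weighted_avg != cur:
                        weighted_avg = ((old_wt * weighted_avg) +
                                        (new_wt * cur)) / (old_wt + new_wt)
                    if adjust:
                        old_wt += new_wt
                    else:
                        old_wt = 1.0
        elif is_observation:
            # First valid observation seeds the average.
            weighted_avg = cur
        out.append(weighted_avg)
    return out

nan = math.nan
skipping = ewm_mean_sketch([1.0, nan, 3.0], alpha=0.5, ignore_na=True)
decaying = ewm_mean_sketch([1.0, nan, 3.0], alpha=0.5, ignore_na=False)
```

With `ignore_na=True` the gap is skipped, so the last value is a plain 50/50 blend of 1.0 and 3.0. With `ignore_na=False` the weight on 1.0 has decayed to (1-alpha)**2 by the time 3.0 arrives and the result is renormalised by the sum of the weights, which may be the step that makes the numbers hard to reproduce by hand.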
On another note, I'd be interested in adding some time series functionality such as ARMA-GARCH modeling.
Do you think that would fit, and if so, where? It would need to rely on probability distributions and some optimization for setting parameters, so it would probably depend on e.g. Math.Net. It's hard to draw the line between contributing there first and just keeping it here... Because it's again related to financial applications, my thought would be to keep the probability distributions and optimization methods in Math.Net and the time series analysis in Deedle?
A GARCH model would be great. I've published Deedle.Math today. There is a `Finance` namespace reserved for cases like GARCH.
In principle, you are right about separating distributions/optimization from the actual application. But I wouldn't object to building a prototype here first if it takes longer to get pull requests merged into the Math.Net repo. Feel free to submit pull requests.
I'll take a closer look at how `pandas` handles its missing values when `ignore_na` is false and add an optional parameter in `ewmMean` too.
Hi.
I've seen the idea to add a separate package for Deedle.Math: https://github.com/fslaborg/Deedle/pull/475
This includes support for exponential smoothing which is nice.
I would very much like an expanding version of exponential smoothing, much like the one that exists in pandas. This is particularly helpful in financial applications when you want to use all history up to the given point in time.
My commit below just follows: // pandas v0.24.2: series.ewm(alpha=0.97, adjust=False, ignore_na=True).mean()
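As a note on why this counts as "expanding": with `adjust=False` and no missing values, the recursion can be unrolled so that every past observation contributes with a geometrically decaying weight. The symbols $x_t$ (input series) and $y_t$ (smoothed series) are my own notation, not taken from the commit.

```latex
\begin{aligned}
y_0 &= x_0, \\
y_t &= (1-\alpha)\, y_{t-1} + \alpha\, x_t \\
    &= (1-\alpha)^{t}\, x_0 + \alpha \sum_{k=0}^{t-1} (1-\alpha)^{k}\, x_{t-k}, \qquad t \ge 1 .
\end{aligned}
```

So no window length is involved; the whole history back to $x_0$ enters each $y_t$.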