Closed TwoToFourYears closed 5 years ago
Are you sure?
In [1]: import pandas as pd
...:
...: xx = pd.Series([0,1,2,3,4,5,6,7,8,9])
...: yy = xx.ewm(span=2, adjust=False).mean()
...: zz = xx.ewm(span=2, adjust=True).mean()
...:
In [2]: yy
Out[2]:
0 0.000000
1 0.666667
2 1.555556
3 2.518519
4 3.506173
5 4.502058
6 5.500686
7 6.500229
8 7.500076
9 8.500025
dtype: float64
In [3]: zz
Out[3]:
0 0.000000
1 0.750000
2 1.615385
3 2.550000
4 3.520661
5 4.508242
6 5.503202
7 6.501220
8 7.500457
9 8.500169
dtype: float64
You are correct that they give different values. Sorry about not being as careful as I should have been.
However, the values for adjust=True are still incorrect. The returned values should be a weighted average using the weights specified on the doc page:
(1-alpha)**(n-1), (1-alpha)**(n-2), ..., 1-alpha, 1.
So for span=2, alpha=2/3, n=2, and the weights are 1/3, 1. The returned values are given by:
zz[i] = (xx[i-1] / 3 + xx[i]) / (4/3)
or
zz[i] = (xx[i-1] + 3*xx[i]) / 4
Or for this example:
[ nan, 3, 7, 11, 15, 19, 23, 27, 31, 35 ] / 4
Or is the intent different than this?
Can you work through the n=2 example again?
The numerator should be 1*2 + 1/3 *1 = 7/3
The denominator should be 1 + 1/3 + 1/9 = 13/9
and that =1.615...
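The arithmetic above can be checked with exact fractions. This is only a minimal sketch of the hand calculation for index 2, assuming the weights run from 1 on the newest point back to (1-alpha)**2 on the first point; it does not call pandas, and the variable names are purely illustrative.

```python
from fractions import Fraction

alpha = Fraction(2, 3)   # span=2 gives alpha = 2 / (span + 1)
decay = 1 - alpha        # 1/3
x = [0, 1, 2]            # first three points of xx

# Weights run from the newest point (weight 1) back to the first point.
weights = [decay ** k for k in range(len(x))]   # 1, 1/3, 1/9
numerator = sum(w * v for w, v in zip(weights, reversed(x)))
denominator = sum(weights)

print(numerator, denominator, float(numerator / denominator))
```

The 1/9 weight multiplies the first point, which is 0, so it drops out of the numerator but still counts in the denominator.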
Excuse me, but shouldn't the denominator be the sum of the weights used in the numerator: 1 + 1/3 = 4/3 and not 1 + 1/3 + 1/9?
More generally, do I have the right idea about the "adjust" parameter? Using Wikipedia as a common reference, we have two possibilities: an expanding window versus a fixed width window. Using xx for the input series, and zz for the output:
Isn't adjust=False the expanding window:
zz[i] = alpha * xx[i] + (1 - alpha) * zz[i-1]
Or a window to the beginning of the series:
zz[i] = (xx[i] + (1-alpha) * xx[i-1] + ... + (1-alpha)**i * xx[0]) / (sum of weights)
While adjust=True is a fixed width window with n terms:
zz[i] = (xx[i] + (1-alpha) * xx[i-1] + ... + (1-alpha)**(n-1) * xx[i - (n-1)]) / (sum of weights)
Final note: I left out that my original hand calculation effectively used min_periods=n.
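For comparison, here is a rough pure-Python sketch of the two expanding forms written out above: the recursion, and the weighted sum back to the start of the series divided by the sum of the weights. The helper names ewm_recursive and ewm_normalized are made up for illustration and are not pandas functions. Judging by the outputs posted earlier in the thread, the first appears to reproduce yy (adjust=False) and the second zz (adjust=True).

```python
def ewm_recursive(values, alpha):
    """y[0] = x[0]; y[i] = (1 - alpha) * y[i-1] + alpha * x[i]."""
    out = []
    for i, v in enumerate(values):
        out.append(v if i == 0 else (1 - alpha) * out[-1] + alpha * v)
    return out


def ewm_normalized(values, alpha):
    """Average over all points so far with weights (1 - alpha)**k, divided by their sum."""
    out = []
    for i in range(len(values)):
        weights = [(1 - alpha) ** k for k in range(i + 1)]
        num = sum(w * values[i - k] for k, w in enumerate(weights))
        out.append(num / sum(weights))
    return out


xx = list(range(10))
alpha = 2 / (2 + 1)  # span=2
yy = ewm_recursive(xx, alpha)
zz = ewm_normalized(xx, alpha)
for i, (a, b) in enumerate(zip(yy, zz)):
    print(i, round(a, 6), round(b, 6))
```

Both forms look back to the first point of the series; neither truncates the window at n terms.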
Excuse me, but shouldn't the denominator be the sum of the weights used in the numerator: 1 + 1/3 = 4/3 and not 1 + 1/3 + 1/9?
The numerator has a 1/9 weight on 0, the first point
Could you rephrase the rest of the question - what's the result you're expecting vs the result you're seeing?
The doc page says:
When adjust is True (default), weighted averages are calculated using weights (1-alpha)**(n-1), (1-alpha)**(n-2), ..., 1-alpha, 1.
When adjust is False, weighted averages are calculated recursively as: weighted_average[0] = arg[0]; weighted_average[i] = (1-alpha)*weighted_average[i-1] + alpha*arg[i].
So does this state that adjust=True uses a fixed width window of span=2 (for the example) and not an expanding window that extends to the beginning?
Or if you go here and scroll down a bit you get:
One must specify precisely one of span, center of mass, half-life and alpha to the EW functions:
- Span corresponds to what is commonly called an “N-day EW moving average”.
- Center of mass has a more physical interpretation and can be thought of in terms of span: c = (s - 1) / 2.
- Half-life is the period of time for the exponential weight to reduce to one half.
- Alpha specifies the smoothing factor directly.
Again stating, or at least suggesting, that using span as an argument (at least with adjust=True) returns an exponentially weighted average on a fixed width window.
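As far as I understand the pandas documentation, each of these parameters is just a different way of choosing the smoothing factor alpha, not a window length. The sketch below writes out the conversions as I recall them from the docs (alpha = 2/(span+1), alpha = 1/(1+com), alpha = 1 - exp(-ln(2)/halflife)); the function names are hypothetical helpers, not part of pandas.

```python
import math


def alpha_from_span(span):
    # span >= 1
    return 2.0 / (span + 1.0)


def alpha_from_com(com):
    # center of mass >= 0
    return 1.0 / (1.0 + com)


def alpha_from_halflife(halflife):
    # halflife > 0: the weight falls to one half after `halflife` steps
    return 1.0 - math.exp(-math.log(2.0) / halflife)


span = 2
com = (span - 1) / 2  # c = (s - 1) / 2
print(alpha_from_span(span), alpha_from_com(com), alpha_from_halflife(1))
```

With span=2 both the span and the center-of-mass routes give alpha=2/3, the value used throughout this thread.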
Can you make an example with the result you're expecting vs the result you're seeing?
Sorry for the delay.
Let's back up. The question is what the method is supposed to do. Reading the documentation, the method (at least in part) implements the recursive exponentially weighted average:
y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
x[t] is the input, y[t] the output. This method, with some variations, results in an expanding window.
An alternative is to use a rolling (or moving) fixed length window. For a window width of 2, we get alpha=2/3 and weights of 1/3 and 1. The rolling window would give:
y[0] = (1 * x[0]) / 1
y[1] = (1 * x[1] + 1/3 * x[0]) / (1 + 1/3)
y[2] = (1 * x[2] + 1/3 * x[1]) / (1 + 1/3)
y[3] = (1 * x[3] + 1/3 * x[2]) / (1 + 1/3)
Here the documentation says:
When adjust is False, weighted averages are calculated recursively as:
weighted_average[0] = arg[0]; weighted_average[i] = (1-alpha)*weighted_average[i-1] + alpha*arg[i].
which is the recursive/expanding window method.
However, right above this quote, the same doc says:
When adjust is True (default), weighted averages are calculated using weights (1-alpha)**(n-1), (1-alpha)**(n-2), ..., 1-alpha, 1.
Further, going here and paging down a couple of times, you get the following explanation of span, half-life, and other terms:
Span corresponds to what is commonly called an “N-day EW moving average”.
This at least suggests that calling the function as aSeries.ewm(span=2, adjust=True).mean() will result in a rolling window as above.
So for x = [ 0, 1, 2, 3, ... ] we get:
y[0] = 1 * 0 / 1
y[1] = (1 * 1 + 1/3 * 0) / (1 + 1/3)
y[2] = (1 * 2 + 1/3 * 1) / (1 + 1/3)
y[3] = (1 * 3 + 1/3 * 2) / (1 + 1/3)
which is not what is produced.
So either the method is not making a rolling (or moving) window calculation, or the documentation is problematic. Rereading the document several times, I tend to think the rolling window calculation is not intended, but rather the documentation is overly fuzzy.
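To make the rolling-window reading concrete, here is a small sketch of the fixed-width calculation described above, using only the last `width` points with weights 1 and 1/3. The function fixed_window_average is a hypothetical helper written for this comparison; it is not something pandas provides under that name.

```python
def fixed_window_average(values, alpha, width):
    """Weighted average over at most `width` trailing points, weights (1 - alpha)**k."""
    out = []
    for i in range(len(values)):
        n_terms = min(width, i + 1)  # fewer terms are available at the start
        weights = [(1 - alpha) ** k for k in range(n_terms)]
        num = sum(w * values[i - k] for k, w in enumerate(weights))
        out.append(num / sum(weights))
    return out


x = list(range(10))
rolling = fixed_window_average(x, alpha=2 / 3, width=2)
print([round(v, 4) for v in rolling])
```

For this input the fixed window gives 1.75 at index 2, whereas the ewm output posted earlier in the thread is 1.615385, so the two calculations agree only on the first two points.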
In the interests of expediency, I'm going to jump in at the first point I have a question.
The rolling window would give:
y[0] = (1 * x[0]) / 1
y[1] = (1 * x[1] + 1/3 * x[0]) / (1 + 1/3)
When you say rolling, do you mean ewm? If so, should that be 1 * y[0] rather than 1 * x[1]? If not, how do you derive the formula? There's no alpha in a rolling calc, only a window?
rolling is the method rolling or the moving window of mwa. There are various documents that describe the calculations.
Closing as this appears to be mostly a usage question. If there are specifics in the documentation that could be improved, a new issue can be opened to address that.
Code Sample, a copy-pastable example if possible
import pandas as pd
xx = pd.Series([0,1,2,3,4,5,6,7,8,9])
yy = xx.ewm(span=2, adjust=False).mean()
zz = xx.ewm(span=2, adjust=True).mean()
Problem description
The ewm returns the same values for adjust=True as for adjust=False. For span=2, alpha=2/3, and the weights should be [1/3, 1], or, equivalently [1, 3].
Expected Output
The result should be [ nan, 3, 7, 11, 15, 19, 23, 27, 31, 35 ] / 4
Output of 0.20.3 and 0.21.1 is
[ 0, 0.75, 1.615385, 2.550000, 3.520661, 4.508242, 5.503202, 6.501220, 7.500457, 8.500169 ] which is the same as adjust=False