luke14free / pm-prophet

GAM timeseries modeling with auto-changepoint detection. Inspired by Facebook Prophet and implemented in PyMC3
327 stars 42 forks source link

fourier_series coefficients #29

Open BjorHart opened 4 years ago

BjorHart commented 4 years ago

Hi! I have been working with the source code to pm-prophet for a while, and I am wondering about how the fourier-series and the accompanying coefficients a1,b1,a2,b2,....,an,bn in the standard fourier series are sampled. As far as I can tell, the _fourierseries() function returns both sine and cosine series with length of input data t, but self.data["f%s_%s"%(seasonality,orderidx)] only uses the sine series. As an example, a fourier_series of order 4 should have 8 self.priors["seasonality"] -RVs but has only 4.

How is this fourier series modelled? Thank you in advance!

luke14free commented 4 years ago

This seems like a real bug, well spotted. Investigating as it might be a major improver of the quality of the seasonality fitter

BjorHart commented 4 years ago

thank you for answering, did you take the time to look at the source code ? Do you agree that it is a bug?

luke14free commented 4 years ago

Yes, I just verified it :) working on a fix

luke14free commented 4 years ago
Screenshot 2020-07-31 at 16 27 45

very happy that this also solves an issue with multiplicative seasonality not behaving as expected, adding a test for this :)

BjorHart commented 4 years ago

how was the previous result in comparison?

luke14free commented 4 years ago

So the issue should be resolved on master (I didn't deploy the pakcage to pypy yet though). I am running this code for testing this:

    z = np.sin(np.linspace(0, 200, 200)) * np.linspace(0, 200, 200)
    df = pd.DataFrame()
    df["ds"] = pd.date_range(start="2018-01-01", periods=200)
    df["y"] = z
    m = PMProphet(df, auto_changepoints=False, growth=True, intercept=False, name="model")
    with m.model:
        m.priors['growth'] = pm.Constant('growth_model', 1)
    m.add_seasonality(seasonality=3.14 * 2, fourier_order=3, mode=Seasonality.MULTIPLICATIVE)
    m.fit()
    m.predict(60, alpha=0.2, include_history=True, plot=True)
    m.plot_components(intercept=False)

you can probably try to see what happens in previous versions (maybe the sine series was already sufficient :)

BjorHart commented 4 years ago

Hi, I am still not sure the fourier series are modelled correctly in the updated master. In line 709 in the master, the fourier order is set to max(order) + 1, witch is setting the fourier order to 8, resulting in self.data having 16 entries of sine/cos. I believe the result is actully correct as the enumeration of self.seasonality in line 455 ignores the last 8 entries of self.data

luke14free commented 4 years ago

Hi @BjorHart, that's not correct actually. What 709 does is to set the seasonality, say in case you do 8 it will be:

f_{period}_{order}_{sin/cos} starting from 0 to 7 (that's why the +1) with sin/cos. This is given by line 193 (double for loop both on the period and on the function). This extends the self.seasonality to have all the seasonality duplicated with sin/cos. So the enumeration at line 455 doesn't ignore the remaining 8 because they are all added to the same array.

You can inspect model.seasonality and also model.data to check what I said :) or simply use a debugger.

BjorHart commented 4 years ago

Thank you for getting back to me so quickly! ah yes, I am sorry, I have a local version that I fixed in a different way, but i now see what i missed in your solution. thank you!

BjorHart commented 4 years ago

Hi again! I was trying to run the multiplicative example that you provided. I keep getting divergencies during the sampling process. How may this affect our results?

luke14free commented 4 years ago

Is this with the code on master?

BjorHart commented 4 years ago

yes thats correct!