facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License
18.53k stars 4.54k forks

Use stable sort when sorting values in forecaster #2568

Closed (natl closed 6 months ago)

natl commented 7 months ago

When Prophet prepares a dataframe for fitting or predicting, it runs a sort on the ds column.

This PR changes the algorithm used for that sort from the pandas default (quicksort, which is not guaranteed to be stable) to mergesort, which is stable.

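This matters when regressors are present and the dataframe being fitted is not unique by date: rows that share a ds value are ties for the sort, and only a stable sort is guaranteed to keep them in their original relative order.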
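To illustrate what stability means here, the following is a minimal pandas-only sketch. The frame and its values are made up for illustration, and Prophet is not involved. It sorts a frame with duplicate ds values using kind="mergesort" and records which original rows end up where.

```python
import pandas as pd

# Hypothetical frame: dates deliberately out of order, two rows per date.
df = pd.DataFrame({
    "ds": pd.to_datetime(["2020-01-02", "2020-01-01", "2020-01-02", "2020-01-01"]),
    "x": [0, 0, 1, 1],
})

# mergesort is stable: rows with an equal ds keep their input order.
stable = df.sort_values("ds", kind="mergesort")

# Index labels show where each sorted row came from in the original frame.
stable_index = stable.index.tolist()
stable_x = stable["x"].tolist()
```

Within each date the rows appear in the order they had in the input, which is the property the PR relies on. With the default kind, pandas makes no such promise for tied rows.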

Users may do something like the following:

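import pandas as pd
from prophet import Prophet

# Example df (toy data; a real history would be much longer).
# There are two rows per date, distinguished only by the regressor x.
df = pd.DataFrame({
    'ds': pd.to_datetime(['2020-01-01', '2020-01-01', '2020-01-02', '2020-01-02']),
    'x': [0, 1, 0, 1],
    'y': [3.2, 4.1, 3.7, 4.3],
})

m = Prophet()
m.add_regressor('x')
m.fit(df)

predict_df = m.predict(df)
# Rows are matched by index here, so this relies on predict_df
# keeping the same row order as df.
df['prediction'] = predict_df['yhat']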

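Internally, m.predict(df) calls df.sort_values('ds'). Because the pandas default sort is not guaranteed to be stable, rows that share a ds value can come back in a different relative order, even if df was already sorted by ds. When that happens, the x values, and therefore the yhat values, no longer line up with the rows of df. A stable sort addresses this.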

This partially addresses #2322 if you construct the input correctly.
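One possible reading of "construct the input correctly" is sketched below. This is an assumption, not something the PR spells out: stable-sort the input by ds yourself and reset its index before calling Prophet, so that the order going in is the order a stable internal sort would return. The helper name prepare_for_prophet and the data are hypothetical.

```python
import pandas as pd

def prepare_for_prophet(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: stable-sort by ds and reset to a 0..n-1 index."""
    return df.sort_values("ds", kind="mergesort").reset_index(drop=True)

# Made-up input with dates out of order and two rows per date.
raw = pd.DataFrame({
    "ds": pd.to_datetime(["2020-01-02", "2020-01-01", "2020-01-02", "2020-01-01"]),
    "x": [1, 0, 0, 1],
    "y": [4.3, 3.2, 3.7, 4.1],
})
prepared = prepare_for_prophet(raw)

# A second stable sort on ds leaves an already-sorted frame unchanged,
# so every row keeps the position it had going in.
resorted = prepared.sort_values("ds", kind="mergesort").reset_index(drop=True)
unchanged = resorted.equals(prepared)
```

With a frame prepared this way, and assuming Prophet's internal sort is stable as this PR proposes, an index-aligned assignment such as df['prediction'] = predict_df['yhat'] should match each prediction to the row that produced it.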