Strategies for positive predictions

bletham commented 4 years ago

There has been a lot of discussion on strategies for forcing predictions to be positive, usually in the context of forecasting count data that naturally must be positive. I will illustrate some approaches with the following example dataset:

# Make a test problem
import numpy as np
import pandas as pd
from fbprophet import Prophet

ds = pd.date_range('2020-01-01', '2020-06-01')
t = np.arange(153)
seasonality = 0.5 * np.cos(t * 2 * np.pi / 7)  # weekly seasonality
trend = (-t + 30 * t * np.exp(-t / 40)) / 20
y = np.round(np.clip(trend * (1 + seasonality) * (1 + 0.2 * np.random.randn(153)), a_min=0, a_max=None))
df = pd.DataFrame({'ds': ds, 'y': y})

Approach 1: Clip predictions of a regular model The idea here is just to fit a usual model, and then clamp negative predictions up to 0. Note that we'll use multiplicative seasonality here - we'd probably always want to use multiplicative seasonality in settings with positive predictions.

# Fit a usual prophet model, and clip
m = Prophet(seasonality_mode='multiplicative').fit(df)
future = m.make_future_dataframe(90)
fcst = m.predict(future)
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
    fcst[col] = fcst[col].clip(lower=0.0)
fig = m.plot(fcst)

prophet1

This approach guarantees positive predictions, but the uncertainty estimates are pretty unsatisfactory: the forecast has 0 uncertainty in the future, even though we've seen trend changes in the past. Is it reasonable to forecast no chance of the forecast coming above 0? In most settings, probably not. The reason there is no trend uncertainty being captured in the forecast is because all of the trend uncertainty is happening below 0, as can be seen in the components plot:

fig = m.plot_components(fcst)

prophet2 So all of the trend uncertainty is lost when it is clamped up to 0. # Approach 2: Logistic growth The logistic growth trend has a floor at 0, so the trend will stay positive. It does require specifying a maximum saturation value as well, which could be set to whatever the expected maximum of the forecast is. It also doesn't ensure the forecast will be positive - it only ensures the trend will be positive. The forecast yhat can still be pushed negative by seasonality. So we must also clip to 0 as above.

# Fit a logistic growth model, and clip
df['cap'] = 1.2 * df['y'].max()
m = Prophet(
    growth='logistic',
    seasonality_mode='multiplicative',
    changepoint_prior_scale=0.5,
).fit(df)
future = m.make_future_dataframe(90)
future['cap'] = 1.2 * df['y'].max()
fcst = m.predict(future)
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
    fcst[col] = fcst[col].clip(lower=0.0)
fig = m.plot(fcst)

prophet3 The uncertainty estimate here isn't really any more satisfactory than that above. This is because the purpose of the logistic growth trend is to saturate; the documentation page is, afterall, called "Saturating Forecasts" (https://facebook.github.io/prophet/docs/saturating_forecasts.html). And so here it naturally saturates at 0, but we may not wish for 0 to be quite so sticky.

Approach 3: Log transform If we log transform y, make a forecast, and then take the exp of the forecast, it is guaranteed to be positive. This also changes the nature of the seasonality: additive seasonality in the log transform space corresponds to multiplicative seasonality in the original space (see https://github.com/facebook/prophet/issues/647#issuecomment-413027578)

# Log-transform the data
df['y'] = np.log(1 + df['y'])
m = Prophet(seasonality_mode='additive').fit(df)
future = m.make_future_dataframe(90)
fcst = m.predict(future)
# Invert the transform
m.history['y'] = np.exp(m.history['y']) - 1
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
    fcst[col] = np.exp(fcst[col]) - 1
fig = m.plot(fcst)

prophet4 This is no better than the other approaches: the components plot shows that there is trend uncertainty, however it is all being squashed out by the exp inverse transform. Also, the exp inverse can produce serious numerical issues, which isn't an issue in this particular forecast but I've run into it in others. It's not a very generally applicable approach.

Approach 4: Negative binomial / Poisson likelihood There has been a lot of discussion around using a negative binomial or Poisson likelihood to handle count data, instead of the Gaussian likelihood currently used (#337 and #1500 have a lot of discussion). I prototyped this by implementing a negative binomial likelihood in #1544. Patching in that PR produces this:

# Negative binomial likelihood
df['cap'] = 1.2 * df['y'].max()
m = Prophet(
    growth='logistic',
    seasonality_mode='multiplicative',
    likelihood='NegBinomial',
    changepoint_prior_scale=0.5,
).fit(df)
future = m.make_future_dataframe(90)
future['cap'] = 1.2 * df['y'].max()
fcst = m.predict(future)
fig = m.plot(fcst)

prophet5 This produces results similar to what is seen above. Generally, the negative binomial likelihood seems like it would be appropriate for this type of small-integer data, but I'm not sold on it based on my experience so far. There is a lot of discussion on this in #1500, but on the example problems there its performance was underwhelming. It also involves a exp() transform in the likelihood (a hinge transform), which can produce numerical issues, and did in one of the datasets there. So I don't think it's going to be a robust and reliable strategy in its current state.

Approach 5: A positive trend model The default Prophet trend is a piecewise linear function. The problem we're dealing with here is that there is nothing to prevent the trend from going negative, and simply clamping it to 0 wipes out all of the future trend uncertainty.

Trend uncertainty is estimated with Monte Carlo sampling, by sampling future trends with the following simulation:

At each future time, sample whether or not there will be a trend change from a Poisson distribution (whose rate is estimated during model fitting).

If there is a trend change, sample the magnitude of the trend change from a Laplace distribution (whose scale is estimated during model fitting). Update the trend with that change, and continue forward in time. This procedure is described in Section 3.1.4 of the paper (https://peerj.com/preprints/3190.pdf). Suppose we modify this generative model to disallow trend changes that take the trend negative. Basically, when we are simulating a future trend, when it hits 0 and starts to go negative, we will instead add a new trend change that keeps it at 0. Then, positive future trend changes will still be able to take the trend positive again (it isn't stuck below 0 like above with clipping), and negative future trend changes will simply have no effect. This can be implemented by modifying the function that Prophet uses for calculating the piecewise linear trend. This class implements it:

class ProphetPos(Prophet):

@staticmethod
def piecewise_linear(t, deltas, k, m, changepoint_ts):
    """Evaluate the piecewise linear function, keeping the trend
    positive.

    Parameters
    ----------
    t: np.array of times on which the function is evaluated.
    deltas: np.array of rate changes at each changepoint.
    k: Float initial rate.
    m: Float initial offset.
    changepoint_ts: np.array of changepoint times.

    Returns
    -------
    Vector trend(t).
    """
    # Intercept changes
    gammas = -changepoint_ts * deltas
    # Get cumulative slope and intercept at each t
    k_t = k * np.ones_like(t)
    m_t = m * np.ones_like(t)
    for s, t_s in enumerate(changepoint_ts):
        indx = t >= t_s
        k_t[indx] += deltas[s]
        m_t[indx] += gammas[s]
    trend = k_t * t + m_t
    if max(t) <= 1:
        return trend
    # Add additional deltas to force future trend to be positive
    indx_future = np.argmax(t >= 1)
    while min(trend[indx_future:]) < 0:
        indx_neg = indx_future + np.argmax(trend[indx_future:] < 0)
        k_t[indx_neg:] -= k_t[indx_neg]
        m_t[indx_neg:] -= m_t[indx_neg]
        trend = k_t * t + m_t
    return trend

def predict(self, df=None):
    fcst = super().predict(df=df)
    for col in ['yhat', 'yhat_lower', 'yhat_upper']:
        fcst[col] = fcst[col].clip(lower=0.0)
    return fcst

It is used the same way as the usual Prophet class:

# Fit the ProphetPos model
m = ProphetPos(seasonality_mode='multiplicative').fit(df)
future = m.make_future_dataframe(90)
fcst = m.predict(future)
fig = m.plot(fcst)

prophet6 With this model, unlike all of the other approaches, the trend has the possibility to become positive again. It also avoids some of the downsides of other approaches: there is no requirement to specify a cap, and there are no numerically-unstable transforms.

Summary In settings where the trend saturates to 0 and we don't expect it to come back up, the simplest approach of fitting a default model and clipping to 0 (approach 1) may be the best. Logistic growth, log transform, and NB likelihood all come with potential for issues and didn't really do anything better than simple clipping here. If the trend may come back up, then only the ProphetPos approach can capture that well. I've tried this approach on a number of time series and so far have found it to be the best for the time series I've looked at, relative to these other strategies. I'd be very interested to hear if anyone else is able to try out ProphetPos and see how it works!

raffg commented 4 years ago

I've been following several of these threads and thought I'd weigh in now to test out the ProphetPos model. I've done some work with Divvy bike share data and ran into issues with negative count forecasts on it. The data is here: https://www.kaggle.com/yingwurenjian/chicago-divvy-bicycle-sharing-data. I aggregated to the count of rides per hour:

Using linear growth with multiplicative seasonality was my first model:

model = Prophet(seasonality_mode='multiplicative')
model.add_country_holidays(country_name='US')
model.fit(df)
future = model.make_future_dataframe(periods=365 * 24, freq='h')
forecast = model.predict(future)
fig = model.plot(forecast)
plt.title('Linear model')
plt.show()

Besides the negative predicted values, the other issue I see here is that it seems to have decided upon a flat trend. From the components plot, I can see that the trend is showing yearly seasonality, so maybe messing with the changepoints or their priors a bit to make them less sensitive could help. Only the logistic model avoided this issue.

At any rate, that's not what I wanted to test here.

I tried clipping negative values:

model = Prophet(seasonality_mode='multiplicative')
model.add_country_holidays(country_name='US')
model.fit(df)
future = model.make_future_dataframe(periods=365 * 24, freq='h')
forecast = model.predict(future)
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
    forecast[col] = forecast[col].clip(lower=0.0)
fig = model.plot(forecast)
plt.title('Linear model with clipped prediction')
plt.show()

Logistic growth:

df['cap'] = 1.2 * df['y'].max()

model = Prophet(growth='logistic', seasonality_mode='multiplicative')
model.add_country_holidays(country_name='US')
model.fit(df)
future = model.make_future_dataframe(periods=365 * 24, freq='h')
future['cap'] = 1.2 * df['y'].max()
forecast = model.predict(future)
fig = model.plot(forecast, plot_cap=False)
plt.title('Logistic model')
plt.show()

Still predicts negative values due to the yearly, weekly, and daily seasonalities, but at least this model got the trend correct.

And log transform:

df['y'] = np.log(1 + df['y'])

model = Prophet(seasonality_mode='additive')
model.add_country_holidays(country_name='US')
model.fit(df)
future = model.make_future_dataframe(periods=365 * 24, freq='h')
forecast = model.predict(future)

model.history['y'] = np.exp(model.history['y']) - 1
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
    forecast[col] = np.exp(forecast[col]) - 1

fig = model.plot(forecast)
plt.title('Linear model with log transform')
plt.show()

The trend has a negative slope in the final year of data and in the forecasted year, which really doesn't look correct to me.

Finally, the positive trend model:

model = ProphetPos(seasonality_mode='multiplicative')
model.add_country_holidays(country_name='US')
model.fit(df)
future = model.make_future_dataframe(periods=365 * 24, freq='h')
forecast = model.predict(future)
fig = model.plot(forecast)
plt.title('ProphetPos model')
plt.show()

This looks the best to me of the options presented (I didn't try the NB/Poisson option). I do like how the trend uncertainty levels out at 0 on the lower bound, even though I'm not happy with the general flatness of the trend, particularly in the final year, or the seasonality it still displays. I think I could probably fix that with some parameter tuning though.

bletham commented 4 years ago

Thanks for sharing these results! I looked at a similar bike share dataset in #1500 and drew a similar conclusion. The fit to the seasonality is pretty poor because there is yearly seasonality in the magnitude of daily seasonality, something that isn't covered by the stationarity daily seasonality. I'm not sure the best way to try to model such a pattern in Prophet.

numeric-lee commented 3 years ago

@bletham thanks for this helpful very comment. I tested ProphetPos on 37 real world time series. In almost every case I got the same yhat (zero) as with clipping and a slightly higher upper bound. There are some hypothetical cases where ProphetPos gives a positive yhat when Prophet is negative. My suggestion to most users is to stick with clipping for now.

For this time series ProphetPos returns yhat=0 and upper bound (1 std deviation) of 9.8 Do those conform to your expectations?

candalfigomoro commented 3 years ago

@bletham Is there a plan to merge ProphetPos into the prophet library?

bletham commented 3 years ago

@candalfigomoro Not at the moment, particularly since it can be used without requiring any changes or monkeypatching to the package. If it ends up being sufficiently useful I think the first step would be to add it to the documentation (though actually a reference to this issue and the discussion within may be worth linking to from the documents already at this point!).

yasirroni commented 2 years ago

@bletham is there any plan to make ProphetPos into separate library/package/repo then? And push it to pypi maybe. If not, may I or other take it (including adding MIT License to the repo)? Thank you.

AlexSiormpas commented 2 years ago

Hello fellow forecasters! :) Would there be an option to apply a solution similar to the one of the forecast package in R: https://robjhyndman.com/hyndsight/forecasting-within-limits/ Specifying λ=0 works without the need to input a maximum cap on the forecast.

sungla55guy commented 1 year ago

Another approach I've been testing is using the inverse hyperbolic sine IHS transformation which allows for zeros in the training data per this post https://davegiles.blogspot.com/2019/03/forecasting-after-inverse-hyperbolic.html. Estimation of theta can be determined using mle or a quick estimation can be done by scaling the data by mean or std.

facebook / prophet

Strategies for positive predictions #1668