Open rajnish-garg opened 4 years ago
I think the general answer is that you actually can't. Your forecasting model assumes a level of stationarity in your series and for heavily impacted series Covid-19 shock just breaks it. Maybe a model with a regime change might help but we haven't got any past pandemics to learn from. Time series that I am working on all have their seasonalities broken with European population in lockdown, there's not much you can do other than maybe switch to a short-term ARIMA.
Although I believe it might be a lost cause to try to model anything in the short-term, I am wondering what strategies people might use to mitigate the shock after the pandemic. Just set pandemic period to missing data and hope for the best? Can we do better?
+1 for @rbagd's comment, Prophet assumes stationary seasonalities which is probably untrue for most human-related time series right now. In some areas there may be a long enough period of lockdown to check how stationary the seasonalities are within the lockdown period, and potentially fit a separate model just to that period. But there are so many frequent external shocks to the series (policy changes, etc.) that if things have stabilized, they probably won't stay stable for long.
As for after - I think we'll have to see what will end up working. Just throwing out the pandemic period as unuseful data could work but will really depend on what things look like after the pandemic clears up, and how similar they are to how things were before.
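For anyone who wants to try the "treat the pandemic period as missing" idea, here is a minimal sketch. It assumes the usual Prophet input frame with `ds` and `y` columns; the helper name `mask_period`, the toy values and the window dates are placeholders, not anything from this thread.

```python
import pandas as pd

def mask_period(df, start, end):
    """Return a copy of df with y set to missing for start <= ds <= end."""
    out = df.copy()
    out["ds"] = pd.to_datetime(out["ds"])
    in_window = (out["ds"] >= pd.Timestamp(start)) & (out["ds"] <= pd.Timestamp(end))
    out.loc[in_window, "y"] = float("nan")
    return out

# Toy daily series; values and window dates are placeholders.
history = pd.DataFrame({
    "ds": pd.date_range("2020-03-01", periods=10, freq="D"),
    "y": [float(v) for v in range(10)],
})
masked = mask_period(history, "2020-03-04", "2020-03-06")
```

Prophet's outlier documentation describes the same trick: rows with a missing `y` are ignored when fitting, while the dates can still appear in the future frame and receive a prediction. Whether that is good enough depends, as said above, on how similar things look afterwards.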
Just throwing out the pandemic period as unuseful data could work but will really depend on what things look like after the pandemic clears up
@bletham Or input the quarantine/lockdown periods as "holidays" in the model for full dark irony
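Irony aside, this can be written down fairly directly. The sketch below builds a holidays frame in which each lockdown is one long "holiday" spanning `ds` through `ds_upper`. The lockdown dates are hypothetical placeholders, and `fit_with_lockdowns` is an illustrative helper name.

```python
import pandas as pd

# Hypothetical lockdown windows; replace with the dates relevant to your series.
lockdowns = pd.DataFrame([
    {"holiday": "lockdown_1", "ds": "2020-03-21", "ds_upper": "2020-06-06"},
    {"holiday": "lockdown_2", "ds": "2020-11-01", "ds_upper": "2020-12-01"},
])
for col in ("ds", "ds_upper"):
    lockdowns[col] = pd.to_datetime(lockdowns[col])

# Each "holiday" starts on ds and extends upper_window days after it.
lockdowns["lower_window"] = 0
lockdowns["upper_window"] = (lockdowns["ds_upper"] - lockdowns["ds"]).dt.days

def fit_with_lockdowns(df, holidays):
    """Fit Prophet with the lockdown windows passed in as holidays."""
    from prophet import Prophet  # the package was named "fbprophet" before 1.0
    cols = ["holiday", "ds", "lower_window", "upper_window"]
    return Prophet(holidays=holidays[cols]).fit(df)
```

Each window gets its own effect, so the model does not assume a second lockdown hits the series the same way as the first.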
I guess since the covid-19 pandemic has numerous impacts which also vary on the nature of what you are trying to forecast and the business related to it... it's pretty much impossible to define a standard set of rules to insert in a statistical modelling library...
great results here https://medium.com/@andrejusb/covid-19-growth-modeling-and-forecasting-with-prophet-2ff5ebd00c01 I used this model on my application https://profetadocorona.herokuapp.com
I had a discussion with some other forecasters on this question last week. One thing that came up, which I wanted to mention for other people trying to salvage their forecasts, is that you can specify different seasonalities for different periods of time. In particular, it is possible to have e.g. weekly seasonalities that are different "pre-corona" and "during-corona", like here: https://facebook.github.io/prophet/docs/seasonality,_holiday_effects,_and_regressors.html#seasonalities-that-depend-on-other-factors . That could be helpful for some time series.
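A minimal sketch of that setup, assuming daily data and a single cut-off date. `COVID_START`, the column names and the Fourier order of 3 are illustrative choices, not values recommended anywhere in this thread.

```python
import pandas as pd

COVID_START = pd.Timestamp("2020-03-15")  # hypothetical cut-off date

def add_covid_flags(df, covid_start=COVID_START):
    """Add the boolean condition columns that conditional seasonalities need."""
    out = df.copy()
    out["ds"] = pd.to_datetime(out["ds"])
    out["during_covid"] = out["ds"] >= covid_start
    out["pre_covid"] = ~out["during_covid"]
    return out

def build_model():
    """Prophet model with two weekly seasonalities, one per regime."""
    from prophet import Prophet  # the package was named "fbprophet" before 1.0
    m = Prophet(weekly_seasonality=False)  # switch off the shared weekly term
    m.add_seasonality(name="weekly_pre_covid", period=7,
                      fourier_order=3, condition_name="pre_covid")
    m.add_seasonality(name="weekly_during_covid", period=7,
                      fourier_order=3, condition_name="during_covid")
    return m

example = add_covid_flags(
    pd.DataFrame({"ds": ["2020-03-14", "2020-03-15"], "y": [1.0, 2.0]})
)
```

The same two columns have to be present in the future frame passed to `predict`, so `add_covid_flags` would be applied there as well.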
@bletham Hello Ben, thanks for the answer. One question though: in retail the corona shock always has a negative effect on sales, yet after adding the pre-corona and during-corona seasonalities, Prophet seems to mistakenly think that sales peak sharply during corona.
Not sure if I am doing this right, but can you please weigh in here? Thank you!
Here're my parameters:
import pandas as pd
from fbprophet import Prophet

def is_Not_corona_period(ds):
    date = pd.to_datetime(ds)
    return (date.year != 2020)

temp['post_corona'] = ~temp['ds'].apply(is_Not_corona_period)
temp['pre_corona'] = temp['ds'].apply(is_Not_corona_period)

m3 = Prophet(holidays=holidays,
             interval_width=0.99,  # default is 0.80
             holidays_prior_scale=0.25,
             changepoint_prior_scale=0.5,
             seasonality_mode='multiplicative',
             yearly_seasonality=10,
             weekly_seasonality=False,
             daily_seasonality=False)
m3.add_seasonality(name='annually-pre-corona',
                   period=365,
                   fourier_order=10,
                   condition_name='pre_corona')
m3.add_seasonality(name='annually-post-corona',
                   period=365,
                   fourier_order=10,
                   condition_name='post_corona')
and the resulting flyhigh:
@bletham I want to throw an idea into this mix, which I have implemented in linear regression models I maintain at work. Now this isn't using Prophet but simple linear regression but I believe this can be extended.
I am using a log-log linear regression model to estimate customers' price elasticities (I didn't use ARIMAX as that might reduce the price elasticity effect) and then predict into the future. Since the covid period my models have deteriorated considerably. The challenge I had was that training another set of models for covid would mean going back to model risk to agree on the coefficients (price elasticities) and the impact they would have, not to mention feature selection.
Thus, since I couldn't change the coefficients, I introduced a moving average into my linear regression models (an idea borrowed from time series). What this means is that I am adjusting my intercept after fitting my model.
here are the steps I went through:
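My models are in R, so here is the code for what I did.
library(zoo)    # rollmean
library(dplyr)  # lag() with a default argument
# residuals of the fitted model
train_error = train_actual - train_preds_
test_error = test_actual - test_preds_
# shift each prediction by the lagged 2-period rolling mean of past errors
train_preds_ = train_preds_ + lag(as.vector(rollmean(train_error, k = 2, fill = FALSE, align = 'right')), default = 0)
test_preds_ = test_preds_ + lag(as.vector(rollmean(test_error, k = 2, fill = FALSE, align = 'right')), default = 0)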
What do you think about this approach? It has improved my models' performance considerably, as it's adapting and learning from its past errors.
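For readers working in Python, a rough equivalent of the R lines above could look like the sketch below. It assumes `fill = FALSE` pads the rolling mean with zeros, so the correction at each step is the mean of the two preceding errors and is zero until two past errors exist. The function names and the toy numbers are illustrative only.

```python
def lagged_rolling_mean(errors, k=2):
    """Mean of the k errors before each position; 0.0 until k past errors exist."""
    corrections = []
    for i in range(len(errors)):
        window = errors[max(0, i - k):i]  # strictly past errors only
        corrections.append(sum(window) / k if len(window) == k else 0.0)
    return corrections

def adjust_predictions(preds, actuals, k=2):
    """Shift each prediction by the rolling mean of the model's recent errors."""
    errors = [a - p for a, p in zip(actuals, preds)]
    return [p + c for p, c in zip(preds, lagged_rolling_mean(errors, k))]

# Toy numbers for illustration only.
actuals = [10.0, 12.0, 11.0, 13.0]
preds = [8.0, 9.0, 10.0, 10.0]
adjusted = adjust_predictions(preds, actuals)
```

Because only past errors enter each correction, this can be applied one step ahead as new actuals arrive, without refitting the regression coefficients.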
@yuzuhikorunrun the problem there is that you're fitting a yearly seasonality (annually-post-corona) with quite a bit less than a year of data. The 2020 seasonality from June on is thus totally unconstrained, and in this case is blowing up in a really bad way. That's because by default there is very little regularization on the fitted seasonalities. You could set prior_scale in add_seasonality to something small like 0.1, which would clamp down on that resonance, but really it probably doesn't make sense to try to predict yearly seasonality for 2020 distinctly from previous years. In my earlier post I was thinking more like weekly seasonality, where we do have multiple weeks of during-corona and can reasonably fit the seasonality.
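As a small sketch of the regularization option only (not of the recommendation to drop the separate yearly term), the `prior_scale` argument can be passed along with the other `add_seasonality` arguments. The helper name and the dictionary are illustrative; `model` is assumed to be an unfitted Prophet instance.

```python
# Arguments mirroring the post above, plus a tight prior on the seasonality.
SEASONALITY_KWARGS = {
    "name": "annually-post-corona",
    "period": 365,
    "fourier_order": 10,
    "prior_scale": 0.1,  # smaller values shrink the fitted seasonality harder
    "condition_name": "post_corona",
}

def add_regularized_seasonality(model, **overrides):
    """Call model.add_seasonality with the defaults above, allowing overrides."""
    kwargs = {**SEASONALITY_KWARGS, **overrides}
    model.add_seasonality(**kwargs)
    return kwargs
```

Usage would be `add_regularized_seasonality(m3)` before `m3.fit(...)`, or `add_regularized_seasonality(m3, prior_scale=0.05)` to tighten the prior further.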
@shoaibkhanz I guess what you're doing is along the lines of fitting a model to the residuals (in this case the model is a rolling mean). That makes a lot of sense, thanks for sharing!
Thanks @bletham , I am glad that you found that useful.
@bletham Hello, thank you for the prompt reply, I appreciate it. I am wondering if I can set it as a monthly seasonality, since my data is monthly and I only have Jan-March so far. Thank you!!!
@yuzuhikorunrun Monthly seasonality would mean a cycle within a month, and so wouldn't be appropriate here. Unfortunately with monthly data I don't think there is a whole lot that can be learned with just a few months of data. You'll have to let the trend component capture the change due to COVID; because it is a change right at the end of the time series, you might need to increase the changepoint_range to something like 0.95 (see https://facebook.github.io/prophet/docs/trend_changepoints.html for more details).
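To see why the default can miss a shock at the very end, here is a rough sketch of how far into the history potential changepoints reach. It assumes Prophet places them within the first `changepoint_range` fraction of the observations (default 0.8); the series length of 63 months is a hypothetical example.

```python
import math

def changepoint_window(n_obs, changepoint_range):
    """Number of leading observations in which changepoints may be placed."""
    return math.floor(n_obs * changepoint_range)

n_months = 63  # hypothetical: monthly data from Jan 2015 through Mar 2020
default_window = changepoint_window(n_months, 0.8)   # default range
wider_window = changepoint_window(n_months, 0.95)    # suggested range

def build_model(changepoint_range=0.95):
    """Prophet model that allows changepoints later in the history."""
    from prophet import Prophet  # the package was named "fbprophet" before 1.0
    return Prophet(changepoint_range=changepoint_range)
```

With 63 monthly points the default leaves the last 13 months without any candidate changepoint, while 0.95 still leaves the last 4, which may be part of why a shock in the final two or three months is hard to pick up.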
@bletham Unfortunately, setting changepoint_range to 0.95 or 0.9 does not improve the model performance and it failed to capture the sudden decreased sales due to COVID shock in Feb and March (not so much impact in Jan). I do have some interesting (or just lucky) findings and I'd love to hear your thoughts on this.
I have monthly data up to March-2020, and I mistakenly added monthly seasonality to my parameters (my data has multiplicative yearly trend)
def is_corona_period(ds):
    date = pd.to_datetime(ds)
    return (date.year == 2020)

temp['post_corona'] = temp['ds'].apply(is_corona_period)
temp['pre_corona'] = ~temp['ds'].apply(is_corona_period)

m5.add_seasonality(name='monthly-pre-corona',
                   period=30,
                   prior_scale=0.1,
                   fourier_order=10,
                   condition_name='pre_corona')
m5.add_seasonality(name='monthly-post-corona',
                   period=30,
                   prior_scale=0.1,
                   fourier_order=10,
                   condition_name='post_corona')
then, boom, it actually performs better in predicting the Feb and March data (after adding Jan-2020 data to train):
Before-Adding-wrong-seasonality
Also notice here my extra regressor's impact is positive which is expected.
beta-parameters:
After-Adding-wrong-seasonality
Notice my extra-regressor here plays negative instead which is not expected...
a quick comparison with true-value, predict-values.
beta-parameters:
Do you think this is purely luck? And any idea of what might be going on here to make this lucky improvement happen?
Thank you again for making this amazing tool.
-Best.
@shoaibkhanz : It is an interesting idea to use a rolling mean to learn the residuals on top of the current model. It works well as a retrofit, without many changes to the existing set-up. But I wonder whether rolling means are sufficient. By nature, means are sluggish and do not adapt to changes quickly. When things start improving, might the model be slower to respond?
Would it be a good idea to add one more term like the difference of rolling mean errors? say between k=2 and k=3? trainpreds = trainpreds + lag(as.vector(rollmean(train_error,k = 2,fill = FALSE,align = 'right')),default = 0) + {difference between( rollmean k=2, rollmean k=3) }
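One way to read that suggestion, sketched in Python rather than R: add the gap between a short and a long trailing mean of the errors as a crude momentum term. This is an interpretation, not something confirmed in the thread. It assumes both means are lagged so that only past errors are used, and that the extra term is zero until the longer window is full.

```python
def trailing_mean(errors, i, k):
    """Mean of the k errors strictly before position i, or 0.0 if fewer exist."""
    window = errors[max(0, i - k):i]
    return sum(window) / k if len(window) == k else 0.0

def correction_with_momentum(errors, short_k=2, long_k=3):
    """Short rolling-mean correction plus the short-minus-long difference."""
    out = []
    for i in range(len(errors)):
        short = trailing_mean(errors, i, short_k)
        long = trailing_mean(errors, i, long_k)
        # Only add the difference once the longer window is fully populated.
        momentum = short - long if i >= long_k else 0.0
        out.append(short + momentum)
    return out

# Toy errors that keep growing, for illustration only.
errors = [1.0, 2.0, 3.0, 7.0, 8.0]
corrections = correction_with_momentum(errors)
```

When errors are growing, the short mean exceeds the long one and the correction overshoots the plain k=2 mean, which is the faster response being asked about; when errors are noisy, the same term would amplify the noise.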
@shoaibkhanz - how do you fit the rolling means into prophet model?
I am using fbprophet for a monthly sales prediction problem (5 years of historical data with a 12-month-ahead prediction) and have been researching options for dealing with the covid-19 shock.
This Medium article on how to forecast demand despite COVID summarises three options, and here I share their equivalent fixes for fbprophet:
I tried the third option and used PMI (purchasing managers' index) which is a leading economic indicator as additional regressor. It works perfectly in absorbing the covid-19 shock in Q2 this year and it is able to minimise the impact of pandemic on seasonality.
Update: However, the limitations of such indicators are that they are usually available for short term and hence the third option may not work for long term as we need reliable future values. Hence, an alternative is to combine and transform this indicator (that can reflect covid shock in your data) into a binary regressor based on outlier detection approach to specify different seasonalities e.g is_covid, is_not_covid. In this case, the future values can be FALSE assuming is_not_covid.
- Use External Drivers (Even Better Solution): use additional external data or economic indicator as regressor
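The update above (turning an indicator into is_covid / is_not_covid flags) could be sketched as follows. The outlier rule here is a plain z-score against a pre-shock baseline, which is an assumption standing in for whatever detection method was actually used, and the index values are made up for illustration.

```python
from statistics import mean, stdev

def flag_shock_periods(indicator, baseline_len, z_threshold=3.0):
    """Flag values more than z_threshold baseline standard deviations from the baseline mean."""
    baseline = indicator[:baseline_len]
    mu, sigma = mean(baseline), stdev(baseline)
    return [abs(v - mu) > z_threshold * sigma for v in indicator]

# Made-up index values for illustration only (not real PMI data).
pmi_like = [52.0, 51.0, 53.0, 52.0, 51.0, 53.0, 40.0, 35.0, 50.0]
is_covid = flag_shock_periods(pmi_like, baseline_len=6)
is_not_covid = [not flag for flag in is_covid]
```

The two lists would become boolean columns used as `condition_name` for separate seasonalities, and for the forecast horizon `is_covid` would be set to False (and `is_not_covid` to True), as described above.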
Regarding @mikocml answer above, I tried to add Google's mobility reports as a regressor to my data (https://www.google.com/covid19/mobility/)
Unfortunately it did not make much difference for my retail data, but it might be useful for others. My data is from a large supermarket chain where sales increased massively during lockdown.
It has been over a month since the last reply. Has anybody come up with other solutions?
Another source of external data could be Business Confidence Index for your country: https://data.oecd.org/leadind/business-confidence-index-bci.htm#indicator-chart
Only comes in monthly data though. Can this be applied to daily data? It will be one single figure for a whole 30 days.
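Mechanically it can be applied: each day simply inherits the value of its month, giving a step function. A minimal sketch, with made-up index values and an illustrative helper name:

```python
from datetime import date, timedelta

def monthly_to_daily(monthly, start, end):
    """Repeat each month's value for every day of that month (step function)."""
    days = []
    d = start
    while d <= end:
        key = (d.year, d.month)
        if key in monthly:  # skip days whose month has no published value
            days.append((d, monthly[key]))
        d += timedelta(days=1)
    return days

# Made-up index values for illustration only (not real BCI data).
monthly_index = {(2020, 3): 99.0, (2020, 4): 97.5}
daily = monthly_to_daily(monthly_index, date(2020, 3, 30), date(2020, 4, 2))
```

Since the regressor is constant within each month, it can only explain level shifts between months, not any day-to-day variation, and future months still need a value from somewhere before forecasting.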
I have another interesting one that challenges more how Prophet captures and defines the trend. It resembles a bit the question in #697. Basically, my data is Store order data for a large food retailer.
Model :
What I am struggling to capture is the structural changes in the trend and how we can model upcoming changes. Specifically:
Two different elements I struggle to correctly model
When looking at my historical data. Prophet captures
Option I see to help model capture this better :
Ideal would be something like #1789 as that would really capture the reality best (and tackle issue 2 as well). Implementing something as #705 is outside of my skill range
I don't know what others' opinions are.
Our expectation is that the trend will drop again a couple of %-points after restaurants reopen. No idea how to capture this structurally. Options I see:
Regressor route is not an option as model has never seen it. Again #705 would be nice to allow model to capture drop and then force a flat trend after, but still outside my skill range.
Any ideas?
Thanks
https://facebook.github.io/prophet/docs/handling_shocks.html
this may help, cheers
Hi, I am using fbprophet on a daily time series with 5 years of data. Due to the covid-19 pandemic most of our metrics are impacted. Because we use the last 20% of the data for testing, the model is not able to capture these signals. Is there any recommendation for handling these issues?