Addition of holiday impact score and custom holidays.

beginner1729 commented 3 years ago

This is more of a discussion and a request for two new features in Prophet:

A scoring system that shows how much impact a holiday had on the predicted value at any given date. Let's call it the Holiday Score.
The ability to add custom holidays at custom dates (future or past), with the customisation done through the Holiday Score.

I am asking for these features based on a use case I recently solved using these constructs. As I am only familiar with the Python implementation of Prophet, I will refer only to the Python code.

In this use case we had to predict the incoming call volume at a call centre. Incoming volumes are heavily affected on certain event days, so we used Prophet's holiday construct to model the volumes on those days.

During prediction, domain experts claimed that a certain future event day would have more impact than what was seen in the training data, about 120% of the past impact. Instead of changing the prediction dataframe and multiplying the holiday column by that percentage, we decided to boost the parameter in model.params['beta'] that controls the holiday effect for the given event day. This was done so that no extra steps are needed after prediction. Following these steps, we found the forecast adherence (Actual Volume / Predicted Volume) to be about 0.97 for that event day.

Changing the param directly also lets us add a new type of holiday event in the future that is similar to previous events but has slightly more or less impact, such as an ad or business campaign that resembles an earlier one.

For the Holiday Score we used linear interpolation over all holiday params to map them onto a 0 to 10 range (directly proportional to the value of the param). This makes them human readable, and a new, unseen holiday can be added just by choosing a score; translating the score back to a param value is handled in the back end.
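A minimal sketch of the score mapping described above, assuming a plain min-max linear rescaling of the fitted holiday coefficients. The function names `params_to_scores` and `score_to_param` and the coefficient values are hypothetical and only for illustration; they are not part of Prophet.

```python
import numpy as np

def params_to_scores(betas, lo=0.0, hi=10.0):
    """Linearly map holiday coefficients onto the [lo, hi] score range."""
    betas = np.asarray(betas, dtype=float)
    b_min, b_max = betas.min(), betas.max()
    if b_max == b_min:
        # All effects are equal: place every holiday at the middle of the range.
        return np.full_like(betas, (lo + hi) / 2.0)
    return lo + (betas - b_min) * (hi - lo) / (b_max - b_min)

def score_to_param(score, betas, lo=0.0, hi=10.0):
    """Invert the mapping; scores outside [lo, hi] extrapolate linearly."""
    betas = np.asarray(betas, dtype=float)
    b_min, b_max = betas.min(), betas.max()
    return b_min + (score - lo) * (b_max - b_min) / (hi - lo)

# Illustrative coefficients for three already-fitted holidays (made-up values).
holiday_betas = [0.05, 0.20, 0.35]
scores = params_to_scores(holiday_betas)       # scores of the known holidays
new_beta = score_to_param(7.0, holiday_betas)  # coefficient for an expert score of 7
```

With such a mapping a domain expert only needs to place the new event on the 0 to 10 scale relative to known events. Note that the scale depends on which holidays are included in `holiday_betas`, so scores are only comparable within one fitted model.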

The suggested addition would mean updating model.train_component_cols, model.holidays and model.params['beta'].

To conclude, the advantages of incorporating these features are:

Adding a new holiday that was not seen in past data, with its effect estimated by a domain expert based on the impact of a previous holiday.
Having a Holiday Score, a normalised value of the holiday params, which makes estimating holiday impact easier.
Having the model store the new holidays, which makes for a smoother deployment pipeline, since the model changes much less often than the predictions.

bletham commented 3 years ago

This is an interesting idea - basically adding in holiday effects for future holidays that we haven't observed in the past, by using domain expertise to specify their value relative to the effect size seen for comparable holidays that have been observed in the past.

You've clearly thought a lot about this problem and I don't think I have much to add to your analysis of the pros/cons of various options. As you note, this can be done by directly adding the holiday effect into the forecast dataframe (which would be a purely pandas operation), but I can see how adding it as a model parameter could make productionization easier.
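For comparison, the pandas-only route could look roughly like the sketch below. It works on a toy dataframe rather than a real Prophet forecast, and it assumes an additive holiday effect stored in a column named after the holiday alongside `ds` and `yhat`. The helper name `boost_holiday_in_forecast` and the `campaign` column are hypothetical.

```python
import pandas as pd

def boost_holiday_in_forecast(forecast, holiday_col, dates, factor):
    """Return a copy of forecast with one additive holiday effect rescaled on the given dates."""
    out = forecast.copy()
    mask = out["ds"].isin(pd.to_datetime(dates))
    # Extra effect implied by the expert's factor (e.g. 1.2 for 120%).
    extra = (factor - 1.0) * out.loc[mask, holiday_col]
    out.loc[mask, holiday_col] += extra
    out.loc[mask, "yhat"] += extra
    return out

# Toy stand-in for a forecast dataframe (made-up values).
forecast = pd.DataFrame({
    "ds": pd.to_datetime(["2021-03-01", "2021-03-02"]),
    "campaign": [0.0, 50.0],
    "yhat": [100.0, 150.0],
})
adjusted = boost_holiday_in_forecast(forecast, "campaign", ["2021-03-02"], 1.2)
```

This sketch leaves any uncertainty-interval columns untouched and would need a different formula for multiplicative holiday effects.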

I'm pretty hesitant to add functionality for directly modifying the fitted model parameters; it seems like something that would be easy to mess up. I think the best path forward would be for you to post code here for making the necessary modifications to the model. It seems to me that all of this could be done via utility functions without having to modify any of the core package code. We could then link to this issue from the documentation so that advanced users who run into this same problem will be able to make use of the code you've provided, and then in the future if this starts to come up frequently we can evaluate folding it into the package.

beginner1729 commented 3 years ago

Thanks for the response @bletham. I understand your concern about playing around with the trained parameters; if something goes wrong, one may have to retrain the whole model.

As you suggested, I will post the code snippets here. My knowledge is limited to the Python implementation, so my code will be Python only.

Addition of new holiday

1. Update the holidays dataframe: model.holidays = pd.concat([model.holidays, new_holiday_df]), where new_holiday_df contains the date and name of the new holiday in the same format as model.holidays.
2. Update train_component_cols. This is a cross-tab matrix that fixes the position of each parameter in model.params['beta']. To do that, set model.train_component_cols = None and run
```
# Rebuild the feature/component mapping so it includes the new holiday
_, _, component_cols, _ = model.make_all_seasonality_features(model.history)
model.train_component_cols = component_cols
```

3. Find the param position of the new holiday.

param_pos = list(model.train_component_cols[new_holiday_name]).index(1)

4. Insert the new holiday param into model.params['beta'] so that it sits at param_pos.

new_params = list(model.params['beta'][0])
new_params.insert(param_pos, new_holiday_param)
model.params['beta'] = np.array([new_params])
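Steps 3 and 4 can be wrapped into a small utility and checked on toy data without a fitted model. The sketch below assumes `beta` has shape (n_samples, n_features), as with a single row from MAP fitting, and that the new holiday occupies exactly one feature column (no lower or upper window). The helper name `insert_holiday_beta` and the toy values are hypothetical.

```python
import numpy as np
import pandas as pd

def insert_holiday_beta(beta, component_cols, holiday_name, new_value):
    """Insert new_value into every row of beta at the column the holiday occupies."""
    # Position of the first feature column flagged for this holiday.
    param_pos = list(component_cols[holiday_name]).index(1)
    # Insert along the feature axis so every row of beta is extended.
    return np.insert(np.asarray(beta, dtype=float), param_pos, new_value, axis=1)

# Toy stand-in for model.train_component_cols after it has been rebuilt:
# one row per feature column, one 0/1 column per component.
component_cols = pd.DataFrame({
    "yearly": [1, 1, 0, 0],
    "new_campaign": [0, 0, 1, 0],
    "old_campaign": [0, 0, 0, 1],
})
old_beta = np.array([[0.1, 0.2, 0.5]])  # fitted before new_campaign existed
new_beta = insert_holiday_beta(old_beta, component_cols, "new_campaign", 0.6)
```

After the insert, the number of columns in `new_beta` should equal the number of rows in `component_cols`; a mismatch means the two are out of sync.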

Updating an existing holiday

For this, one only has to perform steps 3 and 4, as the holiday is already present in model.holidays and model.train_component_cols. In step 4 the existing value at param_pos is overwritten rather than a new one inserted.
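For an existing holiday, the 120% boost mentioned earlier in the thread could be sketched as below. It uses toy data and assumes `beta` has shape (n_samples, n_features); the helper name `scale_holiday_beta` and the values are hypothetical.

```python
import numpy as np
import pandas as pd

def scale_holiday_beta(beta, component_cols, holiday_name, factor):
    """Return a copy of beta with the holiday's coefficients multiplied by factor."""
    # All feature columns flagged for this holiday (window days included, if any).
    cols = np.flatnonzero(np.asarray(component_cols[holiday_name]) == 1)
    scaled = np.array(beta, dtype=float)  # np.array copies, so beta is left unchanged
    scaled[:, cols] *= factor
    return scaled

# Toy stand-in for model.train_component_cols (made-up layout).
component_cols = pd.DataFrame({
    "yearly": [1, 1, 0],
    "event_day": [0, 0, 1],
})
beta = np.array([[0.1, 0.2, 0.5]])
boosted = scale_holiday_beta(beta, component_cols, "event_day", 1.2)
```

Because the coefficient is shared by every occurrence of the holiday, this changes the predicted effect for all of its dates, not just one future date.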

Note :

Finding the value of the holiday param can be tricky. I used linear interpolation over the existing params for other holidays, then extrapolated or interpolated based on domain expert feedback or business requirements. I leave that part to the end user.

bletham commented 3 years ago

This is great, thanks!

facebook / prophet

Addition of holiday impact score and custom holidays. #1873