linkedin / greykite

A flexible, intuitive and fast forecasting library
BSD 2-Clause "Simplified" License

Dealing with lots of 0s (zeros) in Greykite Multistage Forecasting #122

Open NTMoshoma opened 1 year ago

NTMoshoma commented 1 year ago

Is there a recommended way to deal with lots of zeros (0s) in a time series DataFrame when using Greykite Multistage Forecasting?

Basically, the business only gathers data between 08:00 and 16:30 from Monday to Saturday, and between 08:00 and 13:00 on Sunday.

The data is collected every 30 minutes, and the business would like to forecast at a 30-minute frequency.

As a result, the DataFrame is filled with 0s during the hours when the business is not operating. I have tried removing non-business hours, resampling, and a log transformation, but none of these seems to work very well.

pjgaudre commented 1 year ago

What you tried seems reasonable to me. If the zeros are predictable, i.e. determined entirely by the non-operating hours, you could post-process the Greykite forecasts to set the values to zero at those times. Otherwise, I would suggest resampling to a less sparse time frequency, forecasting at that frequency, and then using historical proportions to go back to a 30-minute frequency.
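Both suggestions can be sketched in plain pandas, independent of any Greykite call. This is only an illustration under stated assumptions: the forecast frame is assumed to have a timestamp column `ts` and a prediction column `forecast`, the history frame a value column `y`, and both the opening and closing timestamps (08:00 and 16:30, or 13:00 on Sunday) are assumed to count as operating slots. The helper names `is_open`, `zero_closed_hours`, `slot_proportions`, and `disaggregate` are hypothetical and not part of Greykite.

```python
import numpy as np
import pandas as pd


def is_open(ts: pd.Series) -> pd.Series:
    """True where a timestamp falls inside the assumed operating hours."""
    minutes = ts.dt.hour * 60 + ts.dt.minute
    sunday = ts.dt.dayofweek == 6  # Monday=0 ... Sunday=6
    # Assumed closing times: 13:00 on Sunday, 16:30 on every other day.
    close = np.where(sunday, 13 * 60, 16 * 60 + 30)
    return (minutes >= 8 * 60) & (minutes <= close)


def zero_closed_hours(forecast_df, time_col="ts", value_col="forecast"):
    """Option 1: overwrite forecasts with 0 outside operating hours."""
    out = forecast_df.copy()
    out.loc[~is_open(out[time_col]), value_col] = 0.0
    return out


def slot_proportions(history_df, time_col="ts", value_col="y"):
    """Average share of the daily total per (day of week, minute of day)."""
    df = history_df.copy()
    df["date"] = df[time_col].dt.normalize()
    df["dow"] = df[time_col].dt.dayofweek
    df["slot"] = df[time_col].dt.hour * 60 + df[time_col].dt.minute
    daily_total = df.groupby("date")[value_col].transform("sum")
    # Days with a zero total give NaN shares, which mean() skips.
    df["share"] = df[value_col] / daily_total
    share = df.groupby(["dow", "slot"])["share"].mean()
    # Renormalise so the shares for each day of week sum to 1.
    return share / share.groupby(level="dow").transform("sum")


def disaggregate(daily_forecast, shares, time_col="ts", value_col="forecast"):
    """Option 2: split daily forecasts into slots using historical shares."""
    rows = []
    for day, total in zip(daily_forecast[time_col], daily_forecast[value_col]):
        for slot, share in shares.loc[day.dayofweek].items():
            rows.append({
                time_col: day + pd.Timedelta(minutes=int(slot)),
                value_col: total * share,
            })
    return pd.DataFrame(rows)
```

With the first option, the model is still trained at the 30-minute frequency and only the output is corrected. With the second, the model would be trained on daily totals (for example after `history.resample("D", on="ts")["y"].sum()`), and `disaggregate` spreads each daily forecast back over the 30-minute slots, so closed slots receive a share of zero as long as the history is zero there.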