jrmontag / STLDecompose

A Python implementation of Seasonal and Trend decomposition using Loess (STL) for time series data.
MIT License
180 stars 49 forks

stldecompose.forecast() fails on a dataframe with offsetalias/freq=T #8

Closed: jamiekris closed this issue 5 years ago

jamiekris commented 6 years ago

I have been looking at this project lately and noticed that the forecast function fails when I pass a dataframe that has a frequency offset of a minute (not exactly a minute, but any frequency that is a multiple of T/min).

I tried to look closer, but could not figure out the exact reason. Even the usage example (https://github.com/jrmontag/STLDecompose/blob/master/STL%20usage%20example.ipynb) breaks when I resample with frequency in minutes. Any help/pointers would be great.

jrmontag commented 6 years ago

Thanks for flagging this issue @jamiekris. I'll have a look when I get a chance.

lovelyzoo commented 6 years ago

I have encountered a similar problem with hourly data. In my case the problem is on line 104 of stl.py:

ix_start = stl.observed.index[-1] + pd.Timedelta(1, stl.observed.index.freqstr)

The issue is that index.freqstr is an offset alias, while the pandas.Timedelta unit parameter only accepts the strings {'ns', 'us', 'ms', 's', 'm', 'h', 'D'}.

So, in my case, replacing the above with:

ix_start = stl.observed.index[-1] + pd.Timedelta(1, 'h')

works as a hacky workaround. I expect that pd.Timedelta(1, 'm') would work for you @jamiekris.

The correct solution will probably be something along the lines of using a dictionary to convert from freqstr to unit.
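A minimal sketch of the dictionary idea mentioned above, for illustration only. The names `ALIAS_TO_UNIT` and `freqstr_to_timedelta_args` are hypothetical and are not part of STLDecompose or pandas. The table covers only fixed-length aliases and assumes the frequency string is an optional integer multiple followed by an alias (for example `5T` or `15min`).

```python
import re

# Hypothetical lookup table: pandas offset alias -> pandas.Timedelta unit.
# Only fixed-length aliases are listed; legacy and newer spellings both appear.
ALIAS_TO_UNIT = {
    "D": "D",
    "H": "h", "h": "h",
    "T": "m", "min": "m",
    "S": "s", "s": "s",
    "L": "ms", "ms": "ms",
    "U": "us", "us": "us",
    "N": "ns", "ns": "ns",
}

# An optional integer multiple followed by a purely alphabetic alias.
_FREQSTR = re.compile(r"^(\d*)([A-Za-z]+)$")


def freqstr_to_timedelta_args(freqstr):
    """Return (value, unit) suitable for pd.Timedelta(value, unit).

    Raises ValueError for aliases that have no fixed length or that this
    sketch does not know about (for example 'MS' or 'W-SUN').
    """
    match = _FREQSTR.match(freqstr)
    if match is None or match.group(2) not in ALIAS_TO_UNIT:
        raise ValueError(f"no fixed Timedelta unit for frequency {freqstr!r}")
    multiple = int(match.group(1) or 1)
    return multiple, ALIAS_TO_UNIT[match.group(2)]
```

With this helper the failing line could become something like `pd.Timedelta(*freqstr_to_timedelta_args(stl.observed.index.freqstr))`. A lookup table like this cannot cover calendar-based frequencies such as month start, because those do not correspond to a fixed duration.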

jrmontag commented 6 years ago

Thanks for the follow up @lovelyzoo. That makes sense, and it looks like something I didn't catch when I was working out the example. I appreciate the pointers - I'll look into those further.

amir-rafieian commented 6 years ago

Hi, I have the same issue when the dataset is on a weekly basis; that is, my datetime index looks something like:

DatetimeIndex(['2014-01-19', '2014-01-26', '2014-02-02', '2014-02-09',
               '2014-02-16', '2014-02-23', '2014-03-02', '2014-03-09',
               '2014-03-16', '2014-03-23'

So I set the datetime index freq to weekly:

dataset.index.freq = 'W'

When I check the index, the freq is: freq='W-SUN'

After that I ran the forecast function, but I got this error: ValueError: cannot cast unit W-SUN

ShenbagaKumar commented 6 years ago

Hi,

I ran into the same issue when the datetime index frequency was MS, Month Start Frequency (http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases)

Error Message: ValueError: cannot cast unit MS

Thanks

lmssdd commented 5 years ago

A very simple fix would be:

ix_start = stl.observed.index[-1] + 1
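A hedged note on the `+ 1` suggestion above: as far as I know, recent pandas releases no longer allow adding a bare integer to a Timestamp, and adding the index's own offset object is the closest equivalent. The sketch below assumes the index has a `freq` set; `next_timestamp` is a hypothetical helper name, not something from STLDecompose.

```python
import pandas as pd


def next_timestamp(index):
    """Return the timestamp one frequency step after the last index entry."""
    if index.freq is None:
        raise ValueError("the index needs a freq to step forward from")
    # Timestamp + DateOffset works for fixed offsets (minutes, hours) and
    # for calendar-based ones (weekly anchored, month start).
    return index[-1] + index.freq


# Weekly index anchored on Sunday, as in the example earlier in this thread.
weekly = pd.date_range("2014-01-19", periods=10, freq="W-SUN")
ix_start = next_timestamp(weekly)
```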

ZJUguquan commented 5 years ago

@jrmontag Just change one line of code from

ix_start = stl.observed.index[-1] + pd.Timedelta(1, stl.observed.index.freqstr) 

to

ix_start = stl.observed.index[-1] + pd.Timedelta(stl.observed.index.freq.nanos)

chr-istoph commented 5 years ago

This should do it for any unit used:

ix_start = stl.observed.index[-1] + pd.Timedelta(1, freq=stl.observed.index.freqstr)

https://github.com/jrmontag/STLDecompose/pull/10

jrmontag commented 5 years ago

Thanks for reporting this issue, folks. I've updated the way the forecast index is created in https://github.com/jrmontag/STLDecompose/commit/05e6dcd381bf9d49d514f6c93289307bf8b14499 and tested it against the example intervals mentioned in this thread (using .resample() on the example data set before forecasting).
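For readers who want to see the general idea, here is a minimal sketch of one way to build a forecast index without going through pd.Timedelta. It is an assumption about the approach, not a copy of the linked commit. `forecast_index` is a hypothetical helper name, and it assumes the observed index has a `freq` set.

```python
import pandas as pd


def forecast_index(observed_index, steps):
    """Return `steps` future timestamps that continue `observed_index`."""
    freq = observed_index.freq
    if freq is None:
        raise ValueError("observed index needs a freq, e.g. from .resample()")
    # Generate steps + 1 points starting at the last observation, then drop
    # that first point so only future timestamps remain.
    full = pd.date_range(start=observed_index[-1], periods=steps + 1, freq=freq)
    return full[1:]


# The frequencies mentioned in this thread: minutes, hourly, weekly, month start.
for freq in ("15min", "h", "W-SUN", "MS"):
    observed = pd.date_range("2014-01-19", periods=10, freq=freq)
    future = forecast_index(observed, steps=3)
    print(freq, list(future))
```

Because pd.date_range accepts the offset object directly, the same code path handles fixed frequencies and calendar-based ones alike.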