facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License

add_regressor with weather,holiday. Got TypeError: unsupported operand type(s) for +: 'float' and 'str' #741

Closed eromoe closed 5 years ago

eromoe commented 5 years ago

Hi,

I have run into a strange situation. I am trying to fit my data on a rolling (scrolling) window to see whether Prophet is good enough. My code looks like this:

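from fbprophet import Prophet

# ts, df_extra and scroll_size are the arguments of the enclosing function
df = ts.to_frame('y')
df.index = df.index.set_names(['ds'])
df.sort_index(inplace=True)

# attach the extra regressor columns (weather, holiday)
if df_extra is not None:
    df = df.join(df_extra)

df = df.fillna('0')

df = df.reset_index()

results = []

# fit on the first i rows, then predict row i
for i in range(scroll_size, df.shape[0]-1):
    m = Prophet(yearly_seasonality=False)
    for col in df_extra.columns:
        m.add_regressor(col, mode='multiplicative')

    try:
        m.fit(df.iloc[:i])
        future = df.iloc[i:i+1]
        forecast = m.predict(future)
    except:
        # drop into an interactive shell to inspect the failing slice
        from IPython import embed; embed()
    results.append(forecast[['ds', 'yhat']])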

But I got this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds)
    127                 else:
--> 128                     result = alt(values, axis=axis, skipna=skipna, **kwds)
    129             except Exception:

C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in nanmean(values, axis, skipna)
    354     count = _get_counts(mask, axis, dtype=dtype_count)
--> 355     the_sum = _ensure_numeric(values.sum(axis, dtype=dtype_sum))
    356

C:\Anaconda3\lib\site-packages\numpy\core\_methods.py in _sum(a, axis, dtype, out, keepdims)
     31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
---> 32     return umr_sum(a, axis, dtype, out, keepdims)
     33

TypeError: unsupported operand type(s) for +: 'float' and 'str'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
e:\pp\mlc\mlc\models\prophet.py in prophet_scroll_train_predict2(ts, df_extra, scroll_size)
     64         try:
---> 65             m.fit(df.iloc[:i])
     66             # future = m.make_future_dataframe(periods=1, include_history=False)

C:\Anaconda3\lib\site-packages\fbprophet\forecaster.py in fit(self, df, **kwargs)
    937
--> 938         history = self.setup_dataframe(history, initialize_scales=True)
    939         self.history = history

C:\Anaconda3\lib\site-packages\fbprophet\forecaster.py in setup_dataframe(self, df, initialize_scales)
    255
--> 256         self.initialize_scales(initialize_scales, df)
    257

C:\Anaconda3\lib\site-packages\fbprophet\forecaster.py in initialize_scales(self, initialize_scales, df)
    315             if standardize:
--> 316                 mu = df[name].mean()
    317                 std = df[name].std()

C:\Anaconda3\lib\site-packages\pandas\core\generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
   9612         return self._reduce(f, name, axis=axis, skipna=skipna,
-> 9613                             numeric_only=numeric_only)
   9614

C:\Anaconda3\lib\site-packages\pandas\core\series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   3220             with np.errstate(all='ignore'):
-> 3221                 return op(delegate, skipna=skipna, **kwds)
   3222

C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in _f(*args, **kwargs)
     76                 with np.errstate(invalid='ignore'):
---> 77                     return f(*args, **kwargs)
     78             except ValueError as e:

C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds)
    130                 try:
--> 131                     result = alt(values, axis=axis, skipna=skipna, **kwds)
    132                 except ValueError as e:

C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in nanmean(values, axis, skipna)
    354     count = _get_counts(mask, axis, dtype=dtype_count)
--> 355     the_sum = _ensure_numeric(values.sum(axis, dtype=dtype_sum))
    356

C:\Anaconda3\lib\site-packages\numpy\core\_methods.py in _sum(a, axis, dtype, out, keepdims)
     31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
---> 32     return umr_sum(a, axis, dtype, out, keepdims)
     33

TypeError: unsupported operand type(s) for +: 'float' and 'str'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds)
    127                 else:
--> 128                     result = alt(values, axis=axis, skipna=skipna, **kwds)
    129             except Exception:

C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in nanmean(values, axis, skipna)
    354     count = _get_counts(mask, axis, dtype=dtype_count)
--> 355     the_sum = _ensure_numeric(values.sum(axis, dtype=dtype_sum))
    356

C:\Anaconda3\lib\site-packages\numpy\core\_methods.py in _sum(a, axis, dtype, out, keepdims)
     31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
---> 32     return umr_sum(a, axis, dtype, out, keepdims)
     33

TypeError: unsupported operand type(s) for +: 'float' and 'str'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
e:\pp\mlc\mlc\models\prophet.py in <module>()
----> 1 m.fit(df.iloc[:i])

C:\Anaconda3\lib\site-packages\fbprophet\forecaster.py in fit(self, df, **kwargs)
    936         self.history_dates = pd.to_datetime(df['ds']).sort_values()
    937
--> 938         history = self.setup_dataframe(history, initialize_scales=True)
    939         self.history = history
    940         self.set_auto_seasonalities()

C:\Anaconda3\lib\site-packages\fbprophet\forecaster.py in setup_dataframe(self, df, initialize_scales)
    254         df.reset_index(inplace=True, drop=True)
    255
--> 256         self.initialize_scales(initialize_scales, df)
    257
    258         if self.logistic_floor:

C:\Anaconda3\lib\site-packages\fbprophet\forecaster.py in initialize_scales(self, initialize_scales, df)
    314                     standardize = True
    315             if standardize:
--> 316                 mu = df[name].mean()
    317                 std = df[name].std()
    318                 self.extra_regressors[name]['mu'] = mu

C:\Anaconda3\lib\site-packages\pandas\core\generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
   9611                                       skipna=skipna)
   9612         return self._reduce(f, name, axis=axis, skipna=skipna,
-> 9613                             numeric_only=numeric_only)
   9614
   9615     return set_function_name(stat_func, name, cls)

C:\Anaconda3\lib\site-packages\pandas\core\series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   3219                                           'numeric_only.'.format(name))
   3220             with np.errstate(all='ignore'):
-> 3221                 return op(delegate, skipna=skipna, **kwds)
   3222
   3223         return delegate._reduce(op=op, name=name, axis=axis, skipna=skipna,

C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in _f(*args, **kwargs)
     75             try:
     76                 with np.errstate(invalid='ignore'):
---> 77                     return f(*args, **kwargs)
     78             except ValueError as e:
     79                 # we want to transform an object array

C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds)
    129             except Exception:
    130                 try:
--> 131                     result = alt(values, axis=axis, skipna=skipna, **kwds)
    132                 except ValueError as e:
    133                     # we want to transform an object array

C:\Anaconda3\lib\site-packages\pandas\core\nanops.py in nanmean(values, axis, skipna)
    353         dtype_count = dtype
    354     count = _get_counts(mask, axis, dtype=dtype_count)
--> 355     the_sum = _ensure_numeric(values.sum(axis, dtype=dtype_sum))
    356
    357     if axis is not None and getattr(the_sum, 'ndim', False):

C:\Anaconda3\lib\site-packages\numpy\core\_methods.py in _sum(a, axis, dtype, out, keepdims)
     30
     31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
---> 32     return umr_sum(a, axis, dtype, out, keepdims)
     33
     34 def _prod(a, axis=None, dtype=None, out=None, keepdims=False):

TypeError: unsupported operand type(s) for +: 'float' and 'str'

Then I set a breakpoint and dumped the data:

df.iloc[:i-1] is fine: 1.xlsx

df.iloc[:i] is bad: 2.xlsx

I found that the extra regressor values in df.iloc[i] are 0 0 0 0 0 0. Could that be the cause? Training works fine without add_regressor.

bletham commented 5 years ago

It looks like one of the columns in df_extra has a string dtype. Could you check:

df_extra.dtypes
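
To illustrate how a string column can appear here, the following is a minimal sketch, assuming df_extra held numeric values with a few missing rows (the column names and values are made up for illustration). Filling the gaps with the string '0' instead of the number 0 leaves the columns with a non-numeric dtype, which is what the dtypes check would show.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_extra: numeric regressors with missing rows.
df_extra = pd.DataFrame({
    "temp": [21.5, np.nan, 19.0],
    "holiday": [0.0, 1.0, np.nan],
})

as_str = df_extra.fillna("0")  # string fill value, as in the reported code
as_num = df_extra.fillna(0)    # numeric fill value

print(as_str.dtypes)  # columns are no longer numeric
print(as_num.dtypes)  # columns stay numeric

# List the columns that would break a numeric reduction such as mean().
bad_cols = [c for c in as_str.columns
            if not pd.api.types.is_numeric_dtype(as_str[c])]
print(bad_cols)
```

If that is what happened, using fillna(0) in place of fillna('0') should keep the regressor columns numeric.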

bletham commented 5 years ago

The issue is that we convert the extra regressor columns to numeric, but that happens after initializing the scales (which is where this error is being raised). That needs to be moved up to before we initialize scales.

https://github.com/facebook/prophet/blob/master/python/fbprophet/forecaster.py#L290
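
For versions that still initialize the scales before the numeric conversion, one possible workaround is to convert the regressor columns yourself before calling fit. The following is a minimal sketch under that assumption. `coerce_regressors` is a hypothetical helper written for this example and is not part of Prophet, and the sample frame is made up.

```python
import pandas as pd

def coerce_regressors(df, regressor_cols):
    """Return a copy of df with the given columns converted to a numeric dtype."""
    out = df.copy()
    for col in regressor_cols:
        # The default errors="raise" surfaces values that cannot be parsed as numbers.
        out[col] = pd.to_numeric(out[col])
    return out

# Made-up frame: 'temp' is an object column with a string mixed in.
df = pd.DataFrame({
    "ds": pd.date_range("2018-01-01", periods=3),
    "y": [10.0, 12.0, 11.0],
    "temp": [21.5, "0", 19.0],
})

clean = coerce_regressors(df, ["temp"])
print(clean["temp"].dtype)
print(clean["temp"].mean())
```

The cleaned frame can then be passed to m.fit in place of the original one. Raising on unparseable values is a deliberate choice here, so that bad regressor data is noticed rather than silently replaced.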

bletham commented 5 years ago

This is fixed in https://github.com/facebook/prophet/commit/13d96cff8f0af2b520bec02bd35884a9a90fc097

bletham commented 5 years ago

Fix pushed to PyPI.