antoinecarme / pyaf

PyAF is an Open Source Python library for Automatic Time Series Forecasting built on top of popular pydata modules.
BSD 3-Clause "New" or "Revised" License

Failure to build a multiplicative ozone model with Lag1 trend #220

Closed antoinecarme closed 1 year ago

antoinecarme commented 1 year ago

script to reproduce (to be committed)

# ozone dataset
lEngine.mOptions.set_active_trends(["Lag1Trend"]);
lEngine.mOptions.set_active_decomposition_types(["TSR"]);
lEngine.mOptions.set_active_transformations(["None"]);
lEngine.mOptions.set_active_autoregressions(["AR"]);

fails with

/usr/lib/python3/dist-packages/numpy/core/_methods.py:239: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/antoine/dev/python/packages/timeseries/pyaf/pyaf/TS/Perf.py:161: RuntimeWarning: overflow encountered in square
  self.mL2 = np.sqrt(np.mean(abs_error ** 2))
/home/antoine/dev/python/packages/timeseries/pyaf/pyaf/TS/Perf.py:76: RuntimeWarning: overflow encountered in square
  SSRes = np.sum((signal.values - estimator.values)**2)
/home/antoine/dev/python/packages/timeseries/pyaf/pyaf/TS/SignalDecomposition_AR.py:43: RuntimeWarning: overflow encountered in multiply
  df[self.mOutName + '_residue'] = lSignal - (lTrend * lCycle * lAR)
/home/antoine/dev/python/packages/timeseries/pyaf/pyaf/TS/Perf.py:66: RuntimeWarning: invalid value encountered in divide
  self.mSMAPE = np.mean(2.0 * abs_error / sum_abs)
/usr/lib/python3/dist-packages/numpy/core/_methods.py:181: RuntimeWarning: invalid value encountered in reduce
  ret = umr_sum(arr, axis, dtype, out, keepdims, where=where)
/usr/lib/python3/dist-packages/numpy/core/_methods.py:215: RuntimeWarning: invalid value encountered in reduce
  arrmean = umr_sum(arr, axis, dtype, keepdims=True, where=where)
ERROR:pyaf.std:Failure when computing perf ['Ozone_Forecast_12'] 'Ozone_Forecast_12'array must not contain infs or NaNs
Traceback (most recent call last):
  File "/home/antoine/dev/python/packages/timeseries/pyaf/pyaf/TS/Perf.py", line 115, in compute
    return self.real_compute(signal, estimator, name);
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/antoine/dev/python/packages/timeseries/pyaf/pyaf/TS/Perf.py", line 168, in real_compute
    self.mPearsonR = self.compute_pearson_r(signal , estimator);
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/antoine/dev/python/packages/timeseries/pyaf/pyaf/TS/Perf.py", line 135, in compute_pearson_r
    (r , pval) = pearsonr(signal , estimator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/scipy/stats/_stats_py.py", line 4452, in pearsonr
    normym = linalg.norm(ym)
             ^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/scipy/linalg/_misc.py", line 146, in norm
    a = np.asarray_chkfinite(a)
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/numpy/lib/function_base.py", line 628, in asarray_chkfinite
    raise ValueError(
ValueError: array must not contain infs or NaNs

antoinecarme commented 1 year ago

FIX: in multiplicative models (TSR), clip all model components to a reasonable range (the AR model values exceed 1e+100 at higher horizon values).

This fix has the advantage of having the fewest side effects.
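
A minimal sketch of the idea, not the code committed to PyAF: each component is clipped before the multiplicative combination, so one diverging component cannot push the model (or its squared error) to inf. The names `clip_component`, `multiplicative_model` and the bound `MAX_ABS_VALUE` are hypothetical; the threshold PyAF actually uses is not stated in this issue.

```python
import numpy as np

# Hypothetical bound; the actual threshold used in PyAF is not given in this issue.
MAX_ABS_VALUE = 1e10


def clip_component(values, max_abs=MAX_ABS_VALUE):
    """Clip a model component to the range [-max_abs, max_abs]."""
    return np.clip(np.asarray(values, dtype=float), -max_abs, max_abs)


def multiplicative_model(trend, cycle, ar, max_abs=MAX_ABS_VALUE):
    """Combine clipped components as trend * cycle * AR (TSR-style product)."""
    return (
        clip_component(trend, max_abs)
        * clip_component(cycle, max_abs)
        * clip_component(ar, max_abs)
    )


# Illustrative values only: the AR component diverges at higher horizons.
trend = np.array([3.0, 3.1, 3.2])
cycle = np.array([1.0, 0.9, 1.1])
ar = np.array([1.0, 1e150, 1e300])

model = multiplicative_model(trend, cycle, ar)

# The squared values stay finite, so error measures such as L2 do not overflow.
squared = model ** 2
```

With a bound of this size, squaring the clipped product stays well inside the float64 range, which is the step that overflowed in `Perf.py` in the log above.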

antoinecarme commented 1 year ago

Fix:

image

antoinecarme commented 1 year ago

image

antoinecarme commented 1 year ago

This may not be a perfect solution, but for the moment it is an efficient workaround.

The values are clipped only when the model is computed, so the model value may not match the product of the individual component values in the forecast output pandas dataframe.
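
The caveat can be illustrated with a small sketch, assuming made-up component values and a hypothetical bound `MAX_ABS_VALUE` (neither comes from PyAF): the model is built from clipped components, while the component arrays themselves stay unclipped, so recomputing the product from them gives a different number wherever clipping was applied.

```python
import numpy as np

MAX_ABS_VALUE = 1e10  # hypothetical bound, not PyAF's actual value

# Hypothetical component values as they might appear in a forecast output.
trend = np.array([3.0, 3.1, 3.2])
cycle = np.array([1.0, 0.9, 1.1])
ar = np.array([1.0, 1.2, 1e150])  # the last value has diverged

# Model value computed from clipped components.
clipped_model = (
    np.clip(trend, -MAX_ABS_VALUE, MAX_ABS_VALUE)
    * np.clip(cycle, -MAX_ABS_VALUE, MAX_ABS_VALUE)
    * np.clip(ar, -MAX_ABS_VALUE, MAX_ABS_VALUE)
)

# Product of the unclipped components, as a reader of the output might recompute it.
raw_product = trend * cycle * ar

# True wherever the model no longer equals the product of its components.
mismatch = ~np.isclose(clipped_model, raw_product)
print(mismatch)
```

Only the rows where a component exceeded the bound differ; rows with ordinary values still satisfy model = trend * cycle * AR.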

antoinecarme commented 1 year ago

FIXED.