alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License
1.6k stars, 234 forks

auto_arima with missing values #555

Open elisevansartefact opened 1 year ago

elisevansartefact commented 1 year ago

Is your feature request related to a problem? Please describe.

I have some time series data at weekly granularity, but some weeks have NaN or None values.

The pmdarima auto_arima documentation says that the input time series must not contain any np.nan or np.inf values, so I cannot use auto_arima without first imputing values. This is unlike pmdarima's ARIMA, which accepts input data with NaN or None values. R's auto.arima function also accepts missing values.

Describe the solution you'd like

auto_arima should work with missing values. For example, [1, 3, 4, None, 5, 2, 3, None, 7] should not raise ValueError: Input y contains NaN, and auto_arima should train as normal.

Describe alternatives you've considered

Imputing the missing values, but this means the model trains on values that were never observed and may reproduce this 'synthetic data' pattern in future predictions.
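As a concrete illustration of the concern, here is a small sketch of one common imputation, linear interpolation with NumPy, applied to the example series from this issue. The choice of linear interpolation is an assumption made for illustration. Other imputation methods would insert different, but equally synthetic, values.

```python
import numpy as np

# The example series from this issue, with None written as NaN.
y = np.array([1, 3, 4, np.nan, 5, 2, 3, np.nan, 7], dtype=float)

missing = np.isnan(y)
idx = np.arange(len(y))

# Fill each gap by interpolating linearly between its observed neighbours.
y_imputed = y.copy()
y_imputed[missing] = np.interp(idx[missing], idx[~missing], y[~missing])

# The filled values are 4.5 at index 3 and 5.0 at index 7.
print(y_imputed)
```

After this step auto_arima has no way to tell the two filled points from real observations, so they are weighted the same as measured data during fitting.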

Additional Context

No response