alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License

Support `predict_in_sample` with non-integer index #499

Closed hawksung1 closed 2 years ago

hawksung1 commented 2 years ago

Describe the bug

I fit the model on data with a datetime index, but the auto_arima model indexes its results by integer position (0, 1, 2, 3, ...) rather than by date (20220101, 20220102, ...).

As a result, I cannot predict using datetime values.

To Reproduce


if __name__ == '__main__':
    from datetime import date, timedelta

    import pandas as pd
    from pmdarima.arima import auto_arima
    from statsmodels.tsa.arima.model import ARIMA

    # Build 30 daily observations starting on 2022-01-01
    data_len = 30
    start_date = date(2022, 1, 1)
    time_column = [start_date + timedelta(days=i) for i in range(data_len)]
    value_column = range(data_len)

    dataframe = pd.DataFrame(value_column, columns=['val'], index=time_column)

    # pmdarima auto_arima
    model = auto_arima(y=dataframe).fit(dataframe)
    model.predict_in_sample(start=4, end=8)  # success
    model.predict_in_sample(start="20220103", end="20220106")  # error

    # statsmodels ARIMA
    model = ARIMA(endog=dataframe, order=(0, 1, 0)).fit()
    model.predict(start=4, end=8)  # success
    model.predict(start="20220103", end="20220106")  # success

Versions

pmdarima == 1.8.5
statsmodels == 0.13.2

Expected Behavior

Prediction with date strings should succeed, as it does with the statsmodels ARIMA model.

Actual Behavior

An error is raised when calling `predict_in_sample` with date strings.

Additional Context

If I am using pmdarima's auto_arima incorrectly, please let me know.

tgsmith61591 commented 2 years ago

We can pick this up as a feature request for our next release cycle.

tgsmith61591 commented 2 years ago

@hawksung1 this was merged in #500 and will be present in the 2.0.0 release. Alternatively, you can build from master and have access to the feature immediately.
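
For anyone still on a release before 2.0.0, one possible workaround is to translate the date strings into integer positions first, since the report above shows the integer form of `predict_in_sample` working. The following is only a minimal sketch: `date_to_position` is a hypothetical helper (not part of pmdarima), and it assumes the training index is a sequence of `datetime.date` values and that the strings use the `YYYYMMDD` format from the report.

```python
from datetime import date, datetime, timedelta


def date_to_position(index, label, fmt="%Y%m%d"):
    """Return the integer position of a date string within a sequence of dates.

    Hypothetical helper for illustration; `index` is assumed to hold the
    datetime.date values the model was fit on.
    """
    target = datetime.strptime(label, fmt).date()
    try:
        return list(index).index(target)
    except ValueError:
        raise KeyError(f"{label} is not in the training index") from None


# Same 30-day index as in the report above.
time_column = [date(2022, 1, 1) + timedelta(days=i) for i in range(30)]

start = date_to_position(time_column, "20220103")
end = date_to_position(time_column, "20220106")

# The positions can then be passed to the integer form that already works:
# model.predict_in_sample(start=start, end=end)
```

The lookup is a linear scan, which should be fine for short series; for long series, a dictionary from date to position built once would likely be a better fit.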