RJT1990 / pyflux

Open source time series library for Python
BSD 3-Clause "New" or "Revised" License

api example #48

Closed zhouhoo closed 7 years ago

zhouhoo commented 7 years ago

Could you give an example of how to use the ARIMAX model's predict function? I cannot find its usage in your docs. I use it like this:

    model = pf.ARIMAX(data=group, formula='passengerCount~1', ar=1, ma=1)
    x = model.fit()
    pre = model.predict(18)
    print(pre)

This gives the error:

    Error evaluating factor: TypeError: 'NoneType' object is not subscriptable
        passengerCount~1

RJT1990 commented 7 years ago

Hi, could you give a printout of what the head of the dataframe looks like? Also, what does x.summary() print?

Then I can identify whether it is a data issue or a bug in the code.

Thanks! Ross

zhouhoo commented 7 years ago

       WIFIAPTag  passengerCount  slice10min
    1  E1-1A-1    8.1             2016-09-10-18-50
    2  E1-1A-1    19.7            2016-09-10-19-0

    ARIMAX(1,0,1)
    ======================================================= ==================================================
    Dependent Variable: passengerCount                      Method: MLE
    Start Date: 2                                           Log Likelihood: -1451.0809
    End Date: 553                                           AIC: 2910.1619
    Number of observations: 552                             BIC: 2927.416

    Latent Variable                          Estimate   Std Error  z        P>|z|    95% C.I.
    ======================================== ========== ========== ======== ======== =========================
    AR(1)                                    0.9198     0.0174     52.8429  0.0      (0.8857 | 0.954)
    MA(1)                                    0.3322     0.0408     8.1514   0.0      (0.2523 | 0.4121)
    Beta 1                                   0.9779     0.285      3.4307   0.0006   (0.4192 | 1.5366)

    Sigma                                    3.3529

thank you for your patience.

RJT1990 commented 7 years ago

Ah, I see the problem now - sorry, I should have seen this the first time.

With ARIMAX you are using exogenous variables: in passengerCount~1, the constant is your only explanatory variable. When you use predict for this type of model you need to pass in an oos_data argument holding out-of-sample values for the exogenous variables.

For your example, you just need to pass in a DataFrame of the same format as group, but containing the out-of-sample data for the exogenous variables.

In your case, you just have a constant, so you can actually just use an ARIMA model instead of an ARIMAX and use predict (where you don't need oos_data).

So to summarize:

  1. Models like ARIMAX rely on exogenous variable inputs, so predict() requires an oos_data argument.
  2. In your example, where you don't strictly have an exogenous variable, you can just use an ARIMA model where predict() does not require an oos_data argument.
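
To see why the two models coincide for this formula, here is a sketch in generic textbook notation. It assumes pyflux adds the regression term linearly to the ARMA recursion, which this thread does not confirm. An ARMAX(1,1) with regressor vector $x_t$ can be written as

$$
y_t = \phi\, y_{t-1} + \theta\, \epsilon_{t-1} + x_t^{\top}\beta + \epsilon_t .
$$

With the formula passengerCount~1 the only regressor is the constant, $x_t = 1$ for every $t$, so the model reduces to

$$
y_t = \beta_1 + \phi\, y_{t-1} + \theta\, \epsilon_{t-1} + \epsilon_t ,
$$

which is an ARMA(1,1) with a constant. Under that reading, the "Beta 1" row of the summary above plays the role of the constant, and the future values of $x_t$ are known in advance (always 1), which is why no real out-of-sample data is needed in this case.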

See the examples on pyflux.com which should clarify the difference.

Thanks and let me know if this helped you!

zhouhoo commented 7 years ago

Thanks for your advice, you helped me out. To be honest, I don't know what "exogenous variables" means; I'm new to this. Could you give me an example DataFrame to use as oos_data for my data?

RJT1990 commented 7 years ago

Hi zhouhoo - sorry for the late response. I think it's best if you google what exogenous variables are. For the library itself, you just need to make sure that the dataframe you give for oos_data has the same column names as the one you fitted the model on. The values you put in for the exogenous variables will reflect your assumptions about their future values. Thanks! Ross
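
As a minimal sketch of the advice above, the following builds an out-of-sample DataFrame with the same column names as the frame posted earlier in this thread. The helper `make_oos_frame` is hypothetical (it is not part of pyflux), and `group` here is a two-row stand-in for the real data. It is an assumption that pyflux evaluates the formula against oos_data, so every column the formula names, including passengerCount, has to be present. For passengerCount~1 the only regressor is the constant, so the placeholder values should not affect the forecast.

```python
import pandas as pd

# Stand-in for `group`: the two rows posted earlier in this thread.
group = pd.DataFrame({
    "WIFIAPTag": ["E1-1A-1", "E1-1A-1"],
    "passengerCount": [8.1, 19.7],
    "slice10min": ["2016-09-10-18-50", "2016-09-10-19-0"],
})


def make_oos_frame(history, steps, future_values=None):
    """Return `steps` rows with the same columns as `history`.

    Each column is filled with its last observed value unless
    `future_values` maps that column to a list of `steps` assumed values.
    Hypothetical helper for illustration; not part of pyflux.
    """
    future_values = future_values or {}
    unknown = set(future_values) - set(history.columns)
    if unknown:
        raise KeyError(f"columns not in history: {sorted(unknown)}")
    last_row = history.iloc[-1]
    oos = pd.DataFrame({col: [last_row[col]] * steps for col in history.columns})
    for col, values in future_values.items():
        oos[col] = list(values)  # pandas raises if the length is not `steps`
    return oos


oos_data = make_oos_frame(group, steps=18)
print(oos_data.head())

# With pyflux installed and `model` fitted as in the first post, the forecast
# call would then look like this (assumption: the horizon parameter is named h):
# pre = model.predict(h=18, oos_data=oos_data)
```

If the model had real exogenous columns, their assumed future values would go in through `future_values`, for example `make_oos_frame(group, 18, {"some_column": [...]})`, where `some_column` is a placeholder name.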