robjhyndman / forecast

Forecasting Functions for Time Series and Linear Models

Even when using the same parameters (such as the values of p, d, q), the results can be drastically different in R and Python. #957

Closed kestlermai closed 2 months ago

kestlermai commented 2 months ago

R 4.2.1; forecast 8.22.0:

arima <- arima(train_data, order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
summary(arima)

Series: train_data
ARIMA(0,1,1)(0,1,1)[12]

Coefficients:
         ma1    sma1
      -0.193  -0.791
s.e.   0.091   0.084

sigma^2 = 181:  log likelihood = 37.83
AIC=-69.66   AICc=-69.45   BIC=-61.32

Python 3.11; statsmodels 0.14.1:

from statsmodels.tsa.statespace.sarimax import SARIMAX
model = SARIMAX(train_data['incidence'], order=(0,1,1), seasonal_order=(0,1,1,12))
result = model.fit()
print(result.summary())

SARIMAX Results
Dep. Variable:                       incidence   No. Observations:      132
Model:          SARIMAX(0, 1, 1)x(0, 1, 1, 12)   Log Likelihood     -99.484
Date:                         Mon, 29 Apr 2024   AIC                204.969
Time:                                 23:46:06   BIC                213.306
Sample:                                      0   HQIC               208.354

                coef   std err         z    P>|z|    [0.025    0.975]
ma.L1        -0.6900     0.048   -14.322    0.000    -0.784    -0.596
ma.S.L12     -0.8250     0.102    -8.081    0.000    -1.025    -0.625
sigma2        0.2766     0.019    14.838    0.000     0.240     0.313

Ljung-Box (L1) (Q):          0.73   Jarque-Bera (JB):   438.41
Prob(Q):                     0.39   Prob(JB):             0.00
Heteroskedasticity (H):      1.21   Skew:                -0.82
Prob(H) (two-sided):         0.56   Kurtosis:            12.26

Using the same orders in the two packages gives drastically different fits. For example, R reports log likelihood = 37.83 and AIC = -69.66, while Python reports Log Likelihood = -99.484 and AIC = 204.969.
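As a quick sanity check on the numbers above, the following sketch recomputes AIC and BIC from each reported log likelihood using the standard definitions AIC = 2k - 2*loglik and BIC = k*ln(n) - 2*loglik. It assumes k = 3 estimated parameters (ma1, sma1, sigma^2) and an effective sample size of 132 - 1 - 12 = 119 after differencing; the R output does not state its sample size, so using 119 for R is an assumption. The function names are illustrative only.

```python
import math

def aic(loglik, k):
    """Akaike information criterion from a log likelihood and parameter count."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion; n is the effective sample size."""
    return k * math.log(n) - 2 * loglik

K = 3                  # assumed: ma1, sma1 and sigma^2
N_EFF = 132 - 1 - 12   # assumed: observations left after d=1 and D=1 (period 12)

r_loglik = 37.83       # reported by R
py_loglik = -99.484    # reported by statsmodels

print("R      AIC:", round(aic(r_loglik, K), 2), " (reported -69.66)")
print("R      BIC:", round(bic(r_loglik, K, N_EFF), 2), " (reported -61.32)")
print("Python AIC:", round(aic(py_loglik, K), 2), " (reported 204.969)")
print("Python BIC:", round(bic(py_loglik, K, N_EFF), 2), " (reported 213.306)")
```

Under these assumptions each package's information criteria agree with its own log likelihood to within rounding, which suggests the gap comes from the log likelihood values themselves rather than from how AIC or BIC is derived from them.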

Can you help me?

robjhyndman commented 2 months ago
  1. I don't know what objective function is used by statsmodels. But even if the docs say it is maximum likelihood, there are many variations. R is using a state space representation with a diffuse prior as explained in the documentation for stats::arima(): Other objective functions may yield different results. See
  2. Whatever objective function is used, it will contain local optima and there is no guarantee that the software finds the global optimum. See
  3. The AIC/BIC/etc depends on the likelihood, so different likelihood functions lead to different information criteria. Even with the same likelihood function, some software implementations omit the constant in the calculation. See
  4. The best Python implementation of ARIMA models that I know of is provided by StatsForecast:
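
Point 3 above can be illustrated with a deliberately simplified sketch. It uses a concentrated Gaussian log likelihood for independent residuals, not the exact state space likelihood of either package, and the residual sum of squares is a hypothetical value chosen only for illustration. It shows that dropping the constant term -(n/2)*ln(2*pi) shifts AIC by n*ln(2*pi) without changing which model is preferred.

```python
import math

def gaussian_loglik(sse, n, include_constant=True):
    """Concentrated Gaussian log likelihood for n residuals with sum of squares sse.

    Simplified i.i.d. form, used only to show the effect of the constant term.
    """
    sigma2 = sse / n
    ll = -0.5 * n * (math.log(sigma2) + 1.0)
    if include_constant:
        ll -= 0.5 * n * math.log(2 * math.pi)
    return ll

def aic_from_loglik(loglik, k):
    return 2 * k - 2 * loglik

n = 119      # assumed effective sample size (132 observations minus 13 lost to differencing)
k = 3        # assumed number of estimated parameters
sse = 30.0   # hypothetical residual sum of squares

aic_full = aic_from_loglik(gaussian_loglik(sse, n, include_constant=True), k)
aic_no_const = aic_from_loglik(gaussian_loglik(sse, n, include_constant=False), k)

print("AIC with constant:   ", round(aic_full, 3))
print("AIC without constant:", round(aic_no_const, 3))
print("Shift n*ln(2*pi):    ", round(n * math.log(2 * math.pi), 3))
```

Because the shift depends only on n, information criteria are comparable within one implementation but not necessarily across implementations.
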
kestlermai commented 2 months ago

Thank you very much for your reply. When I tried to use StatsForecast to build an ARIMA model, the results still differed significantly from those obtained in R. With the same parameters {order=(0, 1, 1), season_length=12, seasonal_order=(0,1,1)}, the MAPE is 4.922 in R and 14.463 in Python. Could this be attributed to different algorithms in the two packages? Anyway, thank you very much for your help.

robjhyndman commented 2 months ago

A MAPE difference that large suggests something's gone wrong in the Python model.
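
One way to start narrowing this down is to score both sets of forecasts with a single shared function on the same holdout values, so that differences in how MAPE is computed are ruled out. The sketch below is a minimal example; the function name and the numbers are hypothetical and are not taken from the thread.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Hypothetical holdout values and forecasts, for illustration only.
holdout = [100.0, 200.0, 400.0]
forecast_a = [110.0, 180.0, 400.0]

print(round(mape(holdout, forecast_a), 3))
```

Exporting the R and Python forecasts for the same test period and passing each through one function like this makes the two MAPE figures directly comparable.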