How to make `predict_in_sample()` derived from the first `d` term of original data when differencing order (`d`) isn't zero

Describe the question you have

I think the the usage of predict_in_sample() in pmdarima(python pkg) is same as fitted() in forecast(R pkg).

But I find out that the output of predict_in_sample() in pmdarima(python pkg) is different with output of fitted() in forecast(R pkg) when difference order isn't zero.

I use the following python codes to generate the output of predict_in_sample() in three different differencing order (d) :

import numpy as np
from pmdarima.arima import ARIMA, auto_arima

for diff_ord in range(1,4):
    model = ARIMA(order=(2,diff_ord,1), out_of_sample_size=0, mle_regression=True, suppress_warnings=True)

    ori_time_series = np.array([0.49958017, 0.15162735, 0.86757565, 0.3093554, 0.20545085, -0.48288408, 0.6880291,
                                0.8461229, 0.8320223, -0.7372907, 0.6048833, 0.40874475, 0.57708055, 0.27590698,
                                -0.21213382, 0.4236031, 0.3324298, -0.076647766, -0.20372462, 0.93162024, 0.5740154])

    model = model.fit(ori_time_series)
    pred_in_sample = np.array(model.predict_in_sample())
    print(f"pred_in_sample: {list(pred_in_sample)}")

I copy the output of above python codes, then paste to a R script to compare the difference between predict_in_sample() in pmdarima and fitted() in forecast(R pkg), R script:

par(mfrow=c(3,2))

resid_py_arr = rbind(c(0.5074603452679165, -0.34007254404457005, 0.5791635465131197, -0.2603422455105438, -0.11715725264909946, -0.9906800274596668, 0.21001659120388472, 0.3475615524863871, 0.7611416545018854, -0.8381194770281715, 0.2623376398458589, -0.23206647709890738, 0.4220053862092831, 0.06219916252952257, -0.4222181747488597, 0.038538708232362495, -0.08313995790478867, -0.26009526647620496, -0.48253573807639416, 0.5137650759426555, 0.34360142378578173),
                     c(0.49261470998182477, -0.6081910499091976, 1.0569346650372444, -0.6076380990855064, -0.2378976168316683, -0.8352597431282236, 1.0357942689078028, 0.631678563908118, 0.2767179073401864, -1.6808983058879592, 0.5653496038522947, 0.09552482041758414, 0.3659123662710821, -0.3491598976266268, -0.6745863302200333, 0.27064401578860775, 0.05909012652824919, -0.3734973292295873, -0.4340483262207756, 0.9099684919779161, 0.09388976669825833),
                     c(0.4940135127419584, -0.8586664744669461, 1.3118624972776918, -2.3436128660405227, -0.11312237780455078, -0.2567555216685367, 1.8351322442185336, 0.16659999993838948, -0.7634839186197411, -2.006759899752263, 1.8390440448281202, 0.3681746393692201, -0.30073609275686164, -0.6107051995729558, -0.5143875708651311, 0.8646807292504737, 0.03706843182642394, -0.7386026479980223, -0.1930232885395262, 1.3558665622526413, -0.579683995736004))

fitted_val_py_arr = rbind(c(-0.007880175267916512, 0.49169989404457004, 0.28841210348688023, 0.5696976455105438, 0.32260810264909945, 0.5077959474596667, 0.47801250879611523, 0.4985613475136129, 0.07088064549811457, 0.1008287770281715, 0.3425456601541411, 0.6408112270989074, 0.15507516379071695, 0.21370781747047746, 0.21008435474885973, 0.3850643917676375, 0.4155697579047887, 0.18344750047620495, 0.27881111807639414, 0.4178551640573445, 0.23041397621421822),
                          c(0.006965460018175224, 0.7598183999091976, -0.18935901503724428, 0.9169934990855064, 0.4433484668316683, 0.35237566312822355, -0.34776516890780296, 0.21444433609188196, 0.5553043926598136, 0.9436076058879592, 0.0395336961477053, 0.31321992958241585, 0.21116818372891794, 0.6250668776266268, 0.4624525102200333, 0.15295908421139226, 0.2733396734717508, 0.2968495632295873, 0.23032370622077558, 0.02165174802208386, 0.48012563330174163),
                          c(0.005566657258041537, 1.0102938244669462, -0.44428684727769174, 2.6529682660405225, 0.3185732278045508, -0.22612855833146328, -1.1471031442185335, 0.6795229000616105, 1.595506218619741, 1.2694691997522631, -1.2341607448281202, 0.040570110630779865, 0.8778166427568617, 0.8866121795729558, 0.30225375086513107, -0.44107762925047367, 0.29536136817357606, 0.6619548819980223, -0.010701331460473806, -0.4242463222526414, 1.153699395736004))

y<-c(0.49958017, 0.15162735, 0.86757565, 0.3093554, 0.20545085, -0.48288408, 0.6880291,
     0.8461229, 0.8320223, -0.7372907, 0.6048833, 0.40874475, 0.57708055, 0.27590698,
     -0.21213382, 0.4236031, 0.3324298, -0.076647766, -0.20372462, 0.93162024, 0.5740154)  # The statistical part of the question is understanding that the in-sample one-step-ahead forecasts of an ARIMA model are actually the fitted values of that model. In R, the method fitted applied on model output object normally returns the fitted values of the model. However, the method is not applicable to the output of function arima. There is a workaround: fitted values equal original values minus residuals. Residuals can be extracted from a fitted object using the method residuals (and that applies to the output of function arima).

for (dif_ord in seq(1:3)) {
  #  Better still, use the forecast package which does have a fitted method for outputs from Arima and auto.arima. – Rob Hyndman Feb 26, 2016 at 9:49
  #install.packages('forecast')
  library(forecast)
  fit.model.2 <- Arima(y, order = c(2, dif_ord, 1))

  resid_r_forecast_arima <- residuals(fit.model.2)
  resid_py <- resid_py_arr[dif_ord,]

  plot.ts(y, xaxp = c(0, 21, 21), ylim = c(-2,2))
  axis(2, at = seq(-1.5, 1.5, 0.5), tck = 1, lty = 2, col = "grey", labels = NA)  # Add horizontal grid 
  axis(1, at = 1:21, tck = 1, lty = 2, col = "grey", labels = NA)  # Add vertical grid
  lines(resid_r_forecast_arima, col=2, lty=2)
  lines(resid_py, col=3, lty=3)
  legend(13, -1, c("origin", "resid_r", "resid_py"), col=1:3, lty=1:3, cex=1, ncol=3, y.intersp=0, x.intersp=0, text.width=0.9)
  mtext(paste("Check residual series trend for diff order:", dif_ord))

  fitted_val_r_forecast_arima <- fitted(fit.model.2)
  fitted_val_py <- fitted_val_py_arr[dif_ord,]
  plot.ts(y, xaxp = c(0, 21, 21), ylim = c(-2,2))
  axis(2, at = seq(-1.5, 1.5, 0.5), tck = 1, lty = 2, col = "grey", labels = NA)  # Add horizontal grid 
  axis(1, at = 1:21, tck = 1, lty = 2, col = "grey", labels = NA)  # Add vertical grid
  lines(fitted_val_r_forecast_arima, col=2, lty=2)
  lines(fitted_val_py, col=3, lty=3)
  legend(13, -1, c("origin", "fitted_r", "fitted_py"), col=1:3, lty=1:3, cex=1, ncol=3, y.intersp=0, x.intersp=0, text.width=1)
  mtext(paste("Check fitted series trend for diff order:", dif_ord))
}

output image of R script:

I observe that: for the first d(difference order) term has huge difference, then they getting close after d(difference order) term.
And I think the output of forecast(R package) is correct., based on:
- the response in https://github.com/alkaline-ml/pmdarima/issues/140#issuecomment-926592471
  
  This produces some junky values for the first d indices when predicting in-sample.
- and observe first d term of original time-series(black line) and output of fitted() (red line)

My questions are:

Do I have any misunderstanding on above description? If no,
How should I adjust the parameters of predict_in_sample() in pmdarima to get the same output with fitted() in forecast?
- I have tried the the parameter:start, but it only shorten the length of output of predict_in_sample().

Versions (if necessary)

No response

alkaline-ml / pmdarima

How to make `predict_in_sample()` derived from the first `d` term of original data when differencing order (`d`) isn't zero #533

Describe the question you have

Versions (if necessary)