fernandalschumacher / ARCensReg

Fitting Univariate Censored Linear Regression Model with Autoregressive Errors

how to deal with seasonality #1

Open ghost opened 1 year ago

ghost commented 1 year ago

Hello,

first of all thank you for the R package!

I tried ARCensReg() on the phosphorus data (up to observation 119). Before fitting, I checked for seasonal patterns with acf() and found that there is indeed seasonality at lag 6.

The ARCensReg model can be estimated, but it fails to converge (default settings were used).

I am now wondering whether there is anything wrong with using the function only for data augmentation / imputation (a low order p should then be enough to keep convergence) and afterwards fitting a separate time series model that incorporates seasonality (seasonal ARIMA).

Thank you in advance.

fernandalschumacher commented 1 year ago

Dear user,

Thank you for your question.

I would say it is somewhat tricky to use the acf() function directly when dealing with censored data, given that it cannot properly account for the partially observed values. Ideally, we would like to evaluate residuals instead, which is again nontrivial for censored/missing data since we do not have fully observed data. One option would be to fit an independent censored regression model and use quantile residuals to compute the acf, but I am not sure how well this kind of residual would reflect unmodeled serial correlation or seasonality.

As you noted, it is possible to estimate an AR(6) instead, but that implies estimating several additional parameters, since the current implementation does not allow setting individual coefficients to zero (which could be an interesting modification for the future). You could use imputed values from ARCensReg to investigate seasonality, but using the imputed values in a noncensored model (such as the arima function) would not correctly account for our uncertainty regarding the imputed values. Also, the imputation is a conditional expectation, which would itself be affected by misspecified seasonality.

In summary, I would not recommend using imputed values along with noncensored models for decision-making.
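To make the uncertainty point concrete, here is a minimal sketch (plain Python, standard library only, not ARCensReg itself). It assumes i.i.d. standard normal data with known parameters and a hypothetical upper censoring limit of 0.5, so it ignores regression effects and serial correlation entirely. Censored values are replaced by the conditional expectation E[Y | Y > limit], and the variance of the imputed series is compared with that of the fully observed one.

```python
import random
from statistics import NormalDist, pvariance

# Assumptions for illustration only: i.i.d. N(0, 1) data, known parameters,
# and a hypothetical upper censoring limit of 0.5.
std_normal = NormalDist(0.0, 1.0)
rng = random.Random(7)
limit = 0.5

true_values = [rng.gauss(0.0, 1.0) for _ in range(20000)]
is_censored = [v > limit for v in true_values]

# Conditional expectation of a standard normal given that it exceeds the limit:
# E[Y | Y > c] = pdf(c) / (1 - cdf(c))
cond_mean = std_normal.pdf(limit) / (1.0 - std_normal.cdf(limit))

# Single imputation: every censored value gets the same conditional mean.
imputed = [cond_mean if c else v for v, c in zip(true_values, is_censored)]

var_true = pvariance(true_values)
var_imputed = pvariance(imputed)

print(f"share censored        : {sum(is_censored) / len(is_censored):.3f}")
print(f"imputed value         : {cond_mean:.3f}")
print(f"variance, true data   : {var_true:.3f}")
print(f"variance, imputed data: {var_imputed:.3f}")
```

The imputed series has a smaller variance because all variability among the censored observations is collapsed into a single value. A noncensored model fitted to such a series treats these values as exact observations, so its standard errors would tend to be too optimistic.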

ghost commented 11 months ago

Thank you very much for your detailed answer!

I would add some further notes / thoughts:

(1) If my data is censored upwards, the "real" value would have been higher than the measured one. Such censoring is more likely to weaken autocorrelations than to create spurious ones. If seasonality can be seen at lags that are multiples of 6 in the acf() of the censored time series, the "real" acf() may show higher correlations at the seasonal lags, but not at other lags.

(2) You are absolutely right about the misspecification problem. Ideally, everything is modelled in one estimation procedure. By raising the AR order, ARCensReg() can capture seasonality, but at the risk of non-convergence and of using more parameters than necessary, which conflicts with the objective of parsimonious modelling. So +1 for an ARCensReg estimation with coefficient restrictions!
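A small simulation can be used to check the intuition in (1). The sketch below is plain Python (standard library only) and makes strong simplifying assumptions: a purely seasonal process y_t = 0.7 * y_{t-6} + e_t with Gaussian noise, no covariates, and upper censoring at the sample median with the limit recorded in place of the true value. The function names are made up for this example, and it does not use the phosphorus data.

```python
import random

def sample_acf(series, lag):
    """Sample autocorrelation at one lag (lag-k covariance over lag-0 covariance)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return numer / denom

def simulate_seasonal_ar(n, phi, period, rng, burn_in=500):
    """Simulate y_t = phi * y_{t-period} + e_t with standard normal noise."""
    y = [0.0] * (n + burn_in)
    for t in range(period, n + burn_in):
        y[t] = phi * y[t - period] + rng.gauss(0.0, 1.0)
    return y[burn_in:]  # drop the burn-in so the start values do not matter

rng = random.Random(1)
full = simulate_seasonal_ar(n=20000, phi=0.7, period=6, rng=rng)

# Upper censoring: anything above the limit is recorded as the limit itself.
limit = sorted(full)[len(full) // 2]  # sample median, i.e. about 50% censored
censored = [min(v, limit) for v in full]

lags = (1, 6, 12)
acf_full = {lag: sample_acf(full, lag) for lag in lags}
acf_cens = {lag: sample_acf(censored, lag) for lag in lags}

for lag in lags:
    print(f"lag {lag:2d}: full = {acf_full[lag]: .3f}, censored = {acf_cens[lag]: .3f}")
```

In this setup the seasonal autocorrelation of the censored series is weaker than that of the full series, while the non-seasonal lag stays close to zero in both. This only covers a Gaussian process without covariates; it says nothing about how a regression component or a time-varying censoring limit would change the picture.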
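To spell out what such a restriction would look like: assuming the model has the usual form of a linear regression with AR(p) errors (the notation here is my own and may differ from the package documentation), the unrestricted AR(6) version needs six autoregressive coefficients, while a purely seasonal version would fix five of them at zero.

```latex
% Assumed notation: y_t response, x_t covariates, \xi_t regression error,
% \eta_t Gaussian innovation.
\begin{align*}
  y_t   &= x_t^\top \beta + \xi_t, \\
  \xi_t &= \phi_1 \xi_{t-1} + \phi_2 \xi_{t-2} + \dots + \phi_6 \xi_{t-6} + \eta_t,
          \qquad \eta_t \sim N(0, \sigma^2)
          && \text{(unrestricted AR(6))} \\
  \xi_t &= \phi_6 \xi_{t-6} + \eta_t
          && \text{(restricted: } \phi_1 = \dots = \phi_5 = 0 \text{)}
\end{align*}
```

The restricted form estimates a single autoregressive parameter instead of six, which is the parsimony argument above.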

Nevertheless, my idea was to utilize your amazing package with the assumption that the imputed values of a misspecified ARCensReg model are still better / more plausible than the censored ones.