tidyverts / feasts

Feature Extraction And Statistics for Time Series
https://feasts.tidyverts.org/
Misleading names/documentation for SEATS() and X11() #66

Closed (AQLT closed this issue 3 years ago)

AQLT commented 5 years ago

Hi, I think that the names of the functions SEATS() and X11() (or the associated documentation) can be misleading since, if I'm not mistaken, in both functions the seasonal decomposition is done with X-13ARIMA-SEATS. In both cases a pre-adjustment step is performed to adjust for trading-day effects, some outliers, etc., and a forecast of this pre-adjusted series is produced. It is then this pre-adjusted series that is decomposed with SEATS or X-11.
Therefore, X11() does not perform a seasonal adjustment with X-11 but with X-13ARIMA-SEATS.

Maybe I should ask this other question in another issue but I was also wondering why the decomposition mode is forced with X11() (to additive) and not with SEATS()? By default X-13ARIMA-SEATS performs an automatic selection of the decomposition mode and the most common decomposition mode for economic time series is multiplicative.

mitchelloharawild commented 5 years ago

I'm no expert on X-11 nor X-13ARIMA-SEATS (however I did read through the X-13ARIMA-SEATS manual), so any expert knowledge with these decompositions would be helpful.

Does setting x11="" do something different than the X-11 decomposition? Does this refer to only the seasonal part of the decomposition?

As for X11 being additive, I was under the impression that X11 only supported additive decompositions.

AQLT commented 5 years ago

X-11 is often explained with an additive decomposition, but it also supports at least multiplicative, log-additive and pseudo-additive (y = t * (s + i - 1)) decompositions. seasonal supports at least multiplicative and additive; I don't know about the other decompositions. You can see with this example that the decomposition can be multiplicative: out(seas(AirPassengers, x11 = ""))

***** AICC (with aicdiff=-2.00) prefers log transformation *****
***** Multiplicative seasonal adjustment will be performed. ****

It seems to me that SEATS only supports additive and log-additive decomposition.

x11="" sets X-11 as the decomposition method, but a pre-adjustment step is still done before the decomposition and the results are put in the A-tables (whereas in the "original" X-11 they were computed during the decomposition). So this is different from doing only an X-11 decomposition, especially because it is not the original series that is decomposed but the pre-adjusted series.
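
To make the decomposition modes mentioned above concrete, here is a minimal base-R sketch of how the components recombine in each mode. The component values are invented purely for illustration and are not output from X-11 or SEATS. Note that the multiplicative and log-additive modes share the same recomposition identity; they differ in how the components are estimated.

```r
# Toy components (hypothetical values, chosen only for illustration).
trend <- c(100, 102, 104, 106)

# Additive mode: seasonal and irregular are in the units of the series.
s_add <- c(10, -5, -10, 5)
i_add <- c(1, -1, 0, 2)
y_additive <- trend + s_add + i_add

# Multiplicative-type modes: seasonal and irregular are factors around 1.
s_mult <- c(1.10, 0.95, 0.90, 1.05)
i_mult <- c(1.01, 0.99, 1.00, 1.02)
y_multiplicative  <- trend * s_mult * i_mult
y_log_additive    <- exp(log(trend) + log(s_mult) + log(i_mult))
y_pseudo_additive <- trend * (s_mult + i_mult - 1)   # y = t * (s + i - 1)

# Seasonal adjustment removes the seasonal component in the matching way.
sa_additive       <- y_additive - s_add
sa_multiplicative <- y_multiplicative / s_mult

print(cbind(y_additive, y_multiplicative, y_log_additive, y_pseudo_additive))
```

Forcing the additive mode on a series whose seasonal swings grow with its level would leave part of that pattern in the adjusted series, which is why the mode matters.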

mitchelloharawild commented 5 years ago

So if X11() were to disable the pre-adjustment steps, then it would be performing the X-11 decomposition as you would expect it? What is the series being pre-adjusted for? Trading days and holiday effects?

The more I'm looking at this the more I think we should be treating X-11 and X-13ARIMA-SEATS as models with a components() method to extract the decomposition. Do you have any thoughts on this @robjhyndman ?

AQLT commented 5 years ago

With the default options, the series is pre-adjusted for trading days, outliers and gradual Easter effects, and a one-year forecast is produced to reduce the revisions. Holidays are not adjusted by default because you have to specify your own calendar. You can also add your own regressors. Because seasonal adjustment methods are highly sensitive to the presence of outliers, the goal of this step is to ensure a reliable estimate of the seasonal (and calendar) component.

In X-11 there was an optional pre-adjustment step (step A), but this was done with multiplicative coefficients. If you disable the pre-adjustment step you will indeed get something close to the X-11 decomposition. However, this is not recommended: X-11 has an automatic algorithm to correct the irregular (for additive outliers, AO), but the Reg-ARIMA model of X-13 performs better and also corrects for other kinds of outliers (level shifts, transitory changes and, optionally, ramp effects and seasonal outliers). Note that this is also different from what was done in X-11ARIMA.

Neither method is better than the other: TRAMO-SEATS (not available in seasonal; TRAMO is similar to the pre-adjustment step of X-13ARIMA-SEATS) and X-13ARIMA-SEATS with the SEATS or X-11 decomposition are all recommended for seasonal adjustment. That's why I think that SEATS and X-11 should be treated equally and the pre-adjustment step should be kept for both methods.

mitchelloharawild commented 5 years ago

So you're suggesting that X-11 and X-13ARIMA-SEATS be combined into the same function? How do you feel about the importance of the model/forecasts from this method?

AQLT commented 5 years ago

Yes, I suggest either having one function to perform the seasonal adjustment with X-13ARIMA-SEATS (possibly with a parameter to choose the decomposition method, SEATS or X-11), or having two functions, as now, but with the same default parameters (no default method for the decomposition) and similar documentation (saying that in both cases X-13ARIMA-SEATS is used). For this method, adding the Reg-ARIMA model was a huge improvement for adjusting for calendar effects and outliers. The model is therefore important, even with an automatic specification that can be improved (using a user-defined calendar, controlling outliers...). The forecast is also important: to get a stable estimate of the seasonally adjusted value for a given month you need at least 6 years of data (3 years before and 3 years after), so forecasting helps to reduce the revisions in the last observations.
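
The point about forecasts can be illustrated with a small base-R sketch. It is only an analogy: it uses a centred 2x12 moving average (a trend filter, not the X-11 seasonal filters) and a seasonal-naive extension as a stand-in for the regARIMA forecast, both of which are assumptions made for the example.

```r
y <- AirPassengers                       # monthly series shipped with base R
w <- c(0.5, rep(1, 11), 0.5) / 12        # weights of a centred 2x12 moving average

# Without an extension, the symmetric filter is undefined for the last 6 months.
ma_plain <- stats::filter(y, w, sides = 2)
n_missing_plain <- sum(is.na(tail(ma_plain, 12)))

# Stand-in forecast (assumption): repeat the last observed year.
y_ext <- ts(c(y, tail(as.numeric(y), 12)), start = start(y), frequency = 12)

# With the extended series, the same symmetric filter reaches the last observation.
ma_ext <- window(stats::filter(y_ext, w, sides = 2), end = end(y))
n_missing_ext <- sum(is.na(tail(ma_ext, 12)))

print(c(without_forecast = n_missing_plain, with_forecast = n_missing_ext))
```

The end-of-series estimates still depend on the forecasts and are revised when real data arrive, but a good forecast keeps those revisions smaller than falling back on asymmetric filters.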

mitchelloharawild commented 5 years ago

Great info, thanks. I don't think it's a good idea to have X11() actually use X-13ARIMA-SEATS, but rather X11() should represent the older (and less performant) X-11 method. Do you think it is possible to obtain a model similar to X-11 by constraining the arguments in seasonal::seas() to not support Reg-ARIMA and other improvements?

Also, would you consider X-13ARIMA-SEATS to be more of a decomposition method, or a model? Currently the implementation just obtains the decomposition and cannot be used to produce forecasts or look at model outputs. Treating it as a model would allow for this, but it would have to be held back until the second release.

AQLT commented 5 years ago

I've checked the X-11 algorithm again and you cannot reproduce exactly the same "old X-11" with X-13ARIMA: several steps that adjusted the data for trading days during the decomposition are now disabled in X-13ARIMA (because this is done before the decomposition). So you can constrain the arguments in seas() to perform only the decomposition, without the trading-day adjustment (so not exactly the X-11 algorithm). However, I don't understand why you want to reproduce the X-11 algorithm. Why not perform X-13ARIMA with the X-11 decomposition instead?
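
As a rough sketch of what constraining seas() could look like, the call below switches off the automatic trading-day/Easter tests and the automatic outlier detection. This is an assumption about which options matter, not a recipe for the original X-11: an ARIMA model is still fitted and used for forecasting. The code needs the seasonal package and the X-13 binary it relies on, and is skipped otherwise.

```r
seasonal_available <- requireNamespace("seasonal", quietly = TRUE)

if (seasonal_available) {
  # Default behaviour: automatic regARIMA pre-adjustment, then X-11 decomposition.
  m_default <- seasonal::seas(AirPassengers, x11 = "")

  # Assumed "constrained" call: no automatic trading-day/Easter regressors
  # and no automatic outlier detection before the X-11 decomposition.
  m_plain <- seasonal::seas(AirPassengers, x11 = "",
                            regression.aictest = NULL, outlier = NULL)

  # Compare the two seasonally adjusted series.
  print(summary(seasonal::final(m_default) - seasonal::final(m_plain)))
}
```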

X-13ARIMA-SEATS is a decomposition method rather than a model, and its goal is to estimate the seasonal component (so to extract a trend, other decomposition methods should be preferred). However, seasonal adjustment users are sometimes interested in adding other time series to the plots, the most common being the linearized series (= the pre-adjusted series), the trend and the forecasts. It was, for example, with the goal of adding this information that I created ggdemetra (but it is not based on seasonal; I don't know how to extract all the information with that package).

mitchelloharawild commented 3 years ago

I've now added X_13ARIMA_SEATS() (and learnt a lot more about the procedure since). The new function is a more direct wrapper around the U.S. Census interface, but still uses the {seasonal} package.

Thanks again for bringing these issues to my attention. I would appreciate if you could look over the new version of this function when convenient. Essentially the model formula now accepts the spec (or spc in {seasonal}) somewhat directly. For example, X_13ARIMA_SEATS(y~x11(seasonalma = "s3x9")).

The only interface difference I plan to make is allowing exogenous regressors for the regression{} and x11regression{} specs to be provided directly in the model formula. This isn't added yet, but I hope to finish that soon.

AQLT commented 3 years ago

I haven't had the time to test it, but it looks great, as does the documentation!

I was a bit surprised to see that there is an x11regression argument; in my previous answers I didn't think this option was still available (so you might be able to reproduce the older X-11, even if it wouldn't be pertinent).

I might be wrong (I have never used the Census software, only another implementation, JDemetra+, developed by the National Bank of Belgium and the Bundesbank), but I think this argument is there for backward compatibility with older versions of this seasonal adjustment method: the x11regression parameter allows the trading-day adjustment to be performed during the decomposition. However, it is now possible to do it with the regARIMA model during the pre-adjustment (before the decomposition) with regression, and it seems that you cannot combine both procedures ("Irregular component regression and regARIMA model-based trading day adjustment cannot be specified in the same run.").

The regression parameter is much more general since you can define regressors with different effects (e.g. seasonal regressors), whereas in x11regression you only use trading-day regressors (so exogenous regressors would only make sense there if they are trading-day regressors).

For calendar adjustment, the best alternative seems to be to do it with the RegARIMA approach (regression) rather than on the provisional irregular component (x11regression); see for example the Eurostat guidelines on seasonal adjustment, available in this very big handbook on seasonal adjustment, p. 801: https://ec.europa.eu/eurostat/web/products-manuals-and-guidelines/-/KS-GQ-18-001.

To conclude, for users who are not X-13 experts, it might be preferable to recommend regression rather than x11regression for the trading-day adjustment and for exogenous regressors, for example by adding a warning in the documentation of this parameter. What do you think?

On another subject, I'm seriously thinking about making my seasonal adjustment packages (based on JDemetra+) tsibble compatible (especially because in the near future its X-13 and TRAMO-SEATS implementations will be compatible with series of all frequencies), a bit like you have done with the seasonal package. Any advice on how to do so?

mitchelloharawild commented 3 years ago

Thanks for your comments, and apologies for the late reply.

By default, regression terms use regression, unless x11regression is included in the model formula. Typically I expect users to provide exogenous regressors via X_13ARIMA_SEATS(y~x). If you think further commentary in the docs of x11regression would help new X-13 users, I'd be happy to add it, although I expect new users would be using regression as it is the default behaviour.

The recommended approach for supporting tsibble for seasonal adjustment is to use the <tsibble> %>% model() %>% components() interface. Using the model specification approach allows seasonal adjustment to be combined with decomposition modelling and other analysis tools without additional effort. Some instruction on creating new models is given here: https://fabletools.tidyverts.org/articles/extension_models.html To support seasonal adjustment, you would need a model training method and components() method.
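
For reference, the interface described above could look roughly like this for the existing X_13ARIMA_SEATS() model. It is a sketch, assuming the feasts, fabletools, tsibble and seasonal packages are installed; the column name value comes from converting AirPassengers with as_tsibble(), and the model name x13 is arbitrary. The code is skipped if any package is missing.

```r
pkgs <- c("feasts", "fabletools", "tsibble", "seasonal")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  library(feasts)
  library(tsibble)

  # Convert the monthly ts object to a tsibble (columns: index, value).
  air <- as_tsibble(AirPassengers)

  # Specify and estimate the model; the x11() term is the one quoted earlier in the thread.
  fit <- fabletools::model(
    air,
    x13 = X_13ARIMA_SEATS(value ~ x11(seasonalma = "s3x9"))
  )

  # Extract the decomposition from the fitted model.
  print(fabletools::components(fit))
}
```

A JDemetra+-based package following the same pattern would supply its own model function in place of X_13ARIMA_SEATS(), plus the training and components() methods mentioned above.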

AQLT commented 3 years ago

Sorry for the late reply.

OK, I see. I think your way of including exogenous regressors is clear enough (but it might be complicated to specify the usertype with that formula).

Thanks for the tips!

I think we can close this issue now.

mitchelloharawild commented 3 years ago

Great, thanks for your comments and raising this issue! :+1: