Closed JRSlotman closed 3 years ago
ARMA doesn't work for fill_na_func, and it isn't relevant there. That function handles NAs within the data, which come from mixing series of different frequencies, e.g. what to fill in for January and February of a quarterly series. These values aren't missing; they simply don't exist, because the series is recorded at a different frequency. So any many-to-one function can be used here (mean, median, a single constant, etc.). The idea is to fill them with something that carries no information yet still works in the mathematics of the network, which is why the mean works well. Let me know if you need more information on that.
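To make the within-series case concrete, here is a minimal pandas sketch of mean-filling a quarterly series stored on a monthly index. It does not call the library itself; the helper name `fill_within_series` and the sample values are hypothetical and only illustrate what a many-to-one fill function does.

```python
import numpy as np
import pandas as pd

# Hypothetical quarterly series on a monthly index: only the third month
# of each quarter has a value, the other months do not exist for this series.
dates = pd.date_range("2020-01-01", periods=6, freq="MS")
quarterly = pd.Series([np.nan, np.nan, 1.5, np.nan, np.nan, 2.5], index=dates)


def fill_within_series(series, func=np.nanmean):
    """Fill every NaN with one summary value computed from the observed data."""
    # func is many-to-one: it maps the whole series to a single number.
    return series.fillna(func(series))


filled = fill_within_series(quarterly)
print(filled)
```

Passing `np.nanmedian` instead of `np.nanmean` works the same way, since both reduce the observed values to a single number.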
Unfortunately, yes, ARMA filling is slow for ragged edges. That's partly an issue of optimization and partly one of necessity. At instantiation, an ARMA model has to be estimated for every series and its parameters saved (ARMA(1,1) or whatever), which takes a while and can't be avoided. Later, at inference, the ARMA models have to be fit again on each series to get the estimates for the ragged edges. This part could be better optimized to avoid re-estimating the models as much as possible.
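The "estimate once, reuse at inference" idea can be sketched as follows. This is not the library's implementation: for the sake of a self-contained example it swaps ARMA for a plain AR(1) fitted by least squares, assumes each series has no gaps before its ragged edge, and uses hypothetical names (`fit_ar1`, `fill_ragged_edge`, `params_cache`) and toy data.

```python
import numpy as np


def fit_ar1(values):
    """Estimate y_t = c + phi * y_{t-1} by least squares on the observed values."""
    y = np.asarray(values, dtype=float)
    y = y[~np.isnan(y)]
    # Regress each observation on a constant and the previous observation.
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    c, phi = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return c, phi


def fill_ragged_edge(values, params):
    """Replace trailing NaNs with recursive AR(1) forecasts from stored parameters."""
    c, phi = params
    out = np.asarray(values, dtype=float).copy()
    last_observed = np.flatnonzero(~np.isnan(out))[-1]
    for t in range(last_observed + 1, len(out)):
        out[t] = c + phi * out[t - 1]
    return out


# Estimate once per series (toy training data) and keep the parameters.
training = {"series_a": [1.0, 2.0, 4.0, 8.0]}
params_cache = {name: fit_ar1(vals) for name, vals in training.items()}

# At inference, reuse the stored parameters instead of refitting.
new_data = [8.0, 16.0, np.nan, np.nan]
filled = fill_ragged_edge(new_data, params_cache["series_a"])
print(filled)
```

The trade-off is that cached parameters go stale as new observations arrive, so a real version would still need some rule for when to refit.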
Hopefully that helps explain things. Let me know if you need anything else.
I tried running fill_na_func with ARMA, but I get the following error
ARMA works fine for fill_ragged_edge_func (albeit at a significantly higher run time) so the problem doesn't seem to be related to the call to Python.