As already mentioned, the idea and formalism -- Chen (1993) -- behind the tsoutliers package are great. The downside is the seemingly ad hoc solution.
Here is the comparison with the other outlier detection methods. (For single output see here.)
A couple of issues:
By default, the data is modeled with forecast::auto.arima(), which requires sequential, equally spaced points in time. For simplicity, data without timestamps was discarded; the rest was ordered by time and then assumed to be equally spaced. #15 shows that this last assumption does not always hold, so this can be improved.
Some cases proved to be pathological. For example, DYLP127B_W1134 had not finished after more than 24 h. Only data sets with fewer than 7000 points were taken into consideration.
Under ARIMA(0,0,0) the results should be similar (if not equal) to outliers v0.01 (cf. #2).
While many outliers seem logical, some do not. To make sense of them, a better visualization of the ARIMA fit is needed.
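On the equal-spacing point above, here is a minimal sketch of the preprocessing as described (drop points without timestamps, order by time), plus a check for how far the equal-spacing assumption is off. It is written in Python purely for illustration and is not the code used for this comparison. The names `prepare_series`, `spacing_report` and the `tolerance` parameter are hypothetical, and the 10% threshold is an arbitrary assumption.

```python
from datetime import datetime, timedelta
from statistics import median


def prepare_series(records):
    """Drop records without a timestamp and sort the rest by time.

    `records` is an iterable of (timestamp, value) pairs, where the
    timestamp is a datetime or None (hypothetical input layout).
    """
    kept = [(t, v) for t, v in records if t is not None]
    return sorted(kept, key=lambda r: r[0])


def spacing_report(series, tolerance=0.1):
    """Compare every gap between consecutive timestamps with the median gap.

    Returns (median_gap_seconds, irregular). `irregular` lists the
    (index, gap_seconds) pairs whose gap differs from the median gap by
    more than `tolerance` times the median gap; index i is the gap between
    series[i] and series[i + 1].
    """
    times = [t for t, _ in series]
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    if not gaps:
        return None, []
    typical = median(gaps)
    irregular = [(i, g) for i, g in enumerate(gaps)
                 if abs(g - typical) > tolerance * typical]
    return typical, irregular


if __name__ == "__main__":
    # Made-up example: hourly data, one missing timestamp, one 3-hour gap.
    start = datetime(2020, 1, 1)
    records = [
        (start + timedelta(hours=3), 4.0),
        (start, 1.0),
        (None, 9.9),
        (start + timedelta(hours=1), 2.0),
        (start + timedelta(hours=2), 3.0),
        (start + timedelta(hours=6), 5.0),
    ]
    series = prepare_series(records)
    typical, irregular = spacing_report(series)
    print("median gap (s):", typical)
    print("irregular gaps:", irregular)
```

A series with a non-empty `irregular` list could be skipped, or resampled onto a regular grid, before it is passed to the ARIMA fit. Which of the two is appropriate here is open.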
To be discussed.