Open KishManani opened 2 years ago
Hi @KishManani, if I may add a small note: I recently looked into implementing a Python equivalent of tsoutliers. For that purpose I did some testing with supersmoother, and the results were not an exact match for supsmu, on which the R version of MSTL is based. The mismatch was especially evident for short time series, and for very short time series (n<100) supersmoother even seemed to struggle to fit at all.
Finally, I briefly looked into the Statsmodels lowess as an alternative too, but I found it hard to tune, at least for the anomaly detection use case (it is not trivial to land on a frac value that works well for both short and long time series).
TL;DR: supersmoother does the job as long as you are (a) not working with short time series and (b) not looking to replicate supsmu exactly.
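To illustrate the frac tuning problem: frac is the fraction of the data used for each local fit, so a fixed value means a window that grows with the series length. The sketch below uses a hypothetical helper, frac_for_window (not part of Statsmodels), that instead derives frac from a target window size. The default of 15 points is an arbitrary assumption, not a recommended setting.

```python
def frac_for_window(n_obs, target_window=15, min_window=5):
    """Hypothetical helper: choose a LOWESS frac so that each local fit
    uses roughly `target_window` points, whatever the series length."""
    if n_obs < 2:
        raise ValueError("need at least 2 observations")
    # Keep the window between min_window and target_window points...
    window = max(min_window, min(target_window, n_obs))
    # ...but never larger than the series itself.
    window = min(window, n_obs)
    return window / n_obs


# A fixed frac of 0.3 spans very different numbers of points per series,
# whereas the derived frac keeps the local window roughly constant.
for n in (30, 100, 1000):
    fixed_window = round(0.3 * n)
    print(n, fixed_window, frac_for_window(n))
```

The value returned here could be passed as the frac argument of the Statsmodels lowess function. Whether a constant window is actually the right target for anomaly detection is an open question.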
We can't take a dependency, but the code could be brought in. I suspect that the code in that package should be Cythonized for performance, but haven't looked closely at how it is implemented.
It is likely that supsmu has some tuning parameters that vary for small sample sizes. It may be the case that these could be reverse engineered, especially if the performance is not so good.
Is your feature request related to a problem? Please describe
Currently, the version of MSTL implemented in Statsmodels does not replicate the behaviour of the original algorithm when a user specifies no seasonal components. When there is no seasonal component, the original algorithm uses Friedman's super smoother to extract just a trend. To reflect the method as described in the paper, it would be good to add this behaviour to the version of MSTL in Statsmodels. One issue is that Friedman's super smoother is not implemented in Statsmodels.
Describe the solution you'd like
Friedman's supersmoother has an implementation in Python here. One solution is to use this in the MSTL implementation. However, I'm not sure whether it's actively maintained and whether it would be acceptable to introduce additional dependencies to Statsmodels.
Describe alternatives you have considered
An alternative would be to re-write a version of Friedman's supersmoother in Statsmodels. Another solution would be to use LOWESS as an alternative method to extract the trend in the short term until a version of Friedman's supersmoother becomes available in Statsmodels.
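A minimal sketch of the interim idea, assuming evenly spaced observations: when no seasonal component is specified, smooth the series to get a trend and treat what is left as the remainder. The smoother here is a simplified, pure-Python LOWESS-style local linear fit with tricube weights and no robustness iterations. It is a stand-in for illustration only; it is neither Friedman's super smoother nor the Statsmodels lowess implementation, and the names local_linear_trend and decompose_without_seasonality are hypothetical.

```python
import math


def local_linear_trend(y, frac=0.3):
    """Simplified LOWESS-style smoother (illustrative stand-in, not supsmu):
    tricube-weighted local linear fit over the k nearest points."""
    n = len(y)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # Number of points in each local window, between 3 and n.
    k = max(3, min(n, int(math.ceil(frac * n))))
    trend = []
    for i in range(n):
        # Window of k consecutive points, centred on i where possible.
        lo = max(0, min(i - k // 2, n - k))
        hi = lo + k
        # Bandwidth one step beyond the farthest neighbour, so every
        # point in the window gets a strictly positive weight.
        h = max(i - lo, hi - 1 - i) + 1
        sw = sx = sy = sxx = sxy = 0.0
        for j in range(lo, hi):
            d = j - i  # regress on the offset from i
            w = (1.0 - (abs(d) / h) ** 3) ** 3
            sw += w
            sx += w * d
            sy += w * y[j]
            sxx += w * d * d
            sxy += w * d * y[j]
        slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
        # With offsets centred on i, the fitted value at i is the intercept.
        trend.append((sy - slope * sx) / sw)
    return trend


def decompose_without_seasonality(y, frac=0.3):
    """Trend plus remainder only, for the no-seasonal-component case."""
    trend = local_linear_trend(y, frac)
    remainder = [yi - ti for yi, ti in zip(y, trend)]
    return {"trend": trend, "remainder": remainder}


series = [0.5 * t + (1.0 if t % 7 == 0 else 0.0) for t in range(40)]
result = decompose_without_seasonality(series)
print(len(result["trend"]), len(result["remainder"]))
```

In an actual MSTL change, the smoother would presumably be swapped for the Statsmodels lowess now and for a super smoother implementation once one is available, with the rest of the branch unchanged.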
Additional context
An implementation in R also exists.
I'm happy to take this on, but I'd like any guidance from the maintainers about what path to take here.