Nixtla / statsforecast

Lightning ⚡️ fast forecasting with statistical and econometric models.
https://nixtlaverse.nixtla.io/statsforecast
Apache License 2.0

MSTL is very slow and not using all CPU cores at all #851

Closed VictorDonjuan closed 2 months ago

VictorDonjuan commented 2 months ago

What happened + What you expected to happen

My time series comes in hourly format and it spans several years. It has 149,016 rows in total.

The code below ran for 30 minutes before I interrupted it to check whether something was wrong with the parameters. I also tried n_jobs=-1, but it didn't seem to help.

[screenshot omitted]

The fitting process doesn't seem to be using all my cores. In contrast, with scikit-learn or joblib, when I use the maximum n_jobs on some tasks, I can clearly see my CPU usage peak.

I know the season lengths and the size of the data matter, but I still don't think it should be this slow.

Versions / Dependencies

statsforecast==1.7.5

Reproduction script

from statsforecast import StatsForecast
from statsforecast.models import MSTL, AutoARIMA

models = [
    MSTL(
        season_length=[24, 24*7, int(24*365.25)], trend_forecaster=AutoARIMA()
    )
]

sf = StatsForecast(models=models,
                   freq='H', 
                   n_jobs=12)

sf.fit(df)

Issue Severity

Low: It annoys or frustrates me.

jmoralez commented 2 months ago

Hey. The parallelism is done by series, so if you just have one it won't provide any speedup. You have a lot of data relative to the seasonal periods and you could most likely limit it to maybe 10 * max(season_length) and it'd yield a similar result to using all of the data (unless you want fitted values).
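To illustrate the suggestion above, here is a minimal sketch of trimming each series to its last 10 * max(season_length) observations before fitting. The DataFrame construction is hypothetical (a single random hourly series matching the row count from the issue); only the tail-truncation step is the point.

```python
import numpy as np
import pandas as pd

# Hypothetical single hourly series matching the issue's 149,016 rows.
season_length = [24, 24 * 7, int(24 * 365.25)]
n = 149_016
df = pd.DataFrame({
    "unique_id": "series_1",
    "ds": pd.date_range("2008-01-01", periods=n, freq="h"),
    "y": np.random.default_rng(0).random(n),
})

# Keep only the most recent 10 * max(season_length) rows per series.
# With a yearly period of 8,766 hours this caps each series at 87,660 rows.
limit = 10 * max(season_length)
df_short = df.groupby("unique_id", group_keys=False).tail(limit)

print(len(df_short))  # 87660
# sf.fit(df_short) would then fit MSTL on the truncated history.
```

Fitted values would only be available for the truncated window, so this shortcut is not appropriate if you need in-sample fitted values over the full history.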

VictorDonjuan commented 2 months ago

> Hey. The parallelism is done by series, so if you just have one it won't provide any speedup. You have a lot of data relative to the seasonal periods and you could most likely limit it to maybe 10 * max(season_length) and it'd yield a similar result to using all of the data (unless you want fitted values).

Thank you, sorry for the misunderstanding.