Nixtla / mlforecast

Scalable machine 🤖 learning for time series forecasting.
https://nixtlaverse.nixtla.io/mlforecast
Apache License 2.0

[core] speed up date features calculation #337

Closed jmoralez closed 2 months ago

jmoralez commented 2 months ago

Currently, computing the date features can take a very long time. For example, on the M5 dataset (10 million rows after dropping NaNs) it takes around 10 seconds to compute 6 features, which is a lot compared to the 500 ms it takes to compute the 21 lag features.

We should look for alternatives that generate those features faster (ideally with multithreading), for example by implementing some functions in coreforecast that take a np.datetime64 array and return an array with several features.
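
To make the idea concrete, here is a rough NumPy-only sketch of a function that takes a np.datetime64 array and derives several features with plain integer arithmetic. The name `date_features` and the set of features are hypothetical, this is not an existing coreforecast function, it is single-threaded, and it assumes the input has no NaT values.

```python
import numpy as np


def date_features(dates: np.ndarray) -> dict:
    """Hypothetical helper: derive calendar features from a datetime64 array.

    Assumes `dates` has a datetime64 dtype and contains no NaT values.
    """
    # Truncate to day, month and year resolution once and reuse the results.
    days = dates.astype("datetime64[D]")
    months = dates.astype("datetime64[M]")
    years = dates.astype("datetime64[Y]")

    # Integer views count units elapsed since the Unix epoch (1970-01-01).
    months_since_epoch = months.astype(np.int64)
    days_since_epoch = days.astype(np.int64)

    return {
        "year": years.astype(np.int64) + 1970,
        "month": months_since_epoch % 12 + 1,
        # Offset from the first day of the month / year, made 1-based.
        "day": (days - months.astype("datetime64[D]")).astype(np.int64) + 1,
        "dayofyear": (days - years.astype("datetime64[D]")).astype(np.int64) + 1,
        # 1970-01-01 was a Thursday; with Monday=0 that is 3.
        "dayofweek": (days_since_epoch + 3) % 7,
    }


dates = np.array(["2024-02-29", "1969-12-31", "2023-01-01"], dtype="datetime64[ns]")
feats = date_features(dates)
```

Every feature here is an elementwise operation over the same truncated arrays, so a compiled version could presumably split the input into chunks and process them in separate threads.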

For polars the current approach is also inefficient. We could parallelize the feature creation by wrapping all of the expressions in a single select, but at the moment we compute them sequentially: https://github.com/Nixtla/mlforecast/blob/8432fc46687641378dfebdc1cfe8f5518715b6d2/mlforecast/core.py#L430-L435