Computing the date features can currently take a very long time. For example, on the M5 dataset (10 million rows after dropping NaNs) it takes around 10 seconds to compute 6 features, which is a lot compared to the 500 ms it takes to compute the 21 lag features.
We should look for ways to generate those features faster (ideally with multithreading), for example by implementing some functions in coreforecast that take a `np.datetime64` array and return an array with several features.
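
As a rough illustration of the "one datetime64 array in, several features out" idea, here is a minimal NumPy-only sketch. The function name `date_features` and the chosen feature set are hypothetical (this is not an existing coreforecast API), and the code is vectorized but single-threaded; a real implementation in coreforecast could split the array into chunks across threads.

```python
import numpy as np


def date_features(dates: np.ndarray) -> dict:
    """Hypothetical helper: derive several calendar features in one pass.

    Uses only integer arithmetic on the datetime64 representation, so no
    Python-level datetime objects are created.
    """
    days = dates.astype("datetime64[D]")    # truncate to day resolution
    months = dates.astype("datetime64[M]")  # truncate to month resolution
    years = dates.astype("datetime64[Y]")   # truncate to year resolution
    days_since_epoch = days.astype(np.int64)
    return {
        "year": years.astype(np.int64) + 1970,
        "month": months.astype(np.int64) % 12 + 1,
        # offset from the first day of the month / year, 1-based
        "day": (days - months.astype("datetime64[D]")).astype(np.int64) + 1,
        "dayofyear": (days - years.astype("datetime64[D]")).astype(np.int64) + 1,
        # 1970-01-01 was a Thursday; Monday=0, matching the pandas convention
        "dayofweek": (days_since_epoch + 3) % 7,
    }


example_dates = np.array(
    ["2024-02-29", "1970-01-01", "2023-12-31"], dtype="datetime64[ns]"
)
example_features = date_features(example_dates)
```

Returning all features from a single call means the truncated day, month, and year arrays are computed once and shared, instead of being recomputed for each feature.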
For polars this is also currently inefficient: we could parallelize the creation of the features by wrapping them all in a single select, but at the moment we compute them sequentially: https://github.com/Nixtla/mlforecast/blob/8432fc46687641378dfebdc1cfe8f5518715b6d2/mlforecast/core.py#L430-L435