Nixtla / mlforecast

Scalable machine 🤖 learning for time series forecasting.
https://nixtlaverse.nixtla.io/mlforecast
Apache License 2.0

MLforecast does not work with PyArrow dates #305

Closed by nprihodko 5 months ago

nprihodko commented 5 months ago

What happened + What you expected to happen

When the date column in the data frame passed to MLForecast has a PyArrow data type, fitting fails. MLForecast raises this error:

ValueError: The time column ('ds') should have either timestamps or integers, got 'object'.

I would expect either

Versions / Dependencies

Reproduction script

import lightgbm as lgb
from mlforecast import MLForecast
from mlforecast.utils import generate_daily_series

# Generate some data
df = generate_daily_series(
    n_series=20,
    max_length=100,
    n_static_features=0,
    static_as_categorical=False,
    with_trend=True,
)

# Convert to PyArrow dtypes
df = df.astype(
    {
        "ds": "timestamp[ns][pyarrow]",
        "unique_id": "string[pyarrow]",
        "y": "float64[pyarrow]",
    }
)
print(df.dtypes)

# Create MLForecast object 
models = [lgb.LGBMRegressor(verbosity=-1)]
fcst = MLForecast(models=models, freq="D", lags=[7, 14])

# Fit
fcst.fit(df)

Issue Severity

Medium: It is a significant difficulty but I can work around it.

jmoralez commented 5 months ago

Hey @nprihodko, thanks for using mlforecast and for the detailed report. This should be fixed in utilsforecast 0.0.26. Can you please upgrade to that version (currently only on PyPI) and verify?
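To confirm that the installed utilsforecast is at least 0.0.26, a small check along these lines may help. It is a sketch using only the standard library; parse_release and is_at_least are hypothetical helpers, and the version parsing is deliberately simplified (it reads leading numeric components only and ignores pre-release and post-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple[int, ...]:
    """Parse the leading numeric components of a version string.

    Simplified on purpose: '0.0.26.post1' becomes (0, 0, 26).
    """
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed: str, required: str) -> bool:
    """Return True if installed is the same as or newer than required."""
    return parse_release(installed) >= parse_release(required)


REQUIRED = "0.0.26"  # version named in this thread as containing the fix

try:
    installed = version("utilsforecast")
except PackageNotFoundError:
    print("utilsforecast is not installed")
else:
    status = "ok" if is_at_least(installed, REQUIRED) else "needs upgrade"
    print(f"utilsforecast {installed}: {status}")
```

If the check reports an older version, upgrading with pip install -U utilsforecast should pull the newer release from PyPI.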

nprihodko commented 5 months ago

@jmoralez, this seems to fix the issue. Thank you!