alteryx / evalml

EvalML is an AutoML library written in Python.
https://evalml.alteryx.com
BSD 3-Clause "New" or "Revised" License
753 stars, 85 forks

Predict with Time series regression #2274

Open fernaper opened 3 years ago

fernaper commented 3 years ago

Hello,

I'm training a Time series regression model and I have a problem when predicting.

[image: screenshot of the error]

This error appears because I'm not passing the target in the y parameter. But how can I pass the target for a future value that I don't know yet?

I also tried overriding your predict method with the following code:

import pandas as pd

from evalml.pipelines.pipeline_meta import TimeSeriesPipelineBaseMeta
from evalml.pipelines.regression_pipeline import RegressionPipeline
from evalml.problem_types import ProblemTypes
from evalml.utils import (
    _convert_woodwork_types_wrapper,
    drop_rows_with_nans,
    infer_feature_types,
    pad_with_nans
)
from types import MethodType

def predict(self, X, y=None, objective=None):
    """Make predictions using selected features.
    Arguments:
        X (ww.DataTable, pd.DataFrame, or np.ndarray): Data of shape [n_samples, n_features]
        y (ww.DataColumn, pd.Series, np.ndarray, None): The target training targets of length [n_samples]
        objective (Object or string): The objective to use to make predictions
    Returns:
        ww.DataColumn: Predicted values.
    """
    if X is None:
        X = pd.DataFrame()
    X = infer_feature_types(X)
    X = _convert_woodwork_types_wrapper(X.to_dataframe())
    if y is not None:
        y = infer_feature_types(y)
        y = _convert_woodwork_types_wrapper(y.to_series())
    features = self.compute_estimator_features(X, y)
    features = _convert_woodwork_types_wrapper(features.to_dataframe())
    features_no_nan, y = drop_rows_with_nans(features, y)
    y_arg = None
    if self.estimator.predict_uses_y:
        y_arg = y
    predictions = self.estimator.predict(features_no_nan, y_arg).to_series()
    predictions = predictions.rename(self.input_target_name)
    padded = pad_with_nans(predictions, max(0, features.shape[0] - predictions.shape[0]))
    return infer_feature_types(padded)

best_pipeline.predict = MethodType(predict, best_pipeline)

With that change, I get the following error instead: [image: screenshot of the error]

Do you have any idea what I am doing wrong? How can I predict future values?

Thank you very much

dsherry commented 3 years ago

Thank you for filing, @fernaper! Our support for time series is currently in "beta" status, so we appreciate the feedback. I agree that our API shouldn't expect you to pass in future target values that you can't know yet. We'll take a look at this.
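
To illustrate why the target comes up at prediction time at all: the predict method quoted above computes its features from both X and y, which suggests that some features are lagged copies of the target. The following is a minimal plain-Python sketch of that idea and not EvalML's implementation. The function name lagged_target_features and its gap and max_delay arguments are hypothetical and chosen only for illustration. It shows that features for a future step can be built from past targets alone, and where a value would be missing because the target has not been observed yet.

```python
from typing import List, Optional, Sequence


def lagged_target_features(
    y_known: Sequence[float], n_future: int, gap: int, max_delay: int
) -> List[List[Optional[float]]]:
    """Build lagged-target feature rows for the n_future steps after y_known.

    Row i describes time step len(y_known) + i. Its features are the target
    values at lags gap, gap + 1, ..., gap + max_delay. A lag that points at a
    step whose target has not been observed yet is filled with None.
    (Hypothetical helper for illustration; not part of EvalML.)
    """
    if gap < 1:
        raise ValueError("gap must be >= 1 so no feature uses the value being predicted")
    n_known = len(y_known)
    rows = []
    for i in range(n_future):
        t = n_known + i  # index of the future step being predicted
        row = []
        for lag in range(gap, gap + max_delay + 1):
            src = t - lag  # index of the target value this feature copies
            row.append(y_known[src] if 0 <= src < n_known else None)
        rows.append(row)
    return rows


# Four observed target values; build features for the next two steps.
history = [10.0, 12.0, 13.0, 15.0]
features = lagged_target_features(history, n_future=2, gap=1, max_delay=2)
for step, row in enumerate(features, start=len(history)):
    print(step, row)
```

In this sketch the first future step gets a complete feature row from the observed history, while the second step has a None for its most recent lag, because that lag points at a target that is itself still unknown. Only past target values are ever needed, never the value being predicted.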