Nixtla / utilsforecast

https://nixtlaverse.nixtla.io/utilsforecast
Apache License 2.0

Pylance warning: Cannot access attribute "loc" and "groupby" for class "pl_DataFrame" in evaluate method #115

Closed Harikapl closed 2 months ago

Harikapl commented 2 months ago

I'm encountering a Pylance warning when using the evaluate method from the utilsforecast package. The warning suggests that the method is treating my Pandas DataFrame as a Polars DataFrame (pl_DataFrame), even though I'm not using Polars anywhere in my code.

Warning Details:

[Screenshot of the Pylance warning; image not preserved]

Sample code to reproduce:

import numpy as np
import pandas as pd
from sktime.split import temporal_train_test_split
from utilsforecast.evaluation import evaluate
from utilsforecast.losses import mape, smape

np.random.seed(42)
ids = np.arange(1, 101)  # Unique IDs
dates = pd.date_range(start="2023-01-01", periods=100, freq="D")
data = {
    "unique_id": ids,
    "ds": dates,
    "y": np.random.randn(100).cumsum(),  # Cumulative sum to simulate a time series
}
df = pd.DataFrame(data)

def naive_forecast(df):
    df["yhat"] = df["y"].shift(1)
    return df
y_train, y_test = temporal_train_test_split(df, test_size=12)  # type: ignore

forecast_df = naive_forecast(y_train.copy())

evaluation_result = evaluate(forecast_df, metrics=[mape, smape], train_df=y_train)
print(evaluation_result)

id_51 = evaluation_result.loc[evaluation_result["unique_id"] == 51]
average_metrics = evaluation_result.groupby("metric")["yhat"].mean()

Expected Behavior:

The method should work without any Pylance warnings since only Pandas DataFrames are used.

Additional Info: I took a quick look at the code and noticed that the evaluate method is designed to handle both Pandas and Polars DataFrames. Even though I'm passing a Pandas DataFrame, Pylance appears to infer that the result may be a Polars DataFrame (pl_DataFrame), which is likely what's causing these warnings.

jmoralez commented 2 months ago

Hey. I believe the problem is that the return type is a union, so adding assert isinstance(evaluation_result, pd.DataFrame) right after the call should remove that warning.
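For anyone reading along, here is a rough sketch of why the assert helps. It does not use utilsforecast itself: PandasLikeFrame, PolarsLikeFrame and evaluate_like are hypothetical stand-ins, and the only assumption is that the real function was annotated to return a union of the two frame types. A static checker only knows "one of the two" until an isinstance check narrows it.

```python
from typing import List, Union


class PandasLikeFrame:
    """Hypothetical stand-in for pd.DataFrame: has a pandas-only method."""

    def __init__(self, values: List[float]) -> None:
        self.values = values

    def loc_first(self) -> float:
        # Plays the role of a pandas-only attribute such as .loc
        return self.values[0]


class PolarsLikeFrame:
    """Hypothetical stand-in for pl.DataFrame: no loc_first method."""

    def __init__(self, values: List[float]) -> None:
        self.values = values


def evaluate_like(
    df: Union[PandasLikeFrame, PolarsLikeFrame],
) -> Union[PandasLikeFrame, PolarsLikeFrame]:
    """Returns the same frame type it received, but is annotated as a union."""
    return type(df)([v * 2 for v in df.values])


result = evaluate_like(PandasLikeFrame([1.0, 2.0]))

# Before this line, a checker would flag result.loc_first() because the
# PolarsLikeFrame member of the union has no such attribute.
assert isinstance(result, PandasLikeFrame)

# After the assert, the type is narrowed and the call is accepted.
first = result.loc_first()
```

The assert costs nothing meaningful at runtime and also documents which frame type the surrounding code expects.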

jmoralez commented 2 months ago

By the way, we're in the process of changing most of these to use TypeVar, which should solve your issue as well.
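To illustrate the TypeVar idea mentioned here, a minimal sketch follows. It is not the library's actual code: FrameT, evaluate_like and the two frame classes are hypothetical names made up for the example. The point is that a constrained TypeVar ties the return type to the argument type, so the caller should no longer need an assert.

```python
from typing import List, TypeVar


class PandasLikeFrame:
    """Hypothetical stand-in for pd.DataFrame: has a pandas-only method."""

    def __init__(self, values: List[float]) -> None:
        self.values = values

    def loc_first(self) -> float:
        # Plays the role of a pandas-only attribute such as .loc
        return self.values[0]


class PolarsLikeFrame:
    """Hypothetical stand-in for pl.DataFrame: no loc_first method."""

    def __init__(self, values: List[float]) -> None:
        self.values = values


# Constrained TypeVar: it resolves to exactly one of the two frame types.
FrameT = TypeVar("FrameT", PandasLikeFrame, PolarsLikeFrame)


def evaluate_like(df: FrameT) -> FrameT:
    """Same frame type in, same frame type out, and the annotation says so."""
    return type(df)([v * 2 for v in df.values])


pandas_result = evaluate_like(PandasLikeFrame([1.0, 2.0]))

# No isinstance assert needed: the checker is expected to infer
# PandasLikeFrame here, so the pandas-only method is accepted directly.
first = pandas_result.loc_first()

polars_result = evaluate_like(PolarsLikeFrame([3.0]))
```

Compared with the union annotation, this moves the fix into the function signature, so every call site benefits without extra code.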

jmoralez commented 2 months ago

This should be fixed by upgrading the package. Feel free to reopen if the issue persists.