ibm-granite / granite-tsfm

Foundation Models for Time Series
Apache License 2.0

How to use it for inference? #46

Open CrasCris opened 1 month ago

CrasCris commented 1 month ago

I already ran finetune_forecast_trainer.evaluate(test_dataset), but how do I use it to run inference?

wgifford commented 1 month ago

Try: finetune_forecast_trainer.predict(test_dataset)

I believe the first element of the result should be the predictions.
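
A minimal sketch of what that might look like, assuming the standard Hugging Face Trainer predict output (a named tuple whose predictions field holds the model outputs):

# minimal sketch, assuming the Hugging Face Trainer return format
output = finetune_forecast_trainer.predict(test_dataset)

# output.predictions typically holds one or more numpy arrays; per the note
# above, the first element should be the point forecasts
point_forecasts = output.predictions[0]
print(point_forecasts.shape)  # e.g. (num_samples, prediction_length, num_channels)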

CrasCris commented 1 month ago

Try: finetune_forecast_trainer.predict(test_dataset)

I believe the first element of the result should be the predictions.

Thanks. If I want to use the model for inference, do I need to preprocess the data with something like TimeSeriesPreprocessor first?

wgifford commented 1 month ago

@CrasCris The basic process is as follows:

# imports from the tsfm_public toolkit
from tsfm_public.toolkit.time_series_preprocessor import TimeSeriesPreprocessor
from tsfm_public.toolkit.time_series_forecasting_pipeline import TimeSeriesForecastingPipeline

# define train and test data
train_data = ...  # pandas DataFrame
test_data = ...   # pandas DataFrame

# define a preprocessor
tsp = TimeSeriesPreprocessor(
    timestamp_column=...,
    scaling=True,
)

# train the preprocessor
tsp.train(train_data)

# define a forecasting pipeline
forecast_pipeline = TimeSeriesForecastingPipeline(
    model=model,
    timestamp_column=...,
    feature_extractor=tsp,  # note: needed if we want to inverse scale internally
    inverse_scale_outputs=True,
)

# forecast using the pipeline on the scaled test data
forecasts = forecast_pipeline(tsp.preprocess(test_data))

You can find a similar example in the tests: https://github.com/ibm-granite/granite-tsfm/blob/main/tests/toolkit/test_time_series_forecasting_pipeline.py#L113

matsuobasho commented 1 month ago

I find the difference between this modeling approach using TimeSeriesForecastingPipeline and the method outlined in the TTM tutorial notebook confusing. What exactly is the difference?

Also, the approach in the Zeroshot evaluation section of the TTM notebook ran immediately, whereas this one has been running for 5 minutes on my machine.

wgifford commented 1 month ago

@matsuobasho In the notebook you are passing in a pytorch dataset and a tensor is returned. TimeSeriesForecastingPipeline is meant to work with a pandas dataframe, embed more of the steps, and return a pandas dataframe.

If you can share your code for zeroshot evaluation using the pipeline, I will take a look.
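
To make that concrete, here is a rough side-by-side sketch; zeroshot_trainer and test_df are hypothetical names standing in for a Trainer wrapping the same model and the held-out slice of the original dataframe:

# notebook path: torch dataset in, tensors/arrays out
# (no timestamps, no inverse scaling; fast because it is just a forward pass)
notebook_output = zeroshot_trainer.predict(test_dataset)

# pipeline path: pandas DataFrame in, pandas DataFrame out
# the pipeline windows the data, runs the model, and (with feature_extractor
# and inverse_scale_outputs=True) maps forecasts back to the original scale
forecasts_df = forecast_pipeline(tsp.preprocess(test_df))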

matsuobasho commented 1 month ago

@wgifford , thanks for the offer.

# imports (assumed locations in the tsfm_public toolkit)
from tsfm_public.toolkit.time_series_preprocessor import TimeSeriesPreprocessor
from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction

# Forecasting parameters
context_length = 512
forecast_length = 96
fewshot_fraction = 0.05

timestamp_column = "Timestamp"
id_columns = []
target_columns = ["y"]
split_config = {
    "train": 0.7,
    "test": 0.2,
}

column_specifiers = {
    "timestamp_column": timestamp_column,
    "id_columns": id_columns,
    "target_columns": target_columns,
    "control_columns": [],
}

tsp = TimeSeriesPreprocessor(
    **column_specifiers,
    context_length=context_length,
    prediction_length=forecast_length,
    scaling=True,
    encode_categorical=False,
    scaler_type="standard",
)

train_dataset, valid_dataset, test_dataset = tsp.get_datasets(
    df, split_config, fewshot_fraction=fewshot_fraction, fewshot_location="first"
)

TTM_MODEL_REVISION = "main"
zeroshot_model = TinyTimeMixerForPrediction.from_pretrained("ibm/TTM", revision=TTM_MODEL_REVISION)

from tsfm_public.toolkit.time_series_forecasting_pipeline import TimeSeriesForecastingPipeline
tsp.train(train_dataset)

# define a forecasting pipeline
forecast_pipeline = TimeSeriesForecastingPipeline(
    model=zeroshot_model,
    timestamp_column=timestamp_column,
    id_columns=id_columns,
    target_columns=target_columns,
    feature_extractor=tsp,
    explode_forecasts=False,
    inverse_scale_outputs=True,
)

# forecast using the pipeline on the scaled test data
forecasts = forecast_pipeline(tsp.preprocess(test_dataset))

I now get an error at the tsp.train step: AttributeError: 'ForecastDFDataset' object has no attribute 'copy'.

I think I see the issue: as you said, TimeSeriesForecastingPipeline (and the preprocessor's train step) expects a pandas dataframe, not a ForecastDFDataset.
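
For reference, a corrected sketch along those lines; the row-based split below is illustrative (it just mirrors the 0.7/0.2 split_config above), and df is the original raw dataframe:

# split the raw dataframe into train/test slices (illustrative, not the toolkit helper)
n = len(df)
train_df = df.iloc[: int(0.7 * n)]
test_df = df.iloc[int(0.8 * n) :]

# train the preprocessor on the raw training dataframe, not on a ForecastDFDataset
tsp.train(train_df)

# run the pipeline on the scaled test dataframe; the result is a pandas dataframe
forecasts = forecast_pipeline(tsp.preprocess(test_df))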