
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
Apache License 2.0

Custom Data Usage From Pandas With Gluonts #10

Closed turkalpmd closed 2 months ago

turkalpmd commented 4 months ago

Environment setup

#!git clone https://github.com/time-series-foundation-models/lag-llama/
#cd lag-llama
#!pip install -r requirements.txt --quiet
#!huggingface-cli download time-series-foundation-models/Lag-Llama lag-llama.ckpt --local-dir /content/lag-llama

from gluonts.dataset.common import ListDataset
from itertools import islice
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
import torch
from gluonts.evaluation import make_evaluation_predictions, Evaluator
from gluonts.dataset.repository.datasets import get_dataset
from lag_llama.gluon.estimator import LagLlamaEstimator
import json
import pandas as pd

Custom data preparation

url = '/content/last_data.csv'
data = pd.read_csv(url)
data['date'] = pd.to_datetime(data['date'], format='%Y-%m-%d %H:%M:%S')
data = data[['date', 'target']]
# Aggregate to hourly totals, then reindex to a regular hourly frequency
data = data.resample('H', on='date')['target'].sum().reset_index()
data = data.set_index('date').asfreq('1H')
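
Depending on how the raw file handles gaps, the hourly series can still end up with missing or zero-filled hours; a quick sanity check before building the GluonTS datasets (the forward-fill here is just one possible imputation, not something from the original post):

# Sanity check: look for missing hourly values after reindexing
missing = data['target'].isna().sum()
print(f"missing hourly values: {missing}")
if missing:
    # simple forward-fill; pick an imputation that fits your data
    data['target'] = data['target'].ffill()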

context_length = 24
prediction_length = 24*7
freq = "1H"
time_series = data['target'].values
start = data.index[0]
train_ds = ListDataset(
    [{'target': time_series[:-prediction_length], 'start': start}],
    freq=freq
)

test_ds = ListDataset(
    [{'target': time_series, 'start': start}],
    freq=freq
)
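
A quick check (not in the original post) that the split leaves enough history for the context window in addition to the held-out horizon:

# Sanity check on the train/test split lengths
train_len = len(time_series) - prediction_length
assert train_len > context_length, "not enough history for the context window"
print(f"total: {len(time_series)}, train: {train_len}, held out: {prediction_length}")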

Model loading and evaluation

# Load the pretrained checkpoint to read its model hyperparameters
ckpt = torch.load("lag-llama.ckpt", map_location=torch.device('cuda:0'))
estimator_args = ckpt["hyper_parameters"]["model_kwargs"]

estimator = LagLlamaEstimator(
    ckpt_path="lag-llama.ckpt",
    prediction_length=prediction_length,
    context_length=context_length,
    # estimator args
    input_size=estimator_args["input_size"],
    n_layer=estimator_args["n_layer"],
    n_embd_per_head=estimator_args["n_embd_per_head"],
    n_head=estimator_args["n_head"],
    scaling=estimator_args["scaling"],
    time_feat=estimator_args["time_feat"],
)

lightning_module = estimator.create_lightning_module()
transformation = estimator.create_transformation()
predictor = estimator.create_predictor(transformation, lightning_module)

forecast_it, ts_it = make_evaluation_predictions(
    dataset=train_ds,
    predictor=predictor,
)

forecasts = list(forecast_it)
tss = list(ts_it)
evaluator = Evaluator()
agg_metrics, ts_metrics = evaluator(iter(tss), iter(forecasts))
print(json.dumps(agg_metrics, indent=4))

The results were so bizarre I couldn't believe them. After 20-30 days on this project, I built a forecasting model in about one minute that performed better than my previous model, which took 3 days to train on an RTX 3090. I'm going to lose my mind, holy shit.

(Attached plots: forecasts from the old TCN model and from the Lag-Llama model.)

ashok-arjun commented 4 months ago

Thank you so much for the code + detailed explanation @turkalpmd ! All of us from the team deeply appreciate you sharing these positive results with us - we strive for exactly this kind of real-world impact.

We're releasing the fine-tuning scripts soon; I presume your results will be even better with that. Stay tuned, and thank you again!

kashif commented 4 months ago

Also, as an experiment, try the estimator with linear RoPE scaling of the positional embeddings, which stretches the rotary position indices so that the longer context + prediction window stays within the positional range the model was pretrained on:

estimator = LagLlamaEstimator(
    ckpt_path="lag-llama.ckpt",
    prediction_length=prediction_length,
    context_length=context_length,
    # estimator args
    input_size=estimator_args["input_size"],
    n_layer=estimator_args["n_layer"],
    n_embd_per_head=estimator_args["n_embd_per_head"],
    n_head=estimator_args["n_head"],
    scaling=estimator_args["scaling"],
    time_feat=estimator_args["time_feat"],
    rope_scaling={
        "type": "linear",
        "factor": max(
            1.0, (context_length + prediction_length) / estimator_args["context_length"]
        ),
    },
)
kashifmunircshs commented 4 months ago

@turkalpmd Thanks for sharing. I see the forecasts are always of shape (100, prediction_length). Does this mean it gives predictions for the next 100 prediction lengths?

Second: this uses train_ds and predicts the last prediction_length steps. How can it be made to predict beyond that? Thanks in advance.

turkalpmd commented 4 months ago

No, it's not 100 points. My custom data is hourly, so I used a one-day context (24 steps) and asked for a one-week forecast (24*7 steps). I didn't run any benchmark tests on this model; I only compared its forecasts with another Transformer time-series model that had previously taken 3 days to train on a 24 GB RTX 3090. As far as I understand, the following code is enough to set the windows:

Context window and forecasting horizon:

context_length = 24
prediction_length = 24*7

Splitting data


train_ds = ListDataset(
    [{'target': time_series[:-prediction_length], 'start': start}],
    freq=freq
)

test_ds = ListDataset(
    [{'target': time_series, 'start': start}],
    freq=freq
)
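
For the second question (forecasting past the last observation), a minimal sketch, assuming the predictor built above: GluonTS predictors forecast the prediction_length steps that follow the end of whatever series they are given, so passing the full observed series yields genuinely future values.

# Sketch: forecast the prediction_length hours *after* the last observation.
# full_ds ends at the final observed value, so the forecast window starts right after it.
full_ds = ListDataset(
    [{'target': time_series, 'start': start}],
    freq=freq,
)

future_forecasts = list(predictor.predict(full_ds, num_samples=100))
f = future_forecasts[0]
print(f.start_date)      # first timestamp beyond the observed data
print(f.samples.shape)   # (num_samples, prediction_length)
print(f.mean[:5])        # point forecast for the first few future hours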
ashok-arjun commented 4 months ago

@kashifmunircshs 100 represents the number of samples drawn from the predicted probability distribution at each timestep (since ours is a probabilistic forecasting model).

FYI, we uploaded a new Colab demo with a tutorial on using a CSV dataset.

We explain the dimensions of the forecasts tensor there.
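
As a rough illustration of those dimensions (a sketch, not the Colab code), each forecast is a set of sample paths that can be reduced to a point forecast and prediction intervals:

# Sketch: inspecting one forecast object from the list built earlier
forecast = forecasts[0]
print(forecast.samples.shape)      # (num_samples, prediction_length), e.g. (100, 168)
print(forecast.mean.shape)         # (prediction_length,) point forecast from the samples
print(forecast.quantile(0.1)[:5])  # lower band of an 80% prediction interval
print(forecast.quantile(0.9)[:5])  # upper band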

warner83 commented 4 months ago

@turkalpmd can you post the code you used to plot the graphs you reported? Thanks

ashok-arjun commented 4 months ago

@warner83 The code for plotting is in the Colab
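
For reference, a minimal plotting sketch in the spirit of the GluonTS examples (not necessarily the exact code from the Colab):

# Sketch: plot the tail of the observed series together with the forecast
plt.figure(figsize=(10, 4))
ts = tss[0]                # observed series returned by make_evaluation_predictions
forecast = forecasts[0]
plt.plot(ts.to_timestamp()[-3 * prediction_length:], label="observed")
forecast.plot(color="g")   # draws the median and prediction intervals
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%d %b %H:%M"))
plt.legend()
plt.show()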

ashok-arjun commented 4 months ago

(Quoting turkalpmd's earlier comment on the context window, forecasting horizon, and train/test split above.)

So just FYI, the context length is best kept at the default (32); the code will work if you change it, but I suggest trying 32 first before other values.
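
A minimal sketch of that suggestion, reusing the estimator_args loaded from the checkpoint earlier:

# Sketch: same estimator, but with the default context length of 32
estimator_32 = LagLlamaEstimator(
    ckpt_path="lag-llama.ckpt",
    prediction_length=prediction_length,
    context_length=32,
    input_size=estimator_args["input_size"],
    n_layer=estimator_args["n_layer"],
    n_embd_per_head=estimator_args["n_embd_per_head"],
    n_head=estimator_args["n_head"],
    scaling=estimator_args["scaling"],
    time_feat=estimator_args["time_feat"],
)
predictor_32 = estimator_32.create_predictor(
    estimator_32.create_transformation(), estimator_32.create_lightning_module()
)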

wyw862788 commented 2 months ago

@turkalpmd
Could you please send the last_data.csv file to my email at 3479540754@qq.com? Thank you very much.

naveenfaclon commented 1 month ago

In my custom dataset I have only two columns, time and values. How should I go ahead with this?

for col in df.columns:
    if df[col].dtype != 'object' and pd.api.types.is_string_dtype(df[col]) == False:
        df[col] = df[col].astype('float32')
df['item_id'] = 'A'

dataset = PandasDataset.from_long_dataframe(df, target='Inverter 1 Active Energy (D1)', item_id='time')

backtest_dataset = dataset
prediction_length = 24
num_samples = 100

I'm facing an item_id error.
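
One likely cause is that item_id should name a column of series identifiers, not the time column. A hedged sketch with a hypothetical two-column frame (the file name and column names here are assumptions, not from the thread):

import pandas as pd
from gluonts.dataset.pandas import PandasDataset

# Hypothetical long dataframe with one time column and one value column
df = pd.read_csv("data.csv")   # assumed columns: "Time", "Inverter 1 Active Energy (D1)"
df["Time"] = pd.to_datetime(df["Time"])
df["item_id"] = "A"            # a single constant id, since there is only one series

dataset = PandasDataset.from_long_dataframe(
    df,
    target="Inverter 1 Active Energy (D1)",
    item_id="item_id",         # column holding the series id, not the time column
    timestamp="Time",
    freq="H",
)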