Training lag-llama with GluonTS dataset reporting error

CXL-edu commented 4 months ago

The complete code on Colab:

Using data from use case 1 in the GluonTS examples:

https://ts.gluon.ai/stable/tutorials/data_manipulation/pandasdataframes.html#Use-case-1---Loading-data-from-a-long-dataframe

Error: [image: lag-llama]

CXL-edu commented 4 months ago

In addition, my dataset contains multiple individual time series, and I built the dataset as shown below, following GluonTS use case 2. Once the issue above is resolved: lag-llama builds its input with PandasDataset.from_long_dataframe(df, target="target", item_id="item_id"), so will that still be incompatible with the data structure from the example, PandasDataset(data_gl, target="target")?

import numpy as np
import pandas as pd
from gluonts.dataset.pandas import PandasDataset

def read_data4gl(data_name, time_freq) -> dict[str, pd.DataFrame]:
    # Toy data: 10 samples from each of 4 battery cells (data_name is unused here)
    data = np.random.randint(1, 10, (10, 4))
    data = pd.DataFrame(data, columns=['cell' + str(i) for i in range(1, 5)])
    data['index'] = pd.date_range(start="2020-01-01", periods=data.shape[0], freq=time_freq)
    data = data.set_index('index').rename_axis(None)

    # One single-column DataFrame per series; the dict key serves as the item id,
    # so no separate 'item_id' column is needed
    data_gl = {}
    for col in data.columns:
        data_gl[col] = data[col].dropna().to_frame(name='target')

    return data_gl

data_gl = read_data4gl('data_name', '1h')
data_gl = PandasDataset(data_gl, target="target")
print(data_gl)
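
For comparison, the same kind of multi-series data can be reshaped into the long layout that PandasDataset.from_long_dataframe expects: one row per (timestamp, item_id) pair. The following is a minimal sketch, not the lag-llama authors' code; the toy values, the column names "timestamp", "item_id" and "target", and the variable names are assumptions made for illustration. Both constructors return a PandasDataset, so the two routes should yield the same kind of object.

```python
import numpy as np
import pandas as pd

# Toy wide frame (assumed data): 10 hourly samples for 4 series, one column each
index = pd.date_range("2020-01-01", periods=10, freq="h")
wide = pd.DataFrame(
    np.arange(40).reshape(10, 4),
    index=index,
    columns=[f"cell{i}" for i in range(1, 5)],
)

# Wide -> long: one row per (timestamp, item_id), with the value in "target"
long_df = (
    wide.rename_axis("timestamp")
    .reset_index()
    .melt(id_vars="timestamp", var_name="item_id", value_name="target")
    .dropna(subset=["target"])
    .set_index("timestamp")
)

try:
    from gluonts.dataset.pandas import PandasDataset

    # Each item_id group becomes one time series in the dataset
    ds = PandasDataset.from_long_dataframe(long_df, target="target", item_id="item_id")
except ImportError:
    ds = None  # GluonTS not installed; long_df still shows the expected layout
```

The timestamp is kept as the index here; from_long_dataframe also accepts a timestamp argument naming a column if the time information is stored in a regular column instead.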

kashif commented 3 months ago

@CXL-edu should be fixed now... can you try to pull and retrain?

CXL-edu commented 3 months ago

When running the Colab notebook, I get the following error: TypeError: model must be a LightningModule or torch._dynamo.OptimizedModule, got LagLlamaLightningModule

If you could provide a refined and complete example on Colab, I would greatly appreciate it.

kashif commented 3 months ago

@CXL-edu check here: https://colab.research.google.com/drive/1uvTmh-pe1zO5TeaaRVDdoEWJ5dFDI-pA?usp=sharing

CXL-edu commented 3 months ago

Thank you for the code. I am having trouble applying it to my data. The code loads data with get_dataset (from gluonts.dataset.repository.datasets import get_dataset), for example dataset = get_dataset("m4_weekly"). How can I convert data in a pd.DataFrame into the same format? I tried ds = PandasDataset.from_long_dataframe(df, target="target", item_id="item_id"), but it didn't work properly.

kashif commented 3 months ago

@CXL-edu the GluonTS data format is quite simple: it is essentially a list of dicts, where each dict needs to have the "target" and "start" keys; other covariates are optional. So if you make a list of these dicts, you can initialize a GluonTS dataset via their ListDataset class...

you can also look at any of the datasets returned by get_dataset to get an idea of the structure of the data

This format is nice (at least for regular time series) as it avoids the unnecessary copy of datetimes and allows for incorporating real-valued/categorical covariates that can be either static or dynamic
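
The list-of-dicts layout described above can be sketched as follows. This is a rough illustration under stated assumptions, not code from the lag-llama repository: frame_to_entries is a hypothetical helper, the toy data is made up, and the optional "item_id" field is added only to keep track of which column each series came from.

```python
import numpy as np
import pandas as pd


def frame_to_entries(df: pd.DataFrame) -> list[dict]:
    """Hypothetical helper: turn a wide DataFrame (DatetimeIndex, one column
    per series) into one dict per series with "start" and "target" keys."""
    entries = []
    for name in df.columns:
        series = df[name].dropna()
        entries.append(
            {
                "start": series.index[0],  # timestamp of the first observation
                "target": series.to_numpy(dtype=np.float32),  # values only
                "item_id": str(name),  # optional identifier
            }
        )
    return entries


# Toy data (assumed): 10 hourly samples for 4 series
index = pd.date_range("2020-01-01", periods=10, freq="h")
wide = pd.DataFrame(
    np.arange(40).reshape(10, 4),
    index=index,
    columns=[f"cell{i}" for i in range(1, 5)],
)
entries = frame_to_entries(wide)

try:
    from gluonts.dataset.common import ListDataset

    # The frequency is given once for the whole dataset, not per timestamp
    dataset = ListDataset(entries, freq="h")
except ImportError:
    dataset = None  # GluonTS not installed; `entries` still shows the structure
```

Only the start timestamp and the frequency are stored, which is why this layout avoids copying a full datetime column for every series.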

smrnvdhy commented 3 months ago

Edit: the error I got is unrelated, but I think dynamic/static features are not yet supported: the create_transformation function only grabs the target and time features. https://github.com/time-series-foundation-models/lag-llama/blob/75cc6a6c23a32f864065e3dd31f2f506ccc90998/lag_llama/gluon/estimator.py#L239-L266

@kashif I'd like to follow up on this. I'm exploring finetuning with dynamic/static features using the Colab notebook. I started with the toy data from the GluonTS tutorial here: https://ts.gluon.ai/stable/tutorials/data_manipulation/pandasdataframes.html#Include-static-and-dynamic-features

I ran into a dtype error while checking zero-shot, so I didn't get a chance to try finetuning yet:

ashok-arjun commented 3 months ago

@smrnvdhy Yes, the current model doesn't support those features.

ashok-arjun commented 2 months ago

Hi @CXL-edu, is this resolved?

ashok-arjun commented 2 months ago

Closing this issue. Feel free to open it again if required :)

Training lag-llama with GluonTS dataset reporting error #22