In addition, my dataset contains multiple individual time series, which I prepared as shown below, following use case 2 in the GluonTS tutorial. Suppose the issue above is resolved: lag-llama builds its input data with `PandasDataset.from_long_dataframe(df, target="target", item_id="item_id")`. Will a dataset built as in the example, `PandasDataset(data_gl, target="target")`, still be incompatible with that data structure?

    import numpy as np
    import pandas as pd
    from gluonts.dataset.pandas import PandasDataset


    def read_data4gl(data_name, time_freq) -> dict[str, pd.DataFrame]:
        # Toy stand-in for my data: 10 samples from each of 4 battery cells.
        data = np.random.randint(1, 10, (10, 4))
        data = pd.DataFrame(data, columns=['cell' + str(i) for i in range(1, 5)])
        data['index'] = pd.date_range(start="2020-01-01", periods=data.shape[0], freq=time_freq)
        data = data.set_index('index').rename_axis(None)

        # Build one single-column DataFrame per series, keyed by column name.
        data_gl = {}
        for col in data.columns:
            data_temp = data[col].dropna().to_frame(name='target')
            # data_temp['item_id'] = col
            data_gl[col] = data_temp
        return data_gl


    data_gl = read_data4gl('data_name', '1h')
    data_gl = PandasDataset(data_gl, target="target")
    print(data_gl)

@CXL-edu should be fixed now... can you try to pull and retrain?
When running on Colab, I get the following error:
TypeError: `model` must be a `LightningModule` or `torch._dynamo.OptimizedModule`, got `LagLlamaLightningModule`
If you could provide a refined and complete example on Colab, I would greatly appreciate it.
Thank you for the code. I am having trouble applying it to my data. The code gets its data through `get_dataset` (`from gluonts.dataset.repository.datasets import get_dataset`), for example `dataset = get_dataset("m4_weekly")`. How can I convert data in `pd.DataFrame` format to the target format? I tried `ds = PandasDataset.from_long_dataframe(df, target="target", item_id="item_id")`, but it didn't work properly.
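For reference, a minimal sketch of one way to get from a wide `pd.DataFrame` (one column per series, as in the battery example earlier in the thread) to the long layout that `PandasDataset.from_long_dataframe` expects. The toy data, the `timestamp` column name, and the `to_pandas_dataset` helper are assumptions made for illustration; only the `from_long_dataframe` call itself is taken from the thread.

```python
import numpy as np
import pandas as pd

# Toy wide frame (assumed data): 10 hourly samples from 4 cells.
rng = np.random.default_rng(0)
wide = pd.DataFrame(
    rng.integers(1, 10, size=(10, 4)).astype("float32"),
    columns=[f"cell{i}" for i in range(1, 5)],
    index=pd.date_range("2020-01-01", periods=10, freq="h"),
)

# Reshape to long format: one row per (timestamp, series) pair, with the
# series name in "item_id" and the observed value in "target".
long_df = (
    wide.rename_axis("timestamp")
    .reset_index()
    .melt(id_vars="timestamp", var_name="item_id", value_name="target")
    .dropna(subset=["target"])
    .set_index("timestamp")
)


def to_pandas_dataset(df):
    """Hypothetical helper: wrap the long frame in a GluonTS PandasDataset."""
    # Imported lazily so the reshaping above can be checked without GluonTS.
    from gluonts.dataset.pandas import PandasDataset

    return PandasDataset.from_long_dataframe(df, target="target", item_id="item_id")
```

Calling `to_pandas_dataset(long_df)` should then yield one entry per `item_id`. Printing `long_df.head()` and `long_df.dtypes` before that call is a quick way to check that the index is a `DatetimeIndex` and that `target` is numeric.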
@CXL-edu the GluonTS data format is quite simple: it is essentially a list of dicts, where each dict needs to have the "target" and "start" keys, and the other covariates are optional. So if you make a list of these dicts, you can initialize a GluonTS dataset via their `ListDataset` class...
You can also look at any of the datasets from `get_dataset` to get an idea of the structure of the data.
This format is nice (at least for regular time series) as it avoids the unnecessary copy of datetimes and allows for incorporating real-valued/categorical covariates that can be either static or dynamic.
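A rough sketch of that list-of-dicts layout, assuming each series is held in a DataFrame with a `DatetimeIndex` and a `target` column (as in the dict built earlier in the thread) and is regularly sampled with no gaps, since only the start timestamp is stored. The helper names `frames_to_entries` and `to_list_dataset` and the toy data are hypothetical; `ListDataset` is the GluonTS class mentioned above.

```python
import numpy as np
import pandas as pd


def frames_to_entries(frames):
    """Turn {item_id: DataFrame with a 'target' column} into GluonTS-style dicts."""
    entries = []
    for item_id, frame in frames.items():
        entries.append(
            {
                "start": frame.index[0],  # timestamp of the first observation
                "target": frame["target"].to_numpy(dtype="float32"),
                "item_id": item_id,  # optional, but handy for identifying series
            }
        )
    return entries


def to_list_dataset(entries, freq):
    """Hypothetical helper: wrap the entries in a GluonTS ListDataset."""
    # Imported lazily so the entries can be inspected without GluonTS installed.
    from gluonts.dataset.common import ListDataset

    return ListDataset(entries, freq=freq)


# Toy data (assumed): 4 series of 10 hourly values each.
index = pd.date_range("2020-01-01", periods=10, freq="h")
frames = {
    f"cell{i}": pd.DataFrame({"target": np.arange(10, dtype="float32") + i}, index=index)
    for i in range(1, 5)
}
entries = frames_to_entries(frames)
```

With GluonTS installed, `to_list_dataset(entries, "1h")` would build the dataset. Series with missing timestamps would need to be reindexed to a regular grid first, because the entry keeps no per-row datetimes.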
Edit: the error I got is unrelated, but I think dynamic/static features are not yet supported: the `create_transformation` function only grabs the target and time features. https://github.com/time-series-foundation-models/lag-llama/blob/75cc6a6c23a32f864065e3dd31f2f506ccc90998/lag_llama/gluon/estimator.py#L239-L266
@kashif I'd like to follow up on this. I'm exploring fine-tuning with dynamic/static features using the Colab notebook. I started with the toy data from the GluonTS tutorial here: https://ts.gluon.ai/stable/tutorials/data_manipulation/pandasdataframes.html#Include-static-and-dynamic-features
I ran into a dtype error while checking zero-shot, so I haven't had a chance to try fine-tuning yet:
@smrnvdhy Yes, the current model doesn't support those features.
Hi @CXL-edu, is this resolved?
Closing this issue. Feel free to open it again if required :)
The complete code on Colab:
Using data from Case 1 in GluonTS examples:
https://ts.gluon.ai/stable/tutorials/data_manipulation/pandasdataframes.html#Use-case-1---Loading-data-from-a-long-dataframe
Error: ![lag-llama](https://github.com/time-series-foundation-models/lag-llama/assets/60612507/6db4ca49-69bc-48fd-858b-cfbfaf4d06e3)