time-series-foundation-models / lag-llama

Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
Apache License 2.0

Custom dataset #4

Closed afshinebtia closed 2 months ago

afshinebtia commented 4 months ago

Hello,

Thanks for sharing your model. Could you please provide an example of how to use this outstanding model on a CSV dataset containing time-series data?

kashif commented 4 months ago

We will add a csv example but for now you can follow the gluonts tutorial for reading such a dataset: https://ts.gluon.ai/stable/tutorials/data_manipulation/pandasdataframes.html
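
In the meantime, here is a minimal sketch of the approach that tutorial describes, assuming a long-format CSV with a timestamp column, a `target` column, and an `item_id` column. The inline CSV text and its column names are hypothetical stand-ins for a real file, and the GluonTS step is skipped if the library is not installed.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice, pass a file path to pd.read_csv instead.
csv_text = """timestamp,target,item_id
2021-01-01 00:00:00,1.0,A
2021-01-01 01:00:00,2.0,A
2021-01-01 02:00:00,3.0,A
2021-01-01 00:00:00,10.0,B
2021-01-01 01:00:00,20.0,B
2021-01-01 02:00:00,30.0,B
"""

# Use the first column as a DatetimeIndex, as the thread's own snippet does.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

try:
    from gluonts.dataset.pandas import PandasDataset

    # One entry per distinct item_id; the frequency is inferred from the index.
    dataset = PandasDataset.from_long_dataframe(
        df, target="target", item_id="item_id"
    )
except ImportError:
    # GluonTS is not available; only the pandas part of the sketch ran.
    dataset = None
```

If the file holds a single series and has no `item_id` column, adding a constant one (for example `df["item_id"] = "A"`) before the `from_long_dataframe` call is one way to keep the same code path.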

turkalpmd commented 4 months ago

Example full code here

ashok-arjun commented 4 months ago

Hi! Thank you for the issue, and thank you @turkalpmd for the efforts! We'll be adding a tutorial soon with options to load datasets in several formats (such as CSV). We appreciate your patience; thanks!

ashok-arjun commented 4 months ago

Hi @afshinebtia, we uploaded a new Colab demo with a tutorial to use a CSV dataset.

Please check it and let us know if your dataset fits into one of the categories explained there.

afshinebtia commented 4 months ago

Hi @ashok-arjun.

Your efforts and thoughtful considerations are deeply appreciated.

saikrishna-1996 commented 3 months ago

Hi @ashok-arjun , The Colab demo you shared converts it to PandasDataset but I couldn't use it in the fine_tuning Colab demo. Are there instructions somewhere to convert this PandasDataset to GluonTS format (or, directly from csv to GluonTS format) being used in the finetuning demo? Thanks!

ashok-arjun commented 3 months ago

Hi @saikrishna-1996, I expected the PandasDataset to work but didn't test with it. Apologies for the delay in releasing the full finetuning demo (I was on vacation and just got back, planning to release within 2 weeks - is that OK?).

ashok-arjun commented 2 months ago

Hi @saikrishna-1996 Can you share the error you get?

saikrishna-1996 commented 2 months ago

Hi @ashok-arjun, I don't recall exactly, but I think the issue was that my data had no additional column for item_id (even though there is only one time series). Without it, dataset = PandasDataset.from_long_dataframe(df, target="target", item_id="item_id") wasn't working, even when I removed the item_id="item_id" argument. This is what I changed to make it work with my custom dataset:

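import pandas as pd

# Reference long-format CSV, used here only for its column layout.
url = (
    "https://gist.githubusercontent.com/rsnirwan/a8b424085c9f44ef2598da74ce43e7a3"
    "/raw/b6fdef21fe1f654787fa0493846c546b7f9c4df2/ts_long.csv"
)
df_ref = pd.read_csv(url, index_col=0, parse_dates=True)

# Empty frame that reuses the reference column names.
df_new = pd.DataFrame(columns=df_ref.columns)

# Load the custom series and copy its values into the target column.
df = pd.read_parquet("myfile.parquet")
df_new['target'] = df['data']
df = df_new

# Even a single series gets an item_id column.
df['item_id'] = 'A'

kashif commented 2 months ago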

@saikrishna-1996 I believe that if you pull the latest changes, you should no longer see this issue.

saikrishna-1996 commented 2 months ago

I see, thank you!

ashok-arjun commented 2 months ago

Closing this issue as it is resolved. Feel free to open it again if required :)