Open nimaous opened 8 months ago
It seems that TorchData is no longer under active development. Not sure if we have plans to integrate with it on our side.
If you can load the data stored in the Parquet files into a pandas DataFrame, you can create a DataLoader using torch_frame.data.DataLoader by directly supplying the DataFrame as the dataset argument. However, a pandas DataFrame can be memory-intensive, so you might run into issues with large datasets.
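For reference, here is a minimal sketch of the DataFrame route for Parquet files on S3. It is not an official recipe. It assumes that pandas can read `s3://` paths in your environment (this needs `s3fs` and a Parquet engine such as `pyarrow`), and, as far as I understand the API, that the DataFrame is first wrapped in `torch_frame.data.Dataset` with a `col_to_stype` mapping before it is handed to the DataLoader. The helper names `s3_uri` and `build_loader`, and the bucket and column names in the usage note, are hypothetical.

```python
from typing import Any, Dict, Optional


def s3_uri(bucket: str, key: str) -> str:
    """Build an s3:// URI from a bucket name and an object key (hypothetical helper)."""
    return f"s3://{bucket.strip('/')}/{key.lstrip('/')}"


def build_loader(
    parquet_path: str,
    col_to_stype: Dict[str, Any],
    target_col: Optional[str] = None,
    batch_size: int = 256,
):
    """Read one Parquet file into pandas and wrap it in a torch_frame DataLoader.

    Assumption: the whole file fits in memory as a pandas DataFrame.
    """
    # Imported lazily so the helpers above can be used without these packages.
    import pandas as pd
    from torch_frame.data import DataLoader, Dataset

    # pandas resolves s3:// paths through fsspec/s3fs when they are installed;
    # AWS credentials are picked up from the environment.
    df = pd.read_parquet(parquet_path)

    # col_to_stype maps each column name to a torch_frame semantic type.
    dataset = Dataset(df, col_to_stype=col_to_stype, target_col=target_col)
    dataset.materialize()  # builds the TensorFrame from the DataFrame

    return DataLoader(dataset.tensor_frame, batch_size=batch_size, shuffle=True)
```

A call could look like `build_loader(s3_uri("my-bucket", "data/train.parquet"), {"age": torch_frame.numerical, "label": torch_frame.categorical}, target_col="label")`. This still materializes the full table in memory, so the caveat about large datasets applies unchanged.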
We do welcome community contributions.
Hi all,
In my project, I use TorchData to read Parquet files from AWS S3 buckets. Currently, it seems that pytorch-frame cannot be integrated with TorchData. I was wondering whether you have any plans to make this possible, or whether you have a workaround for reading Parquet files from S3 buckets into a pytorch-frame dataset?
Thanks,