elgalu closed this issue 2 years ago
We recently pushed our own custom loaders into PyTorch's DataPipes package, see https://pytorch.org/data/main/generated/torchdata.datapipes.iter.AISFileLoader.html#torchdata.datapipes.iter.AISFileLoader and https://pytorch.org/data/main/generated/torchdata.datapipes.iter.AISFileLister.html#torchdata.datapipes.iter.AISFileLister.
@gaikwadabhishek @alex-aizman I think we should update the README.md section to mention that.
Hey @elgalu, these are different things. We are still developing the PyTorch connectors (plugins) for AIStore that @VirrageS mentioned above. WebDataset has its own advantages.
If you want to load data from remote cloud backends, you can try aisio.py.
It works very similarly to s3io.py. The advantages of aisio over s3io
-
Example of AIStore Iterable Datapipe: https://aiatscale.org/blog/2022/07/12/aisio-pytorch
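For anyone landing here, a minimal sketch of how the AIS datapipes chain together. Assumptions on my side, not from this thread: a torchdata release that still ships `torchdata.datapipes`, the `aistore` Python SDK installed, and an AIS endpoint at `http://localhost:8080`. The bucket name and the `ais_prefixes` / `build_ais_pipeline` helpers are hypothetical and only there for illustration.

```python
from typing import Iterable, List


def ais_prefixes(bucket: str, folders: Iterable[str], provider: str = "ais") -> List[str]:
    """Hypothetical helper: build '<provider>://<bucket>/<folder>/' prefixes."""
    prefixes = []
    for folder in folders:
        folder = folder.strip("/")
        if folder:
            prefixes.append(f"{provider}://{bucket}/{folder}/")
        else:
            # An empty folder means "the whole bucket".
            prefixes.append(f"{provider}://{bucket}/")
    return prefixes


def build_ais_pipeline(endpoint: str, prefixes: List[str]):
    """Hypothetical helper: list objects under each prefix, then load them."""
    # Imported lazily so ais_prefixes() works without torchdata installed.
    from torchdata.datapipes.iter import AISFileLister, AISFileLoader, IterableWrapper

    source = IterableWrapper(prefixes)
    # AISFileLister turns each prefix into the URLs of the objects under it.
    urls = AISFileLister(url=endpoint, source_datapipe=source)
    # AISFileLoader yields (url, byte stream) pairs for those objects.
    return AISFileLoader(url=endpoint, source_datapipe=urls)


# Usage, assuming a reachable cluster and an existing bucket:
#   prefixes = ais_prefixes("my-bucket", ["train", "val"])
#   for url, stream in build_ais_pipeline("http://localhost:8080", prefixes):
#       data = stream.read()
```

The same chain should also be available through the functional forms `list_files_by_ais` and `load_files_by_ais` on an `IterableWrapper`, if you prefer that style.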
You may want to update this section: PyTorch >= 1.12 can load data directly from an S3-compatible store using https://github.com/aws/aws-sdk-cpp
https://github.com/NVIDIA/aistore/blob/cb8798307c906d730b23e3437a63e65a9b5da570/README.md?plain=1#L60-L62