NVIDIA / aistore

AIStore: scalable storage for AI applications
https://aistore.nvidia.com
MIT License

Retrieval Cost for AIS with GCP / AWS #112

Closed HHenryD closed 1 year ago

HHenryD commented 1 year ago

I am interested in using PT+AIS to iterate over Google Cloud buckets using the Iterable DataPipes API in PT 1.12, and was wondering if there were retrieval costs in using AIS. Aside from paying for the storage, how cost-friendly would it be to use AIS+Google Cloud (or other services like Webdataset) for training over long durations?

gaikwadabhishek commented 1 year ago

Assuming you will use AISFileLister and AISFileLoader.

With AIS in front of your Cloud bucket(s), you save both cost and time, because AIS issues a cold GET request to the cloud only once per object; later reads of that object are served from the AIS cluster for as long as it stays cached. Multiple alternative data pre-loading mechanisms are also supported. Users can store data according to per-bucket configurable policies (erasure coding, LRU, etc.).
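To make the once-per-object point concrete, here is a minimal toy model of a read-through cache. It is not AIS code: `ReadThroughCache`, `fake_cloud_get`, the object names, and the epoch count are all hypothetical, and it assumes the cache is large enough that nothing is evicted during training.

```python
# Toy model (not AIS code): counts how many GETs reach the cloud bucket
# when every object is read once per epoch, with and without a cache.

class ReadThroughCache:
    """Hypothetical read-through cache; assumes no eviction."""

    def __init__(self, backend_get):
        self._backend_get = backend_get  # function that fetches from the cloud
        self._store = {}                 # objects already fetched
        self.cold_gets = 0               # requests that reached the cloud

    def get(self, name):
        if name not in self._store:
            # Cold GET: the first read of this object goes to the cloud.
            self._store[name] = self._backend_get(name)
            self.cold_gets += 1
        # Warm read: served locally, no cloud request.
        return self._store[name]


def fake_cloud_get(name):
    """Stand-in for a cloud GET; returns placeholder bytes."""
    return b"payload:" + name.encode()


objects = ["shard-0.tar", "shard-1.tar", "shard-2.tar", "shard-3.tar"]
epochs = 3

# Reading straight from the bucket: one cloud GET per object per epoch.
direct_gets = 0
for _ in range(epochs):
    for name in objects:
        fake_cloud_get(name)
        direct_gets += 1

# Reading through the cache: one cloud GET per object in total.
cache = ReadThroughCache(fake_cloud_get)
for _ in range(epochs):
    for name in objects:
        cache.get(name)

print("direct cloud GETs:", direct_gets)
print("cached cloud GETs:", cache.cold_gets)
```

Under these assumptions the number of billable cloud requests scales with the number of objects rather than with objects times epochs. If the per-bucket LRU policy evicts objects between epochs, the evicted ones would need a fresh cold GET.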
