NVIDIA / aistore

AIStore: scalable storage for AI applications
https://aiatscale.org
MIT License

Questions on the ETL tutorial example #144

Closed Hastyrush closed 11 months ago

Hastyrush commented 11 months ago

Hi,

I was following the 3-part tutorial posted at https://aiatscale.org/blog/2023/05/05/aisio-transforms-with-webdataset-pt-1

My question is: was the example designed to work with Kubernetes only? I tried running a local single-node cluster using Minikube, as documented in the deployment documentation (https://github.com/NVIDIA/aistore/blob/master/deploy/dev/k8s/README.md). With that setup, the ETL that was supposed to run on the storage cluster's compute is instead happening on the same local machine that is also doing the data fetching.

The result is that on calling batch = next(iter(dataloader)), as written in https://aiatscale.org/blog/2023/06/09/aisio-transforms-with-webdataset-pt-3, the pipeline runs extremely slowly when fetching batches, possibly due to CPU contention between the ETL processing and the data-fetching pipeline?

This does not happen when the ETL is removed from the WebDataset creation.
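Not code from the tutorial, just a stdlib stand-in for the situation described above: when the transform runs on the fetching machine, every batch pays the transform cost in the same process that is pulling data; with a cluster-side ETL, samples would arrive already transformed and the loader loop would only pay for fetching. All names here (`fetch_shard`, `transform`, the loaders) are illustrative.

```python
from typing import Iterator, List


def fetch_shard(n: int) -> List[bytes]:
    # Stand-in for pulling raw samples out of a WebDataset shard.
    return [bytes([i % 256]) * 64 for i in range(n)]


def transform(sample: bytes) -> bytes:
    # Stand-in for the ETL step (here: a trivial byte flip).
    return bytes(255 - b for b in sample)


def local_etl_loader(n: int) -> Iterator[bytes]:
    # ETL on the fetching machine: the same process transforms AND fetches,
    # so the two compete for local CPU.
    for sample in fetch_shard(n):
        yield transform(sample)


def remote_etl_loader(pre_transformed: List[bytes]) -> Iterator[bytes]:
    # Cluster-side ETL: samples arrive already transformed; the client
    # loop does no transform work at all.
    yield from pre_transformed


# Both paths yield identical samples; only WHERE the work happens differs.
cluster_side = [transform(s) for s in fetch_shard(4)]  # done "remotely"
batch = next(iter(remote_etl_loader(cluster_side)))
```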

Thanks!

aaronnw commented 11 months ago

Hi!

Yes, the ETL functionality requires a Kubernetes deployment. This is because each transformer process is deployed as a separate pod within the k8s cluster.

ETL really only gives a performance benefit when the cluster is remote. The improvement comes from the compute being local to the data: the transform uses the typically idle storage-cluster CPU instead of consuming any of your local resources. So it will work with a local minikube, but in that case it does not really optimize anything.

When fetching a batch, because this ETL transforms an entire shard, you might see some slowness. How much will depend on a number of factors, though.
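For reference, the pod-based deployment described above looks roughly like this from the Python SDK side (a sketch, not the tutorial's code; the transform, the endpoint, and the ETL name are placeholders, and the exact SDK signatures may differ between versions). The key point is that the transform function is shipped to the cluster and runs there, next to the data, rather than on the fetching machine:

```python
def uppercase(data: bytes) -> bytes:
    # Placeholder transform: this code would execute inside the ETL pod,
    # on the storage cluster's compute.
    return data.upper()


def deploy_etl(endpoint: str, etl_name: str = "my-etl"):
    # Assumed SDK usage: actually running this requires a k8s-backed
    # AIStore cluster reachable at `endpoint`.
    from aistore.sdk import Client  # imported lazily; needs the aistore package

    client = Client(endpoint)
    etl = client.etl(etl_name)
    etl.init_code(transform=uppercase)  # deploys the transform as its own pod
    return etl
```

Objects read through the ETL then come back already transformed, which is why there is no CPU contention on the client when the cluster is remote.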

Hastyrush commented 11 months ago

Hi Aaron,

Thanks for the clarification! It helped a lot in understanding why Kubernetes and a separate storage compute cluster are required. Will be closing this issue!