czmrand opened this issue 2 years ago
You might want to take a look at the TorchX project (https://pytorch.org/torchx/latest/examples_apps/index.html). It may provide the feature you need.
Transferred this issue, as we are definitely not going to implement it as a classic DataLoader feature, but are looking to have it in DLv2.
🚀 The feature, motivation and pitch
Occasionally one might find that their GPU is idle due to a bottleneck in the input data pre-processing pipeline (which might include data loading, filtering, manipulation, augmentation, etc.). In these cases one could improve resource utilization by offloading some of the pre-processing to auxiliary CPU devices. I have demonstrated how to do this using gRPC in the following blog post: https://towardsdatascience.com/overcoming-ml-data-preprocessing-bottlenecks-with-grpc-ca30fdc01bee
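To illustrate the general pattern (not a proposed PyTorch API), here is a minimal sketch of offloading per-sample pre-processing to worker processes so the main, GPU-feeding process only consumes ready results. The `preprocess` and `offload_preprocess` names are illustrative placeholders; in the scenario described above the workers would live on auxiliary CPU hosts reached over gRPC rather than local processes.

```python
from multiprocessing import Pool

def preprocess(sample):
    # Stand-in for an expensive per-sample transform
    # (decoding, filtering, augmentation, ...).
    return sample * 2

def offload_preprocess(batch, workers=2):
    # Fan the batch out to a pool of worker processes; the caller
    # (the process feeding the GPU) only collects finished samples.
    with Pool(processes=workers) as pool:
        return pool.map(preprocess, batch)

if __name__ == "__main__":
    print(offload_preprocess([1, 2, 3]))  # -> [2, 4, 6]
```

The feature request below asks for this offloading step to be a first-class, remote-capable part of the data loading pipeline rather than something each user wires up by hand.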
TensorFlow has built-in (experimental) support for this feature (https://www.tensorflow.org/api_docs/python/tf/data/experimental/service) that enables offloading in a few simple steps.
The request here is to include PyTorch APIs for offloading data pre-processing in a manner that is simple and straightforward for the user, similar to the TensorFlow APIs (though preferably without any limitations on the pre-processing workload).
Alternatives
No response
Additional context
No response
cc @SsnL @VitalyFedyunin @ejguan @NivekT