RaivoKoot / Video-Dataset-Loading-Pytorch

Generic PyTorch dataset implementation to load and augment VIDEOS for deep learning training loops.
BSD 2-Clause "Simplified" License
447 stars, 43 forks

Support iterable / webdataset #23

Open richardrl opened 6 months ago

richardrl commented 6 months ago

What would be required to make this work for very large video sets? Would it need some integration with WebDataset?

RaivoKoot commented 6 months ago

Depends how large. In general, your dataset can be as big as you want. I've used the code with a video dataset where the videos totalled 1.5 terabytes. However, those 1.5 terabytes did fit on my local disk. Once your dataset is too big to fit on a single machine, or if you want to do multi-node training, you would have to make some changes to the code to support this somehow.
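
As a rough illustration of one such change (not something this repository provides), the list of video folders could be split so that each node and each data-loading worker reads a disjoint subset. The sketch below is plain Python; the function name `shard_for_worker` and all of its parameters are hypothetical, and it assumes the dataset can be described as a flat list of video directories.

```python
from typing import List, Sequence


def shard_for_worker(
    video_dirs: Sequence[str],
    rank: int,
    world_size: int,
    worker_id: int = 0,
    num_workers: int = 1,
) -> List[str]:
    """Return the video directories one worker on one node should read.

    Hypothetical helper: rank/world_size identify the node (or process),
    worker_id/num_workers identify the data-loading worker within it.
    """
    if not (0 <= rank < world_size and 0 <= worker_id < num_workers):
        raise ValueError("rank or worker_id out of range")
    # Give every (node, worker) pair a unique global reader index.
    global_index = rank * num_workers + worker_id
    total_readers = world_size * num_workers
    # Each reader takes every total_readers-th item, starting at its index,
    # so the shards are disjoint and together cover the whole list.
    return list(video_dirs[global_index::total_readers])


if __name__ == "__main__":
    videos = [f"video_{i:04d}" for i in range(10)]
    for rank in range(2):
        for worker_id in range(2):
            print(rank, worker_id, shard_for_worker(videos, rank, 2, worker_id, 2))
```

In a PyTorch setup, a function like this would probably be called from the `__iter__` of a `torch.utils.data.IterableDataset`, with `worker_id` and `num_workers` taken from `torch.utils.data.get_worker_info()` and `rank` and `world_size` from `torch.distributed`. Note that the shards can differ in length by one item, which may need extra handling in distributed training.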