lhotse-speech / lhotse

Tools for handling speech data in machine learning projects.
https://lhotse.readthedocs.io/en/latest/
Apache License 2.0

Utility for prefetching data on GPUs #243

Open pzelasko opened 3 years ago

pzelasko commented 3 years ago

My understanding is that with PyTorch's DataLoader, we do the following

asynchronously:

- reading and collating the batches in the DataLoader worker processes

synchronously:

- transferring each collated batch from CPU to GPU

I think we can add a class that wraps the DataLoader (FastDataLoader?) and performs the CPU -> GPU transfer in the background to further speed up the training. There seems to be a good example here: https://github.com/NVIDIA/apex/blob/master/examples/imagenet/main_amp.py#L265

Another good example (though it doesn't address the CPU -> GPU transfer) is in torchaudio: https://github.com/pytorch/audio/blob/master/torchaudio/datasets/utils.py#L276
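
For reference, a minimal sketch of what such a wrapper could look like, loosely following the apex prefetcher pattern linked above. The class name FastDataLoader, the CUDA-stream approach, and the assumption that the wrapped loader was built with pin_memory=True and yields dicts of CPU tensors are all illustrative, not existing lhotse API:

```python
import torch


class FastDataLoader:
    """Wraps a DataLoader and overlaps the CPU -> GPU copy of the next batch
    with computation on the current one, using a side CUDA stream.

    Illustrative assumptions: the wrapped loader was built with
    pin_memory=True and every batch is a dict of CPU tensors.
    """

    def __init__(self, dataloader, device):
        self.dataloader = dataloader
        self.device = torch.device(device)
        self.stream = torch.cuda.Stream(device=self.device)

    def __len__(self):
        return len(self.dataloader)

    def __iter__(self):
        prefetched = None
        for cpu_batch in self.dataloader:
            # Enqueue the copy of this batch on the side stream; it overlaps
            # with whatever the training loop is doing with the batch we
            # yielded previously.
            with torch.cuda.stream(self.stream):
                gpu_batch = {
                    k: v.to(self.device, non_blocking=True)
                    if torch.is_tensor(v) else v
                    for k, v in cpu_batch.items()
                }
            if prefetched is not None:
                yield prefetched
            # Make the default stream wait for the copy, and tell the caching
            # allocator that these tensors will be used on the default stream.
            torch.cuda.current_stream(self.device).wait_stream(self.stream)
            for v in gpu_batch.values():
                if torch.is_tensor(v):
                    v.record_stream(torch.cuda.current_stream(self.device))
            prefetched = gpu_batch
        if prefetched is not None:
            yield prefetched
```

Usage would be something like `for batch in FastDataLoader(train_loader, device="cuda:0"): ...`; without pinned memory the non_blocking copies fall back to synchronous ones, so the overlap would largely disappear.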

pzelasko commented 3 years ago

... unless I'm missing something - to me it seems like this sort of functionality should be a standard pytorch component?

danpovey commented 3 years ago

I suppose so... I don't know whether that transfer is really going to be a limiting factor, but if it doesn't make the code significantly harder to understand I suppose it's a nice thing to have.


csukuangfj commented 3 years ago

If I remember correctly, DataLoader in PyTorch does not support GPU when num_workers is greater than one.
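
For context, the usual PyTorch recommendation is to have the worker processes return CPU tensors only and perform the GPU transfer in the main process, with pin_memory=True enabling non-blocking copies. A minimal sketch of that standard pattern, with a placeholder dataset (DummySpeechDataset and its shapes are made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, Dataset


class DummySpeechDataset(Dataset):
    """Placeholder dataset that returns CPU tensors only (illustrative)."""

    def __len__(self):
        return 100

    def __getitem__(self, idx):
        # e.g. 1000 frames of 80-dim features per item
        return {"features": torch.randn(1000, 80), "num_frames": torch.tensor(1000)}


# The worker processes only produce CPU tensors; pin_memory=True puts the
# collated batches in page-locked memory so the copy below can be non-blocking.
loader = DataLoader(DummySpeechDataset(), batch_size=8, num_workers=4, pin_memory=True)

device = torch.device("cuda:0")
for batch in loader:
    # The CPU -> GPU transfer happens here, in the main process.
    batch = {k: v.to(device, non_blocking=True) for k, v in batch.items()}
    # ... forward / backward pass on `batch` ...
```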

fanlu commented 3 years ago

@pzelasko Is there a way to store waveforms in HDF5 format, so that we can load data faster and do on-the-fly augmentation?

pzelasko commented 3 years ago

Yeah, it is definitely doable. I don’t know how much faster it would be, but I guess it won’t require opening/closing new files all the time.

As a “quick” workaround you can try increasing the number of DataLoader workers.
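
A rough sketch of the HDF5 idea with plain h5py, independent of lhotse's own storage backends; the file name, utterance IDs, and random samples are placeholders:

```python
import h5py
import numpy as np

# Writing: store each utterance's samples under its ID in one file. The random
# arrays stand in for however the audio was actually loaded (torchaudio,
# soundfile, ...); variable-length float32 arrays are fine in HDF5.
waveforms = {
    "utt-0001": np.random.randn(16000).astype(np.float32),
    "utt-0002": np.random.randn(24000).astype(np.float32),
}
with h5py.File("waveforms.h5", "w") as f:
    for utt_id, samples in waveforms.items():
        f.create_dataset(utt_id, data=samples)

# Reading, e.g. inside a Dataset's __getitem__: a single open file serves many
# utterances, instead of opening and closing one audio file per example.
with h5py.File("waveforms.h5", "r") as f:
    samples = f["utt-0001"][:]  # np.ndarray, ready for on-the-fly augmentation
```

How much this helps depends mostly on the storage backend; it is orthogonal to the num_workers suggestion above, so the two can be combined.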