pzelasko opened this issue 3 years ago
... unless I'm missing something - to me it seems like this sort of functionality should be a standard pytorch component?
I suppose so... I don't know whether that transfer is really going to be a limiting factor, but if it doesn't make the code significantly harder to understand I suppose it's a nice thing to have.
If I remember correctly, PyTorch's DataLoader does not support returning CUDA tensors from its worker processes, i.e. when num_workers is greater than zero.
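For reference, the usual pattern around that limitation is to let the workers produce CPU tensors and only move them to the GPU in the training loop. A minimal sketch, using a toy dataset and the standard `pin_memory` / `non_blocking` combination (not Lhotse-specific code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy CPU-tensor dataset standing in for a real (e.g. Lhotse) dataset.
dataset = TensorDataset(torch.randn(1000, 80), torch.randint(0, 10, (1000,)))

# Workers return CPU tensors; pin_memory=True makes the later
# CPU -> GPU copy eligible for an asynchronous (non_blocking) transfer.
loader = DataLoader(dataset, batch_size=32, num_workers=2, pin_memory=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for feats, labels in loader:
    feats = feats.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    # ... forward/backward pass goes here ...
```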
@pzelasko Is there a way to store waveforms in HDF5 format, so that we can load data fast and do on-the-fly augmentation?
Yeah, it is definitely doable. I don’t know how much faster it would be, but I guess it won’t require opening/closing new files all the time.
As a “quick” workaround, you can try increasing the number of dataloader workers.
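To make the HDF5 idea concrete, here is a minimal sketch with h5py, using a made-up file layout (one dataset per recording id) and a hypothetical `Hdf5WaveformReader` class; it is not Lhotse's actual storage API:

```python
import h5py
import numpy as np
import torch

# Write: one HDF5 dataset per recording id (layout is just an example).
with h5py.File("waveforms.h5", "w") as f:
    for rec_id, audio in [("rec1", np.random.randn(16000).astype(np.float32))]:
        f.create_dataset(rec_id, data=audio)


class Hdf5WaveformReader:
    """Open the HDF5 file lazily (once per dataloader worker) and reuse the
    handle, instead of opening/closing a file for every item."""

    def __init__(self, path):
        self.path = path
        self._file = None

    def read(self, rec_id):
        if self._file is None:
            self._file = h5py.File(self.path, "r")
        wav = torch.from_numpy(self._file[rec_id][:])
        # ... on-the-fly augmentation (speed/volume perturbation, etc.) ...
        return wav


reader = Hdf5WaveformReader("waveforms.h5")
print(reader.read("rec1").shape)
```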
My understanding is that with PyTorch's DataLoader, we do the batch preparation asynchronously (in the dataloader workers), and the following synchronously, in the training loop:

- `__next__`
- `tensor.to(device)`

I think we can add a class that wraps the DataLoader (`FastDataLoader`?) and performs both transfers in the background to further speed up the training. There seems to be a good example here: https://github.com/NVIDIA/apex/blob/master/examples/imagenet/main_amp.py#L265

Another good one (but doesn't address the CPU -> GPU transfer) is in torchaudio: https://github.com/pytorch/audio/blob/master/torchaudio/datasets/utils.py#L276
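For reference, a rough sketch of the kind of wrapper being proposed, following the same CUDA-stream trick as the apex `data_prefetcher` example linked above; the class name `GpuPrefetcher` and the assumption that batches are single tensors or dicts of tensors are illustrative, not an existing Lhotse/PyTorch API:

```python
import torch


class GpuPrefetcher:
    """Wrap a DataLoader and overlap the CPU -> GPU copy of the next batch
    with computation on the current one, using a side CUDA stream."""

    def __init__(self, loader, device):
        self.loader = loader
        self.device = torch.device(device)
        self.stream = torch.cuda.Stream(device=self.device)

    def _to_device(self, batch):
        # Handle a dict of tensors (as Lhotse datasets return) or a single tensor.
        if isinstance(batch, dict):
            return {k: v.to(self.device, non_blocking=True) if torch.is_tensor(v) else v
                    for k, v in batch.items()}
        return batch.to(self.device, non_blocking=True)

    def _preload(self, it):
        try:
            batch = next(it)
        except StopIteration:
            return None
        # Issue the host-to-device copy on the side stream so it runs in the background.
        with torch.cuda.stream(self.stream):
            return self._to_device(batch)

    def __iter__(self):
        it = iter(self.loader)
        next_batch = self._preload(it)
        while next_batch is not None:
            # The default stream must not touch the batch before its copy finishes.
            torch.cuda.current_stream(self.device).wait_stream(self.stream)
            batch = next_batch
            # Tell the caching allocator the default stream now uses these tensors,
            # so their memory is not reused by the side stream too early.
            tensors = batch.values() if isinstance(batch, dict) else [batch]
            for t in tensors:
                if torch.is_tensor(t):
                    t.record_stream(torch.cuda.current_stream(self.device))
            # Start copying the following batch while the caller trains on this one.
            next_batch = self._preload(it)
            yield batch
```

Usage would be roughly `for batch in GpuPrefetcher(dloader, "cuda"): ...`; note that the overlap only materializes when the DataLoader is created with `pin_memory=True`, otherwise the non-blocking copies fall back to synchronous ones.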