feldberlin / wavenet

An unconditioned Wavenet implementation with fast generation.

Distributed data sampler may be incorrect #16

Closed purzelrakete closed 3 years ago

purzelrakete commented 3 years ago

What

Have a look at when torch.utils.data.distributed.DistributedSampler is necessary.

How to reproduce

Write a test showing that the dataset is iterated exactly once per epoch across all devices, and establish when DistributedSampler is necessary.
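
A minimal sketch of such a check, assuming the sampler is queried directly for its per-rank indices (the world size and stand-in dataset here are illustrative, not from this repo):

```python
from torch.utils.data.distributed import DistributedSampler

# Stand-in dataset: DistributedSampler only needs len(), so a list works.
world_size, n = 4, 1000
dataset = list(range(n))

# Collect the indices each rank would draw in one epoch.
seen = []
for rank in range(world_size):
    sampler = DistributedSampler(dataset, num_replicas=world_size,
                                 rank=rank, shuffle=True)
    sampler.set_epoch(0)  # all ranks must agree on the epoch seed
    seen.extend(list(sampler))

# With n divisible by world_size there is no padding, so the ranks
# should partition the dataset: every index appears exactly once.
assert sorted(seen) == list(range(n))
print("each example seen exactly once across all ranks")
```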

Expected

Each worker should get a different random batch in each step, and each example should be seen exactly once per epoch across all workers.

Additional context

Just saw an implementation that uses DistributedSampler together with DistributedDataParallel training.
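
For reference, a sketch of that usual pairing, assuming a standard PyTorch training loop where the process group is already initialized (`train_ds` and the batch size are illustrative):

```python
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

# Assumes torch.distributed.init_process_group has already run, so the
# sampler can infer num_replicas and rank from the process group.
# Without a DistributedSampler, every DDP rank iterates the full
# dataset, so each example is seen world_size times per epoch.
sampler = DistributedSampler(train_ds, shuffle=True)  # train_ds: any map-style dataset
loader = DataLoader(train_ds, batch_size=32, sampler=sampler)

epochs = 10
for epoch in range(epochs):
    # Reseed the shuffle each epoch; otherwise every epoch
    # replays the same per-rank ordering.
    sampler.set_epoch(epoch)
    for batch in loader:
        ...  # forward / backward on the DDP-wrapped model
```

The set_epoch call matters because the sampler derives its shuffling seed from the epoch number; DDP itself does nothing to partition the data.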

purzelrakete commented 3 years ago

Handled in #26