Closed · purzelrakete closed this issue 3 years ago
Handled in #26
What
Have a look at when torch.utils.data.distributed.DistributedSampler is necessary.
How to reproduce
Test that the dataset is predictably iterated once per epoch, across all devices. Understand when DistributedSampler is necessary.
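A minimal sketch of such a test, not taken from the issue: because DistributedSampler accepts explicit num_replicas and rank arguments, the per-rank shards can be simulated in a single process without initializing a process group. The dataset size and world size below are illustrative.

```python
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(10))  # toy dataset, chosen for illustration
world_size = 4

for epoch in range(2):
    seen = []
    for rank in range(world_size):
        # Explicit num_replicas/rank means no torch.distributed init is needed.
        sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
        sampler.set_epoch(epoch)  # re-seeds the shuffle so each epoch differs
        seen.extend(list(sampler))
    # With drop_last=False (the default) the sampler pads so every rank gets the
    # same number of samples, so a few indices repeat; the union must still
    # cover the whole dataset exactly.
    assert set(seen) == set(range(len(dataset)))
    print(f"epoch {epoch}: all {len(dataset)} examples covered across {world_size} ranks")
```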
Expected
Each worker should get a different random batch in each step. Each example should be seen once per epoch across all workers.
Additional context
Just saw an implementation that uses DistributedSampler together with DistributedDataParallel training; a sketch of that pattern follows.
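For reference, a minimal sketch of the usual DistributedSampler + DistributedDataParallel wiring, assuming a launch via torchrun (which sets RANK/WORLD_SIZE/MASTER_ADDR). The model, dataset, and hyperparameters are placeholders, not the implementation referenced above. The key detail is calling sampler.set_epoch() each epoch; without it every epoch replays the same shuffle.

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dist.init_process_group("gloo")  # use "nccl" for GPU training

# Toy data and model, for illustration only.
dataset = TensorDataset(torch.randn(64, 8), torch.randn(64, 1))
sampler = DistributedSampler(dataset)  # infers rank/world size from the process group
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

model = DDP(torch.nn.Linear(8, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(3):
    sampler.set_epoch(epoch)  # reshuffle so ranks see different batches each epoch
    for x, y in loader:
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()  # DDP all-reduces gradients across ranks here
        opt.step()

dist.destroy_process_group()
```

With this wiring each rank iterates only its own shard, so together the ranks see every example once per epoch (modulo the sampler's padding), which is the expected behavior described above. A plain DataLoader without the sampler would instead feed the full dataset to every rank.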