henryqin1997 / InfoBatch-ImageNet

InfoBatch Implementation on ImageNet

Safe way to change the length of loader every epoch when using DDP #1

Open Kou-99 opened 2 months ago

Kou-99 commented 2 months ago

Hi, thank you for your inspiring work! I am curious how to change the length of the dataloader (i.e., prune the training data) without recreating the dataloader every epoch when using distributed data parallel. I found that naively changing the length of the sampler after every epoch does not affect the dataloader. I also found some code in infobatch_dataloader.py; after reading it, I feel I could achieve a flexible dataloader length by simply replacing DistributedSampler with DistributedSamplerWrapper in infobatch_dataloader.py without other modifications, but I am not 100% sure. Can you give me some guidance? Thank you for your time and help!

henryqin1997 commented 2 months ago

Hi, our implementation uses a customized DistributedSamplerWrapper to achieve a flexible length according to the pruning result.

There are several things to be careful with, since distributed mode runs multiple processes, each with its own dataset/dataloader objects. The InfoBatchSampler generates the same pruned index list on every process using a numpy pseudo-random generator with the same seed; those indexes are then used for subset sampling, which is what DistributedSamplerWrapper does. It divides the pruned subset (the index list) across processes so that each retrieves its own shard of the data, just as a standard distributed sampler does, differing only in that it samples from a dynamic subset of indexes rather than the fixed full index range.
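For illustration, here is a minimal sketch (not the repository's actual code) of the two pieces described above: a pruning step that derives the same index list on every process from a shared seed, and a sampler that splits that dynamic list across ranks the way DistributedSampler splits the full index range. The class and function names (`PrunedIndexSampler`, `prune_indices`, `keep_ratio`) are illustrative assumptions, not identifiers from infobatch_dataloader.py.

```python
import math
import numpy as np
import torch.distributed as dist
from torch.utils.data import Sampler


class PrunedIndexSampler(Sampler):
    """Distributes a per-epoch pruned index list across DDP ranks."""

    def __init__(self, num_replicas=None, rank=None):
        if num_replicas is None:
            num_replicas = dist.get_world_size()
        if rank is None:
            rank = dist.get_rank()
        self.num_replicas = num_replicas
        self.rank = rank
        self.indices = []  # refreshed each epoch via set_pruned_indices()

    def set_pruned_indices(self, indices):
        # Every process must pass the identical list so the per-rank split stays consistent.
        self.indices = list(indices)

    def __iter__(self):
        # Pad so the list divides evenly, then take this rank's strided slice,
        # mirroring DistributedSampler's indices[rank:total_size:num_replicas].
        total = math.ceil(len(self.indices) / self.num_replicas) * self.num_replicas
        padded = self.indices + self.indices[: total - len(self.indices)]
        return iter(padded[self.rank : total : self.num_replicas])

    def __len__(self):
        return math.ceil(len(self.indices) / self.num_replicas)


def prune_indices(dataset_len, keep_ratio, seed, epoch):
    # Same seed + epoch on every rank -> identical pruned list everywhere,
    # without any cross-process communication.
    rng = np.random.default_rng(seed + epoch)
    keep = rng.random(dataset_len) < keep_ratio
    return np.flatnonzero(keep).tolist()
```

Usage under these assumptions would look roughly like the following: the DataLoader is created once, and only the sampler's index list changes each epoch, so `len(loader)` tracks the pruned size without rebuilding the loader.

```python
# sampler = PrunedIndexSampler()
# loader = DataLoader(dataset, batch_size=..., sampler=sampler, num_workers=...)
# for epoch in range(num_epochs):
#     sampler.set_pruned_indices(prune_indices(len(dataset), keep_ratio=0.7, seed=0, epoch=epoch))
#     for batch in loader:
#         ...
```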