Closed · AnnaTrainingG closed this issue 5 months ago
@niuliling123 thanks for the report!
The goal of `auto_dataloader` is to create a dataloader in DDP mode such that the provided batch size is the *total* batch size. The batch size is scaled by the world size, i.e. each process gets `batch_size // world_size` samples, but only if the batch size is larger than or equal to the world size.
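A minimal sketch of that rule (illustrative only; the name `scaled_batch_size` is hypothetical, the actual logic lives inside `auto_dataloader`):

```python
def scaled_batch_size(batch_size: int, world_size: int) -> int:
    """Per-process batch size, treating the input as the total batch size."""
    if batch_size >= world_size:
        return batch_size // world_size  # integer division; any remainder is dropped
    return batch_size  # too small to split across processes, so left unscaled
```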
In your case, I agree that the behavior when changing the batch size from 1 to 16 can be confusing:
| case | batch_size | world_size | total batch size |
|---|---|---|---|
| A | 1 | 8 | 8 |
| B | 16 | 8 | 16 |
| C | 32 | 8 | 32 |
| D | 30 | 8 | 24 = 8 * (30 // 8) |
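For illustration, the table rows can be reproduced with the rule above (a self-contained sketch, not ignite code):

```python
world_size = 8
for case, batch_size in zip("ABCD", (1, 16, 32, 30)):
    # Scale only when the batch size can be split across all processes.
    per_proc = batch_size // world_size if batch_size >= world_size else batch_size
    print(f"{case}: per-process={per_proc}, total={per_proc * world_size}")
# A: per-process=1, total=8   <- unscaled, so the total grows to 8
# B: per-process=2, total=16
# C: per-process=4, total=32
# D: per-process=3, total=24  <- remainder dropped: 8 * (30 // 8)
```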
One might expect case A to yield a different total batch size. However, when the input batch size is smaller than the world size, we cannot make the total batch size equal to the input batch size, because it cannot be split across all participating processes; that is why it is left unscaled.
If you have a suggestion here, we are happy to discuss it. Maybe we could show a warning for case A, something like the sketch below.
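A hypothetical sketch of such a warning (not current ignite behavior; variable names are illustrative):

```python
import warnings

if batch_size < world_size:
    warnings.warn(
        f"batch_size={batch_size} is smaller than world_size={world_size}; "
        f"it will not be divided per process, so the effective total batch "
        f"size is {batch_size * world_size}."
    )
```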
Maybe keeping the same meaning of batch_size as in PyTorch (per-process batch size) would be OK.
🐛 Bug description
Hi, when I use `auto_dataloader`, I am really confused about the code shown in the image. If I update my batch_size from 1 to 16, and world_size is still 8, do I need to set `batch_size = batch_size * world_size`, e.g. `auto_dataloader(batch_size=batch_size * world_size, ...)`? Thanks.
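Given the explanation above, one possible workaround to keep the usual PyTorch meaning of batch_size (per-process) is to pre-multiply by the world size before passing it in. A sketch, assuming existing `dataset` and `batch_size` variables:

```python
import ignite.distributed as idist

# auto_dataloader divides the given batch size by the world size,
# so pre-multiplying yields `batch_size` samples on each process.
loader = idist.auto_dataloader(
    dataset,
    batch_size=batch_size * idist.get_world_size(),
)
```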