Closed · AnnaTrainingG closed this issue 5 months ago
@niuliling123 thanks for the report!
The goal of `auto_dataloader` is to create a dataloader in DDP mode such that the provided batch size is the *total* batch size. The batch size is scaled by the world size, i.e. each process gets `batch_size // world_size` samples, but only if the batch size is larger than or equal to the world size.
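A minimal sketch of that rule (illustrative only; the name `scaled_batch_size` is hypothetical, the actual logic lives inside `auto_dataloader`):

```python
def scaled_batch_size(batch_size: int, world_size: int) -> int:
    """Per-process batch size, treating the input as the total batch size."""
    if batch_size >= world_size:
        return batch_size // world_size  # integer division; any remainder is dropped
    return batch_size  # too small to split across processes, so left unscaled
```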
In your case, I agree that the behavior when changing the batch size from 1 to 16 can be confusing:
| case | batch_size | world_size | total batch size |
|---|---|---|---|
| A | 1 | 8 | 8 |
| B | 16 | 8 | 16 |
| C | 32 | 8 | 32 |
| D | 30 | 8 | 24 = 8 * (30 // 8) |
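For illustration, the table rows can be reproduced with the rule above (a self-contained sketch, not ignite code):

```python
world_size = 8
for case, batch_size in zip("ABCD", (1, 16, 32, 30)):
    # Scale only when the batch size can be split across all processes.
    per_proc = batch_size // world_size if batch_size >= world_size else batch_size
    print(f"{case}: per-process={per_proc}, total={per_proc * world_size}")
# A: per-process=1, total=8   <- unscaled, so the total grows to 8
# B: per-process=2, total=16
# C: per-process=4, total=32
# D: per-process=3, total=24  <- remainder dropped: 8 * (30 // 8)
```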
One might expect case A to yield a different total batch size. However, when the input batch size is smaller than the world size, we cannot make the total batch size equal to the input batch size, because it cannot be split across all participating processes; that is why it is left unscaled.
If you have a suggestion here, we are happy to discuss it. Maybe we could show a warning for case A, something like the sketch below.
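A hypothetical sketch of such a warning (not current ignite behavior; variable names are illustrative):

```python
import warnings

if batch_size < world_size:
    warnings.warn(
        f"batch_size={batch_size} is smaller than world_size={world_size}; "
        f"it will not be divided per process, so the effective total batch "
        f"size is {batch_size * world_size}."
    )
```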
Maybe keeping the same meaning of batch_size as in PyTorch (per-process batch size) would be OK.
🐛 Bug description
Hi, when I use `auto_dataloader`, I am really confused about the code shown in the image. If I update my batch_size from 1 to 16, and world_size is still 8, do I need to set `batch_size = batch_size * world_size`, e.g. `auto_dataloader(batch_size=batch_size * world_size, ...)`? Thanks.
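Given the explanation above, one possible workaround to keep the usual PyTorch meaning of batch_size (per-process) is to pre-multiply by the world size before passing it in. A sketch, assuming existing `dataset` and `batch_size` variables:

```python
import ignite.distributed as idist

# auto_dataloader divides the given batch size by the world size,
# so pre-multiplying yields `batch_size` samples on each process.
loader = idist.auto_dataloader(
    dataset,
    batch_size=batch_size * idist.get_world_size(),
)
```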