sail-sg / MDT

Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
Apache License 2.0

batch size when training #14

Open tengjiayan20 opened 1 year ago

tengjiayan20 commented 1 year ago

The batch size is said to be 256 in the paper. So why is the batch size in run.sh 32? And why is the batch size in run_ddp_master.sh 4?

gasvn commented 1 year ago

32 is the batch size for one GPU. 256 = 32 × 8 GPUs.
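To make the arithmetic explicit, here is a minimal sketch of how a per-GPU batch size combines with the number of data-parallel workers into the global batch size reported in the paper. The variable names are illustrative, not taken from the MDT codebase:

```python
# Hypothetical sketch: per-GPU vs. global batch size in data-parallel training.
# In DDP, each GPU processes its own mini-batch, so the effective (global)
# batch size per optimizer step is the per-GPU batch times the world size.

per_gpu_batch_size = 32   # the value passed in run.sh (per process / per GPU)
num_gpus = 8              # world size assumed for the paper's training runs

global_batch_size = per_gpu_batch_size * num_gpus
print(global_batch_size)  # 256, matching the batch size stated in the paper
```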

tengjiayan20 commented 1 year ago

Thank you!

ILLLLUSION commented 11 months ago

Does setting a different batch size on a single GPU have a big impact on the final result?

gasvn commented 11 months ago

We aligned the batch size with DiT, so we didn't try other settings.