May I ask if the training time is not accurate

Nota-NetsPresso / BK-SDM

A Compressed Stable Diffusion for Efficient Text-to-Image Generation [ECCV'24]

May I ask if the training time is not accurate #48

Closed StormArcher closed 8 months ago

StormArcher commented 8 months ago

With batch size 64 (256 = 4 x 64), training BK-SDM-Base on a single A100 for 50K iterations takes about 300 hours.

With batch size 16 (64 = 4 x 16), training BK-SDM-Base on a single A100 for 50K iterations takes about 60 hours? Shouldn't it actually be 600 hours?

bokyeong1015 commented 8 months ago

Hi,

60 hours is correct. Given the same number of iterations (50K) and gradient_accumulation_steps (4), a smaller mini-batch size (16) results in faster training compared to a larger one (64).
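
As a rough sanity check, the arithmetic can be sketched as below. This assumes each of the 50K iterations is one optimizer update made of `gradient_accumulation_steps` mini-batches, which is an assumption about how iterations are counted. The function name `samples_processed` is hypothetical and only used for illustration.

```python
# Minimal sketch: count how many training samples each configuration processes.
# Assumption: one "iteration" = one optimizer update, which consumes
# gradient_accumulation_steps mini-batches.

def samples_processed(iterations: int, mini_batch_size: int, grad_accum_steps: int) -> int:
    """Total samples seen over the whole run (hypothetical helper)."""
    return iterations * mini_batch_size * grad_accum_steps

ITERATIONS = 50_000
GRAD_ACCUM_STEPS = 4

large = samples_processed(ITERATIONS, 64, GRAD_ACCUM_STEPS)  # effective batch 256
small = samples_processed(ITERATIONS, 16, GRAD_ACCUM_STEPS)  # effective batch 64

print(f"batch 64: {large:,} samples")
print(f"batch 16: {small:,} samples")
print(f"ratio: {large // small}x")
```

Under that assumption, the batch-size-16 run processes 4x fewer samples over the same 50K iterations, so a shorter training time is expected rather than a longer one.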