Closed ireneMsm2020 closed 8 months ago
batch size=64, steps=30000-60000, resolution=768*768
batch size=64 — does that mean each A100 is set to a per-GPU batch of 8, training on 8 A100s? I trained on an 80GB A100; the max batch_size I can fit on a single A100 is 10.
You can use accelerate with gradient accumulation, or train on 16 cards.
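A minimal sketch of the arithmetic behind that suggestion (plain Python, no framework assumed; the helper name is hypothetical): the effective batch is per-GPU batch × number of GPUs × accumulation steps, so 8 cards at batch 8 reach 64 directly, while a single card that fits batch 8 needs 8 accumulation steps.

```python
import math

def accumulation_steps(target_batch, per_gpu_batch, num_gpus=1):
    """Gradient-accumulation steps needed so that
    per_gpu_batch * num_gpus * steps >= target_batch."""
    per_optimizer_step = per_gpu_batch * num_gpus
    # Round up so the effective batch is at least the target.
    return math.ceil(target_batch / per_optimizer_step)

# 8 x A100 at per-GPU batch 8: no accumulation needed.
print(accumulation_steps(64, 8, num_gpus=8))   # 1
# Single A100 at batch 8: accumulate 8 micro-batches per optimizer step.
print(accumulation_steps(64, 8, num_gpus=1))   # 8
# 16 cards at batch 4 each also reach an effective batch of 64.
print(accumulation_steps(64, 4, num_gpus=16))  # 1
```

In a training loop this means dividing the loss by the accumulation count and calling `optimizer.step()` only every N micro-batches (Accelerate's `accelerator.accumulate(model)` context wraps this bookkeeping for you).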
I trained stage_1, but the loss does not decrease. Is that expected?