ckczzj / PDAE

Official PyTorch implementation of PDAE (NeurIPS 2022)

questions about training step #6

Open · zoelovecoffee opened 1 year ago

zoelovecoffee commented 1 year ago

Hi, when running your code representation_learning_trainer.py, I was confused about the step parameter. For example, to train "FFHQ128-130M-z512-64M" with batch_size = 128, how is the step parameter related to the "64M" training samples? Since FFHQ contains 70000 images, does it indicate (64000000/70000) iterations, and does the step parameter stand for the number of training epochs?

Could you let me know if I have misunderstood anything? Thank you :)

ckczzj commented 1 year ago

Thanks for your attention. A step means one step of optimization, and training_samples = batch_size * step. We don't have an explicit parameter to stop the training process, so it can run indefinitely. So "FFHQ128-130M-z512-64M" with batch_size = 128 means you need to train until step $\ge$ (64000000/128) and then manually stop it.
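To make the arithmetic concrete, here is a small sketch (the helper name is hypothetical, not part of the PDAE codebase) that converts a training-samples budget into the optimization step at which to stop:

```python
def steps_for_sample_budget(training_samples: int, batch_size: int) -> int:
    """Number of optimization steps needed to see `training_samples` samples,
    given training_samples = batch_size * step (rounding up for partial batches)."""
    return -(-training_samples // batch_size)  # ceiling division

# "FFHQ128-130M-z512-64M" with batch_size = 128:
# 64M training samples / 128 per step = 500000 steps
print(steps_for_sample_budget(64_000_000, 128))
```

So for this configuration you would monitor the trainer's step counter and stop once it reaches 500000.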

zoelovecoffee commented 1 year ago

Thank you for answering my question, it helps a lot!