nintendops / latent-code-inpainting

PyTorch implementation for the paper Don't Look into the Dark: Latent Codes for Pluralistic Image Inpainting (CVPR2024).
MIT License

Questions about computing resources and training time #1

Open tanbuzheng opened 3 months ago

tanbuzheng commented 3 months ago

Dear author, thanks for sharing the code. I am very interested in your work. I have some questions about computing resources and training time and would appreciate your reply.

Were all your models trained on 3 NVIDIA V100 GPUs with a batch size of 8? How long did it take to train the VQGAN and the Transformer on the Places2 dataset? I have also tried training the VQGAN on Places2, but found it very time-consuming. When training the transformer, is a batch size of 8 appropriate? Transformers are generally trained with a larger batch size.

Waiting for your reply!

nintendops commented 2 months ago

Hi, please see below for my answers:

Q: Computing resources and training time?

My models were trained on 3 V100 GPUs with the default settings from the configuration files. Under this setup, training takes about 5 days for the transformer and 2-3 days for the other modules.

Q: Were all your models trained on 3 NVIDIA V100 GPUs with a batch size of 8?

The "batch size" in the training parameters refers to the batch size per GPU, so a setting of 8 on 3 V100s gives an effective batch size of 24. If I remember correctly, our models were trained with a total batch size of 36 for the VQGAN part (and the bootstrapped encoder/decoder) and 24 for the transformer.
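The arithmetic above can be sketched as follows. This is only an illustration of the per-GPU vs. effective batch size relationship in data-parallel training; the function name is hypothetical and not part of the repository's code.

```python
def effective_batch_size(per_gpu_batch_size: int, num_gpus: int) -> int:
    """Total samples consumed per optimizer step when each GPU
    processes its own mini-batch in parallel (data parallelism)."""
    return per_gpu_batch_size * num_gpus

# Config batch size of 8 on 3 V100s, as described for the transformer:
print(effective_batch_size(8, 3))   # 24

# The reported total of 36 for the VQGAN stage would correspond to,
# e.g., 12 per GPU on the same 3 GPUs:
print(effective_batch_size(12, 3))  # 36
```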