nintendops / latent-code-inpainting

PyTorch implementation for the paper Don't Look into the Dark: Latent Codes for Pluralistic Image Inpainting (CVPR2024).
MIT License

Questions about computing resources and training time #1

Open tanbuzheng opened 3 months ago

tanbuzheng commented 3 months ago

Dear author, thanks for sharing the code. I am very interested in your work. I have some questions about computing resources and training time and would appreciate your reply.

Were all your models trained on 3 NVIDIA V100 GPUs with a batch size of 8? How long did it take to train the VQGAN and the Transformer on the Places2 dataset? I have also tried training the VQGAN on Places2, but found it very time-consuming. When training the transformer, is a batch size of 8 appropriate? Transformers are generally trained with a larger batch size.

Waiting for your reply!

nintendops commented 2 months ago

Hi, please see below for my answers:

Q: Computing resources and training time?

My models were trained on 3 V100 GPUs with the default settings from the configuration files. Under this setup, training takes about 5 days for the transformer and 2-3 days for the other modules.

Q: Were all your models trained on 3 NVIDIA V100 GPUs with a batch size of 8?

The "batch size" in the training parameters refers to the batch size per GPU, so a setting of 8 on 3 V100s gives an effective batch size of 24. If I remember correctly, our models were trained with a total batch size of 36 for the VQGAN part (and the bootstrapped encoder/decoder) and 24 for the transformer.
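The arithmetic above can be sketched as follows. This is only an illustration of the per-GPU vs. effective batch size relationship in data-parallel training; the function name is hypothetical and not part of the repository's code.

```python
def effective_batch_size(per_gpu_batch_size: int, num_gpus: int) -> int:
    """Total samples consumed per optimizer step when each GPU
    processes its own mini-batch in parallel (data parallelism)."""
    return per_gpu_batch_size * num_gpus

# Config batch size of 8 on 3 V100s, as described for the transformer:
print(effective_batch_size(8, 3))   # 24

# The reported total of 36 for the VQGAN stage would correspond to,
# e.g., 12 per GPU on the same 3 GPUs:
print(effective_batch_size(12, 3))  # 36
```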