YanzuoLu / CFLD

[CVPR 2024 Highlight] Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
MIT License

What is the minimum VRAM usage? #1

Closed: ninjasaid2k closed this issue 6 months ago

YanzuoLu commented 6 months ago

Thanks for your attention to our work. Do you mean the GPU memory usage?

We train our model on both 24GB GeForce RTX 3090 and 48GB RTX A6000 GPUs.

Since our scripts support gradient accumulation, you can reach the minimum training memory requirement by setting the micro batch size to 1 (one sample per GPU per iteration) and accumulating gradients up to whatever total batch size you want for optimization.

That is to say, GPUs with less than 24GB of VRAM (perhaps 16GB or 12GB) should also be fine. You can give it a try.
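For reference, here is a minimal, generic PyTorch sketch of the accumulation pattern described above (this is not CFLD's actual training script; the model, optimizer, and data below are toy placeholders):

```python
import torch
from torch import nn

# Toy stand-ins for the real model and data; only the accumulation
# pattern itself is the point here.
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loader = [(torch.randn(1, 16), torch.randn(1, 1)) for _ in range(64)]

accum_steps = 32  # effective batch size = micro batch (1) x accum_steps

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = nn.functional.mse_loss(model(x), y)  # forward on one sample
    (loss / accum_steps).backward()  # scale so gradients average over the window
    if (step + 1) % accum_steps == 0:  # update only every accum_steps micro-batches
        optimizer.step()
        optimizer.zero_grad()
```

Memory is bounded by the micro batch of 1, while the optimizer still sees gradients averaged over the full effective batch.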

Amazingren commented 6 months ago

Thanks for the nice work!

How long does it take to train your models on the Market-1501 and DeepFashion datasets, respectively? (Let's say with 8 RTX 4090 GPUs.)

It would be very helpful for us in following your work. Best regards,

YanzuoLu commented 6 months ago

Our model on the DeepFashion dataset is trained on NVIDIA RTX A6000 GPUs and completes within 600 GPU hours (8 GPUs for about 3 days). This is much less than the 960 GPU hours PIDM takes on NVIDIA A100 (8 GPUs for 5 days), which is all the more notable given that the RTX A6000 is much less capable than the A100.

As for the Market-1501 dataset, we conduct training on NVIDIA A100 GPUs and consume 300 GPU hours.

You can make a rough estimate for your own GPUs based on these numbers. Thank you!
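As a back-of-the-envelope example of such an estimate (purely illustrative; the wall_clock_days helper below is hypothetical, and it ignores throughput differences between the reference GPUs and yours):

```python
# Rough wall-clock estimate from the GPU-hour figures quoted above.
# rel_speed is your GPU's throughput relative to the reference GPU;
# 1.0 assumes equal speed, which you would need to verify by benchmarking.
def wall_clock_days(gpu_hours: float, num_gpus: int, rel_speed: float = 1.0) -> float:
    return gpu_hours / (num_gpus * rel_speed) / 24

print(wall_clock_days(600, 8))  # DeepFashion, RTX A6000: ~3.1 days on 8 GPUs
print(wall_clock_days(300, 8))  # Market-1501, A100: ~1.6 days on 8 GPUs
```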