yzxing87 closed this issue 2 years ago
It depends on the complexity of the data. For example, on CUB200 it needs about 5 days, and on MSCOCO it needs about 14 days (longer training may slightly improve the performance). The VQ-Diffusion-F model can achieve better results at the same computation cost.
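For a rough sense of scale, the durations above can be converted to GPU-hours. This is only a back-of-the-envelope sketch: it assumes the quoted times refer to the 8 V100 setup asked about in this issue and that all GPUs run around the clock. The helper name `gpu_hours` is made up for illustration and is not part of the repository.

```python
# Rough GPU-hour estimate; assumes 8 GPUs running 24 hours a day.
# `gpu_hours` is a hypothetical helper, not part of the VQ-Diffusion code.

def gpu_hours(num_gpus: int, days: float) -> float:
    """Total GPU-hours for `num_gpus` GPUs training for `days` days."""
    return num_gpus * days * 24

# Approximate training durations (in days) quoted in the reply above.
training_days = {"CUB200": 5, "MSCOCO": 14}

# Assumption: both figures were measured on 8 V100 GPUs.
estimates = {name: gpu_hours(8, days) for name, days in training_days.items()}

for name, hours in estimates.items():
    print(f"{name}: about {hours} GPU-hours")
```

Since the durations are approximate, the results should be read as order-of-magnitude estimates only.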
Thanks for your clarification!
Thanks for releasing the code for this awesome work! May I know the training cost of the VQ-Diffusion-B model? How long does training take when using 8 V100 GPUs?