I have tested it on Windows 10; you could try it on Ubuntu.
I have already trained it on Ubuntu, and the problem still exists. I found that most of the time is spent in the backward pass, about 56 seconds, so I tried adding the option --g_reg_every 32 to speed up training. Does that introduce any new problems?
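For context: in rosinality-style stylegan2-pytorch training code, --g_reg_every is the interval (in iterations) between applications of the path-length regularization, and the path loss is scaled by g_reg_every to compensate, so raising it from the default 4 to 32 mostly just skips extra regularization backward passes; whether this repo behaves the same way is an assumption on my part. To confirm where the 56 seconds actually go, a minimal, self-contained timing sketch (with a dummy model standing in for the real generator step) could look like this:

```python
# Minimal sketch (not the repo's code): time forward vs. backward with
# torch.cuda.synchronize() so async CUDA kernels don't distort the numbers.
# Replace the dummy model/loss with the actual training step being profiled.
import time
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512)).to(device)
x = torch.randn(4, 512, device=device)

if device == "cuda":
    torch.cuda.synchronize()
t0 = time.time()

out = model(x)
loss = out.pow(2).mean()
if device == "cuda":
    torch.cuda.synchronize()
t1 = time.time()

loss.backward()
if device == "cuda":
    torch.cuda.synchronize()
t2 = time.time()

print(f"forward: {t1 - t0:.4f}s, backward: {t2 - t1:.4f}s")
```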
I remember we had the same issue before, but it disappeared after switching to another server running an RTX 2080 Ti, so I also don't know how to resolve your issue.
I will try switching to another server too.
How can I improve the training speed?
It takes 2 hours to run 1000 iterations on 2x GeForce RTX 3090, so 10000k iterations would take about 833 days, yet your training took only 20 days.
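The 833-day figure follows directly from that rate; a quick back-of-the-envelope check (assuming seconds-per-iteration stays constant):

```python
# Sanity check of the training-time estimate from the numbers above.
hours_per_1000_iters = 2.0
total_iters = 10_000_000                                  # "10000k" iterations
seconds_per_iter = hours_per_1000_iters * 3600 / 1000     # 7.2 s per iteration
eta_days = total_iters * seconds_per_iter / 86400
print(f"{seconds_per_iter:.1f} s/iter -> ~{eta_days:.0f} days")   # ~833 days
```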
My training command is as follows:
python -m torch.distributed.launch --nproc_per_node=2 --master_port=9999 train.py --num_worker 4 --resolution 1024 --name Jeric --iter 1000 --batch 1 --mixing 0.9 path-to-your-image-folders --condition_path path-to-your-segmap-folders
path-to-your-image-folders is set to the CelebA-HQ-img folder of the CelebA-HQ dataset.
path-to-your-segmap-folders is set to the CelebAMask-HQ folder downloaded from your pre-processed FFHQ and CelebA segmaps (a quick sanity check for these two folders is sketched below).
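Purely as a hypothetical pre-launch check (not part of the repo), it can be worth confirming both folders exist and are non-empty before committing to a multi-day run; the folder names below are just the placeholders from the command above:

```python
# Hypothetical pre-launch check: verify the image and segmap folders exist
# and report how many entries each one holds before starting training.
import os
import sys

image_dir = "path-to-your-image-folders"      # e.g. CelebA-HQ-img
segmap_dir = "path-to-your-segmap-folders"    # e.g. CelebAMask-HQ segmaps

for d in (image_dir, segmap_dir):
    if not os.path.isdir(d):
        sys.exit(f"missing folder: {d}")

n_images = sum(1 for _ in os.scandir(image_dir))
n_segmaps = sum(1 for _ in os.scandir(segmap_dir))
print(f"{n_images} image entries, {n_segmaps} segmap entries")
if n_images == 0 or n_segmaps == 0:
    sys.exit("one of the folders is empty")
```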
Trained on Windows 10.
Thanks.