mesllo opened this issue 2 years ago
The complete training has four stages; the third stage has 200 epochs and takes the longest. The training code prints the estimated remaining time, so you can run it briefly and check.
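For context, an ETA like that is typically computed by extrapolating the average per-step time. The sketch below is generic and the names (`train_with_eta`, `step_fn`) are placeholders, not this repo's API:

```python
import time

def train_with_eta(dataloader, num_epochs, step_fn):
    """Training loop that prints an estimated time to completion."""
    total_steps = num_epochs * len(dataloader)
    start = time.time()
    done = 0
    for epoch in range(num_epochs):
        for batch in dataloader:
            step_fn(batch)  # one forward/backward/optimizer step
            done += 1
            # Extrapolate average seconds-per-step to the remaining steps.
            eta_h = (time.time() - start) / done * (total_steps - done) / 3600
            print(f"epoch {epoch} | step {done}/{total_steps} | ETA {eta_h:.1f} h")
```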
Thank you for your reply! I have tried it, and even the first stage takes about 24 hours to train on 4 GPUs, which is not feasible for me. I could also take a multi-node approach and run on 8 GPUs across 2 nodes. Is the code already optimized for multiple nodes, or do I have to do this myself? From what I can tell, the current code only considers a single node.
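For reference, what I would try on my side is the standard `torchrun` multi-node launch. This is a generic DDP sketch, assuming the repo uses `torch.distributed` with the NCCL backend (the script name `train.py` is a placeholder):

```python
# One launch per node (train.py is a placeholder name):
#   node 0: torchrun --nnodes=2 --nproc_per_node=4 --node_rank=0 \
#           --master_addr=<node0-ip> --master_port=29500 train.py
#   node 1: torchrun --nnodes=2 --nproc_per_node=4 --node_rank=1 \
#           --master_addr=<node0-ip> --master_port=29500 train.py
import os
import torch
import torch.distributed as dist

def setup_distributed():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process,
    # so the same script works on one node or several.
    dist.init_process_group(backend="nccl", init_method="env://")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    return local_rank

# After setup, the model is wrapped as usual:
# model = torch.nn.parallel.DistributedDataParallel(
#     model.cuda(), device_ids=[local_rank])
```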
If I train with 8 GPUs, a batch size of 4, and 100 epochs (50 at the normal learning rate, 50 with the decaying learning rate), how long would it take? I'm asking because I only have 4 GPUs, and I am using a shared environment where I won't be able to train for more than 8 hours per run.
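As a rough back-of-envelope check (my own assumption of near-linear scaling with GPU count, which is optimistic because of multi-node communication overhead):

```python
def scaled_hours(baseline_hours, baseline_gpus, target_gpus, efficiency=0.8):
    """Estimate runtime after changing the GPU count.

    efficiency < 1.0 is a guessed penalty for multi-node communication
    overhead; it is not a measured number.
    """
    speedup = (target_gpus / baseline_gpus) * efficiency
    return baseline_hours / speedup

# First stage took ~24 h on 4 GPUs (measured above):
print(f"{scaled_hours(24.0, 4, 8):.0f} h")  # ~15 h at 80% scaling efficiency
```

Even under that estimate, a stage would exceed my 8-hour window, so I would also need to checkpoint and resume across runs.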