guanjz20 / StyleSync

Official code of CVPR '23 paper "StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator"

Large GPU memory cost, slow training speed, acceptable inference speed #7

Open Tony199110 opened 1 year ago

Tony199110 commented 1 year ago

I wrote the full training code based on my reading of the paper. After a week of training, I ran into some problems and would appreciate the authors' response.

1. With batch size = 2, a single 4090 runs out of memory, which means one set of frames costs about 10 GB of memory.
2. Given the limit in 1, one epoch over 1/5 of the VoxCeleb2 dataset takes nearly a day.
3. Running inference with an intermediate checkpoint on one 4090, I get about 40 fps.

So I'd like to know how much training time is needed, and on which kind of GPU. Thanks in advance.
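For reference, the figures above can be turned into a rough back-of-the-envelope estimate. This is only a sketch: it assumes the 4090's 24 GB of VRAM and that memory and time scale linearly with batch size and dataset size, neither of which is confirmed in this thread. The function names are made up for illustration.

```python
# Rough estimates from the numbers reported in this issue.
# Assumptions (not from the thread): 24 GB of VRAM on an RTX 4090, and
# linear scaling of memory with batch size and of time with dataset size.

GPU_MEMORY_GB = 24.0           # assumed VRAM of one RTX 4090
BATCH_SIZE = 2                 # batch size that exhausts memory (reported)
DATASET_FRACTION = 1 / 5       # share of VoxCeleb2 used per epoch (reported)
DAYS_PER_PARTIAL_EPOCH = 1.0   # "nearly a day" (reported)


def memory_per_sample_gb(total_gb: float, batch_size: int) -> float:
    """Upper bound on per-sample memory if the batch fills the whole card."""
    return total_gb / batch_size


def days_per_full_epoch(days_partial: float, fraction: float) -> float:
    """Extrapolated time for one epoch over the full dataset."""
    return days_partial / fraction


if __name__ == "__main__":
    per_sample = memory_per_sample_gb(GPU_MEMORY_GB, BATCH_SIZE)
    full_epoch = days_per_full_epoch(DAYS_PER_PARTIAL_EPOCH, DATASET_FRACTION)
    print(f"Per-sample memory (upper bound): {per_sample:.1f} GB")
    print(f"Estimated days per full epoch: {full_epoch:.1f}")
```

The per-sample figure is an upper bound because model weights and optimizer state take a fixed share of memory regardless of batch size, which is consistent with the roughly 10 GB per set of frames reported above.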

afzal-mengal commented 1 year ago

Hi Tony, can you please send the training code you wrote? Also, did you figure out the GPU requirements?

iamkhalidbashir commented 1 year ago

Hey @Tony199110, I am interested as well.

saeedfirouzi commented 1 year ago

Hi @Tony199110, it would be great if you could send us your training file.

lmpeng12 commented 12 months ago

@Tony199110 Hi Tony, can you please send the training code you wrote?

yassineAlouini commented 4 months ago

To help you optimize your training loop, please share your code, or at least some code snippets, @Tony199110. 👌