MRzzm / DINet

The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."

Tips for train syncnet #73

Open KimGyeongsu opened 7 months ago

KimGyeongsu commented 7 months ago

I finally achieved a sync loss of ~0.2 on a private dataset with a few simple modifications. Please understand that I can't upload the training code because I work for a company. I hope my advice is helpful.

  1. I used a coarse-to-fine strategy: mouth-region sizes of 64, 128, then 256, and frame training before clip training.
  2. I used the same scheduler, optimizer, and hyperparameters as for DINet training.
  3. Fine-tuning the learning rate really helped.
  4. I clipped the sync_score to the range 0 to 1 while preserving the gradient.
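
Tip 4 can be read in more than one way, since the training code is not shared. A minimal sketch of one possible reading follows: the forward pass uses the clipped score, while the backward pass treats the clip as the identity (a "straight-through" clip). The function names, the MSE loss, and the target of 1.0 are assumptions made for illustration, not details confirmed in this thread. The gradients are written out by hand in plain Python so the difference from an ordinary clamp is visible.

```python
# Sketch only: one possible reading of "clip while preserving gradient".
# All names here are hypothetical; the MSE loss and target=1.0 are assumptions.

def clip01(score: float) -> float:
    """Clamp a raw sync score into [0, 1]."""
    return min(1.0, max(0.0, score))


def mse_grad_hard_clip(score: float, target: float = 1.0) -> float:
    """Gradient of (clip01(score) - target)**2 with an ordinary clamp.

    The clamp has zero slope outside [0, 1], so the gradient vanishes there.
    """
    inside = 0.0 <= score <= 1.0
    return 2.0 * (clip01(score) - target) * (1.0 if inside else 0.0)


def mse_grad_straight_through(score: float, target: float = 1.0) -> float:
    """Same loss value, but the clamp is treated as identity when differentiating."""
    return 2.0 * (clip01(score) - target)


# Compare both gradients for a score below, inside, and above the range.
for raw in (-0.5, 0.3, 1.4):
    print(raw, clip01(raw), mse_grad_hard_clip(raw), mse_grad_straight_through(raw))
```

With an ordinary clamp, a score below 0 receives no gradient and cannot recover. With the straight-through version it is still pushed toward the target. In PyTorch this idea is commonly written as `score + (score.clamp(0, 1) - score).detach()`, though whether that matches what was done here is an assumption.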

won-bae commented 7 months ago

Hi thanks for sharing the tips!

May I know the learning rate you used eventually? I guess many people including myself have limited computation to tune it multiple times. It would be greatly helpful if you can share it!

Thank you

KimGyeongsu commented 7 months ago

I used an initial learning rate of 1e-5!

won-bae commented 7 months ago

Thank you for your quick reply! Sorry, but did you use 1e-5 only for the clip stage, or for every stage of frame training as well?

KimGyeongsu commented 7 months ago

For all stages, I used 1e-5.
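
Putting the coarse-to-fine order and the learning rate together, the stage schedule might be sketched as below. This is only an illustration: `Stage` and `build_schedule` are hypothetical names, and running the clip stage at a mouth size of 256 is an assumption, since the thread does not state the clip-stage resolution.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    mode: str        # "frame" or "clip"
    mouth_size: int  # mouth-region size in pixels
    lr: float        # initial learning rate for this stage


def build_schedule(lr: float = 1e-5) -> List[Stage]:
    """Frame stages from coarse to fine, followed by a single clip stage."""
    frame_stages = [Stage("frame", size, lr) for size in (64, 128, 256)]
    # Assumption: the clip stage reuses the finest mouth size (256).
    clip_stage = Stage("clip", 256, lr)
    return frame_stages + [clip_stage]


for stage in build_schedule():
    print(stage)
```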

won-bae commented 7 months ago

Thanks for sharing that!

FacePoluke commented 6 months ago

Did you train SyncNet from scratch or use the provided pre-trained model? Also, I thought SyncNet only had a clip mode, but your response above seems to mention a frame mode.

KimGyeongsu commented 6 months ago

As in the DINet training code that the author provided, I trained SyncNet from scratch. For the frame stage, the audio feature should be the DeepSpeech features from [n-2:n+3] if the n-th frame is used as the face feature.
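
The [n-2:n+3] window is a five-frame slice centered on frame n. A minimal sketch of that indexing is below. `audio_window` is a hypothetical helper, and repeating the edge frames near the start and end of a video is an assumption, since the thread does not say how boundaries are handled.

```python
from typing import List, Sequence


def audio_window(features: Sequence, n: int, radius: int = 2) -> List:
    """Return the per-frame audio features for frames n-radius .. n+radius.

    For interior frames this equals features[n - radius : n + radius + 1].
    Assumption: out-of-range indices are clamped, so edge frames are repeated.
    """
    if len(features) == 0:
        raise ValueError("features must not be empty")
    last = len(features) - 1
    return [features[min(max(i, 0), last)] for i in range(n - radius, n + radius + 1)]


# Stand-in for per-frame DeepSpeech features: here just the frame indices.
features = list(range(10))
print(audio_window(features, 5))
print(audio_window(features, 0))
```

The window for frame n is then paired with the face (mouth) image of frame n as one frame-stage training sample.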

xiao-keeplearning commented 6 months ago

Hello, I also have a question. How can we determine whether the model has converged during the training process of DINet?

jinwonkim93 commented 4 months ago

@KimGyeongsu what loss function did you use for training?