yerfor / GeneFacePlusPlus

GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
MIT License
1.5k stars 219 forks

Train Loss #25

Closed ChengsongLu closed 8 months ago

ChengsongLu commented 8 months ago

In the head model training step, is it normal for total_loss to drop to 0.05 at about 50,000 steps and then keep rising to >50? How can I avoid this?

How much data (in minutes) is needed for this training? Does the video have to contain only the person's head?

Thanks!!!

yerfor commented 8 months ago

About the loss rising to >50: that's really weird. Could you share the training video or the TensorBoard logs?

Data scale: we tried 1 minute / 3 minutes / 5 minutes; they all work well.

Does the video have to contain only the person's head: no, we use the com_imgs in data/processed/<vid_id> during training.

ChengsongLu commented 8 months ago

https://github.com/yerfor/GeneFacePlusPlus/assets/61783323/6bd5aa12-e570-4bd4-a40f-c38bd6173546

Here is the video I used (just a clip of the first few seconds). I guess the cause could be that the width and height of the video are too different?

yerfor commented 8 months ago

That's it. We only support videos of 512x512 resolution. You need to crop the face region to 512x512, like the following. Snipaste_2024-02-05_13-09-20
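A minimal sketch of one way to do that crop, assuming ffmpeg is installed; the helper name and the simple center-square crop are my own (in practice you would center the box on the detected face, and the file paths are placeholders):

```python
# Sketch (not from the repo): build an ffmpeg command that center-crops
# the largest square from a video and rescales it to the 512x512 input
# GeneFace++ expects.

def make_crop_cmd(src, dst, width, height, out_size=512):
    """Return an ffmpeg command cropping a centered square, then scaling."""
    side = min(width, height)   # side length of the largest centered square
    x = (width - side) // 2     # left edge of the crop box
    y = (height - side) // 2    # top edge of the crop box
    return (
        f"ffmpeg -i {src} "
        f"-vf crop={side}:{side}:{x}:{y},scale={out_size}:{out_size} "
        f"{dst}"
    )

# e.g. for a 1920x1080 source video:
print(make_crop_cmd("raw.mp4", "cropped_512.mp4", 1920, 1080))
```

For a 1920x1080 input this produces `-vf crop=1080:1080:420:0,scale=512:512`, i.e. a centered 1080x1080 square scaled down to 512x512.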

ChengsongLu commented 8 months ago

Okay. Thanks,

Btw, are the 250_000 training steps necessary? Or can we stop training when the loss reaches a specific value? (Asking about both the head and torso training stages.)

yerfor commented 8 months ago

You can try fewer training steps; we chose 250k steps for a fair comparison with previous NeRF-based methods. I would suggest trying 150k steps (set max_update: 150_000 and lpips_start_iters: 140_000 in the .yaml config file).
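For reference, the suggested change would look like this in the per-video .yaml config (a fragment only, assuming the two keys above are set at the top level of that file):

```yaml
# Shorter training schedule suggested above
max_update: 150_000         # total training steps (default: 250_000)
lpips_start_iters: 140_000  # step at which the LPIPS loss is switched on
```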

Stopping at a fixed loss value is not recommended, as the loss scaling varies from one training video to another.

ChengsongLu commented 8 months ago

Thanks again for your suggestions.

Screenshot from 2024-02-05 18-25-16 I have cropped the video as you suggested. However, as you can see in the image above, the loss still rises again.

I was wondering: should we use the best HEAD model (the one with the lowest loss) to train the TORSO model? I found that the code saves the best model but never uses it. Could you add an optional parameter to the second training stage (torso model training) so that we can choose to start from the best model?

ChengsongLu commented 8 months ago

Screenshot from 2024-02-05 18-41-35 After checking the log file, I found that the loss rises again because some 'lpips'-related losses are added at that point.

And now I see why the best model isn't used... XD
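A toy illustration of why the total loss jumps when a term is switched on mid-training (the function name, weight, and values are hypothetical, not the repo's code):

```python
# Toy illustration (not the repo's code): a perceptual term enabled only
# after `lpips_start_iters` makes the *total* loss jump at that step,
# even though each individual term may keep decreasing.

def total_loss(mse, lpips, step, lpips_start_iters=140_000, lpips_weight=0.1):
    """Sum the loss terms, enabling LPIPS only after its start step."""
    loss = mse
    if step >= lpips_start_iters:
        loss += lpips_weight * lpips
    return loss

# Before the switch, only the MSE term counts...
print(total_loss(mse=0.05, lpips=20.0, step=100_000))  # 0.05
# ...after it, the same model state reports a much larger total loss.
print(total_loss(mse=0.05, lpips=20.0, step=150_000))  # 2.05
```

So a rising total-loss curve after lpips_start_iters is expected and is not by itself a sign of divergence.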

ChengsongLu commented 8 months ago

https://github.com/yerfor/GeneFacePlusPlus/issues/39#issuecomment-1932568756