uuembodiedsocialai / FaceDiffuser


Questions about training on VOCASET #8

Open ChenVoid opened 1 year ago

ChenVoid commented 1 year ago

Overfitting happens when the model is trained on VOCASET: the training loss keeps descending while the validation loss rises step by step. Is VOCASET too small? Does anyone else meet the same problem?
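One common mitigation when the validation loss starts rising is early stopping. A minimal sketch (not part of the FaceDiffuser code, just the general idea) that tracks the best validation loss with a patience counter:

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss      # improvement: remember it and reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1      # no improvement this epoch
        return self.bad_epochs >= self.patience
```

In practice you would also checkpoint the model whenever `best` updates, so the saved weights come from the epoch with the lowest validation loss rather than the last epoch.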

Starind commented 11 months ago

I met the same problem, and I don't know why.

YifengMa9 commented 8 months ago

Hi, I have retrained the model on VOCASET without overfitting. The performance is similar to the official checkpoint.

The script I used is:

python main.py --dataset vocaset --vertice_dim 15069 --feature_dim 256 --output_fps 30 --train_subjects FaceTalk_170728_03272_TA\ FaceTalk_170904_00128_TA\ FaceTalk_170725_00137_TA\ FaceTalk_170915_00223_TA\ FaceTalk_170811_03274_TA\ FaceTalk_170913_03279_TA\ FaceTalk_170904_03276_TA\ FaceTalk_170912_03278_TA --val_subjects FaceTalk_170908_03277_TA\ FaceTalk_170811_03275_TA --test_subjects FaceTalk_170809_00138_TA\ FaceTalk_170731_00024_TA --diff_steps 1000 --gru_dim 256

I think the successful retraining is owing to several changes:

  1. I changed --gru_dim to 256. This gru_dim setting differs from the paper but is identical to the released checkpoint.
  2. I made sure that the training, dev, and test set splits are identical to the paper. The splits are specified via --train_subjects / --val_subjects / --test_subjects.
  3. I fixed some bugs in the script, which arise because the provided training script targets BIWI, not vocaset. To find them, I recommend debugging line by line.
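The split in point 2 can be sanity-checked before launching training. A small sketch (subject IDs copied from the command above; the 12-subject total is VOCASET's full capture set) that verifies the three sets have the expected sizes and do not overlap:

```python
train_subjects = ("FaceTalk_170728_03272_TA FaceTalk_170904_00128_TA "
                  "FaceTalk_170725_00137_TA FaceTalk_170915_00223_TA "
                  "FaceTalk_170811_03274_TA FaceTalk_170913_03279_TA "
                  "FaceTalk_170904_03276_TA FaceTalk_170912_03278_TA").split()
val_subjects = "FaceTalk_170908_03277_TA FaceTalk_170811_03275_TA".split()
test_subjects = "FaceTalk_170809_00138_TA FaceTalk_170731_00024_TA".split()

splits = [set(train_subjects), set(val_subjects), set(test_subjects)]
# Expected sizes for this setup: 8 train / 2 val / 2 test.
assert [len(s) for s in splits] == [8, 2, 2]
# The union must contain all 12 distinct subjects, i.e. no subject
# appears in more than one split.
assert len(set.union(*splits)) == 12
```

If a subject leaked from training into validation, the union would have fewer than 12 elements and the last assertion would fail.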

My loss curve shows:

[image: losses_face_diffuser_vocaset]

HiouKaoru commented 7 months ago

I followed the instructions to process the downloaded vocaset training data and obtained an incomplete set of sequences. For example, I am missing FaceTalk_170809_00138-TA_sentence32.npy. I would like to know if anyone has the same issue or knows how to solve it.
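A quick way to see the full extent of the gaps is to enumerate the expected files and diff them against the directory. The sketch below assumes the usual VOCASET layout of 40 sentences per subject and files named <subject>_sentenceNN.npy; adjust the directory, subject list, and naming pattern to your own setup:

```python
import os

def find_missing(data_dir, subjects, n_sentences=40):
    """Return the expected per-sentence .npy files absent from data_dir."""
    missing = []
    for subj in subjects:
        for i in range(1, n_sentences + 1):
            fname = f"{subj}_sentence{i:02d}.npy"
            if not os.path.exists(os.path.join(data_dir, fname)):
                missing.append(fname)
    return missing
```

Running it over the processed-data directory with all 12 subject IDs shows whether the missing sentences are isolated (a download/processing hiccup for a few files) or systematic (a whole subject failed to process).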

paopao1027 commented 7 months ago


The validation loss curve in your graph does not decrease as steadily as the training loss curve. Doesn't this also count as overfitting?
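Some jitter in the validation curve is normal and is not overfitting by itself; overfitting is when the validation loss trends upward while the training loss keeps falling. One rough way to judge the trend despite the noise (a sketch on a hypothetical list of per-epoch losses, with an arbitrary 5% tolerance) is to smooth the curve and check whether its tail sits well above its minimum:

```python
def smooth(values, k=5):
    """Simple moving average with window k."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]

def trending_up(val_losses, k=5, margin=1.05):
    """True if the smoothed final loss exceeds the smoothed minimum
    by more than `margin` (5% by default -- tune to your noise level)."""
    s = smooth(val_losses, k)
    return s[-1] > min(s) * margin
```

A noisy-but-flat or still-decreasing validation curve returns False here; a curve that bottomed out and is climbing returns True, which is the point where early stopping (or the checkpoint at the minimum) would help.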