yxlu-0102 / MP-SENet

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra
MIT License

Inferior results trained from scratch #11

Closed: XCYu-0903 closed this issue 6 months ago

XCYu-0903 commented 8 months ago

Hello! Your paper and code have been very enlightening to me. I tried to train the model from scratch on the VCTK-DEMAND dataset to reproduce the results, but the metrics are noticeably lower than those reported in the paper: PESQ, CSIG, CBAK and COVL are only about 3.39, 4.67, 3.84 and 4.14, respectively. I modified the following parts of the code:

  1. I changed segment_size to 16000 (1 s) in config.json and set split=False in the validation set configuration to fit my limited GPU memory (batch_size is 4 and the code runs on 2 x 2080Ti). In addition, I run inference.py on the CPU so that the test set does not need to be split.
  2. As a personal preference, I only run validation and save a checkpoint once per epoch.

I'm not sure whether the above modifications affect the performance of the model. Looking forward to your reply~

yxlu-0102 commented 8 months ago

  1. Because our model uses self-attention to capture global information, the segment length has a significant impact on model performance.

  2. Validating only once per epoch may also be too infrequent. Validating more often helps you find better checkpoints, though this should have only a minor impact.
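
To make the first point concrete, here is a rough sketch of how much context one training segment covers at the two segment_size values discussed in this thread. The 16 kHz sampling rate follows from "16000 (1 s)" above; `hop_size=100` and the centered-STFT frame count are assumptions for illustration only, so check config.json for the real values. `segment_stats` is a hypothetical helper, not part of the repository.

```python
def segment_stats(segment_size, sampling_rate=16000, hop_size=100):
    """Return (duration in seconds, number of STFT frames) for one segment.

    hop_size=100 is an assumed value; read the real one from config.json.
    The frame count assumes a centered STFT: one frame per hop, plus one.
    """
    duration_s = segment_size / sampling_rate
    n_frames = segment_size // hop_size + 1
    return duration_s, n_frames


# Compare the reduced segment length with the default one.
for size in (16000, 32000):
    duration_s, n_frames = segment_stats(size)
    print(f"segment_size={size}: {duration_s:.1f} s, {n_frames} frames")
```

Under these assumptions, halving segment_size roughly halves the number of time frames the model sees in a single training example, which limits how much global context attention can learn from.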

XCYu-0903 commented 8 months ago

Dear yxlu-0102: Thank you for your prompt reply! Following your suggestions, I will restore segment_size to 32000, set batch_size to 2 (1 per GPU), and retrain the model later.

huaidanquede commented 8 months ago

How many steps did you train for? I trained on 2 x 3090 with the default config for 160k steps, but the best checkpoint's PESQ has been stuck at 3.27 since 55k steps. Looking forward to your reply.

yxlu-0102 commented 8 months ago

I trained the model for about 500k steps, but PESQ does not vary significantly between 300k and 500k steps. You can monitor PESQ in TensorBoard.
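
If you export the validation PESQ values as (step, PESQ) pairs, a small helper like the one below can show which checkpoint is best and how long training has gone without improving. This is only a sketch: `best_checkpoint` and `steps_since_improvement` are hypothetical names, and the numbers in the example are made up for illustration, not real results.

```python
def best_checkpoint(history):
    """Return the (step, pesq) pair with the highest PESQ.

    history is a list of (step, pesq) pairs in increasing step order.
    On ties, the earliest step wins because max() keeps the first maximum.
    """
    return max(history, key=lambda item: item[1])


def steps_since_improvement(history):
    """Return how many steps have passed since the best PESQ was reached."""
    best_step, _ = best_checkpoint(history)
    last_step = history[-1][0]
    return last_step - best_step


# Made-up values, for illustration only.
example = [(10000, 2.90), (20000, 3.10), (30000, 3.10), (40000, 3.05)]
print(best_checkpoint(example))
print(steps_since_improvement(example))
```

A long gap since the last improvement early in training does not necessarily mean the run has converged, so it is worth comparing against the step counts mentioned above before stopping.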