Inferior results trained from scratch

XCYu-0903 commented 8 months ago

Hello! Your paper and code have been very enlightening to me. I tried to train the model from scratch on the VCTK-DEMAND dataset to reproduce the results, but the metrics I get are noticeably lower than those reported in the paper: PESQ, CSIG, CBAK and COVL are only about 3.39, 4.67, 3.84 and 4.14, respectively. I modified the following parts of the code:

I changed segment_size to 16000 (1 s) in config.json and set split=False in the validation set configuration to fit my limited GPU memory (batch_size is 4, and the code runs on 2 x 2080Ti). In addition, I run inference.py on the CPU so that the test set does not have to be split.
As a matter of preference, I run validation and save a checkpoint only once per epoch. I'm not sure whether these modifications affect the performance of the model. Looking forward to your reply~

yxlu-0102 commented 8 months ago

Because our model uses self-attention to capture global information, the segment length has a significant impact on model performance.
Additionally, validating only once per epoch may be too infrequent. Validating more often can help you find a relatively good checkpoint, although this should have only a minor impact.
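
To make the segment-length point concrete, here is a rough sketch of how many STFT frames the attention layers see per training segment. The 16 kHz sampling rate follows from the "16000 (1 s)" figure above. The hop size of 100 samples and the centered-STFT frame count are assumptions for illustration, so check them against your own config.json.

```python
SAMPLING_RATE = 16000  # from the thread: 16000 samples correspond to 1 s


def stft_frames(segment_size: int, hop_size: int = 100) -> int:
    """Frame count of a centered STFT (the convention of torch.stft with center=True).

    hop_size=100 is an assumed value, not one confirmed in this thread.
    """
    return segment_size // hop_size + 1


for segment_size in (16000, 32000):
    seconds = segment_size / SAMPLING_RATE
    frames = stft_frames(segment_size)
    print(f"segment_size={segment_size} ({seconds:.1f} s) -> {frames} frames")
```

Under these assumptions, halving segment_size roughly halves the number of frames that attention along the time axis can relate to one another during training.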

XCYu-0903 commented 8 months ago

Dear yxlu-0102: Thank you for your prompt reply! Following your suggestions, I will restore segment_size to 32000, set batch_size to 2 (1 sample per GPU) and retrain the model later.

huaidanquede commented 8 months ago

How many steps did you train for? I trained on 2x3090 with the default config for 160k steps, but the best PESQ checkpoint has been stuck at 3.27 since 55k steps. Looking forward to your reply.

yxlu-0102 commented 8 months ago

I trained the model for about 500k steps, but PESQ does not vary significantly between 300k and 500k steps. You can monitor the PESQ value in TensorBoard.
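
If you export the validation PESQ curve as (step, PESQ) pairs, a small helper like the one below can pick the best checkpoint and check whether the curve has flattened out. This is only a sketch: the function names, the plateau rule, and the numbers in the example are hypothetical placeholders, not values from any real run.

```python
def best_checkpoint(history):
    """Return the (step, pesq) pair with the highest PESQ."""
    return max(history, key=lambda pair: pair[1])


def has_plateaued(history, window_steps, min_gain=0.01):
    """Check whether the most recent window_steps improved on earlier results.

    Returns True when the best PESQ inside the window exceeds the best PESQ
    before the window by less than min_gain. Returns False when either side
    of the split has no data.
    """
    cutoff = history[-1][0] - window_steps
    earlier = [pesq for step, pesq in history if step <= cutoff]
    recent = [pesq for step, pesq in history if step > cutoff]
    if not earlier or not recent:
        return False
    return max(recent) - max(earlier) < min_gain


# Placeholder values for illustration only (not measured results).
history = [(5000, 2.0), (10000, 2.5), (15000, 2.52), (20000, 2.51)]
print(best_checkpoint(history))
print(has_plateaued(history, window_steps=10000, min_gain=0.05))
```

The window and threshold are judgment calls. A wider window is less likely to mistake a noisy stretch for a true plateau.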

yxlu-0102 / MP-SENet