taotaowang97479 / MFNet-SpeechEnhancement

This is an unofficial implementation of MFNet, from the paper "A Mask Free Neural Network for Monaural Speech Enhancement".

MFNet

arXiv: https://arxiv.org/abs/2306.04286

I appreciate the guidance and assistance from the paper's author. Following our discussion, these corrections apply:
1. The initial learning rate is 3e-4, not the 0.0034 stated in the paper.
2. The features input to the network are compressed spectra, i.e., input = sign(stdct) * sqrt(|stdct|).
3. The DCT is applied without normalization.
I have included the key STDCT code, which may be useful to you.
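
To illustrate points 2 and 3, here is a minimal, dependency-free sketch of a short-time DCT followed by sign-preserving square-root compression. It is not the repository's STDCT code: the function names, the frame length, the hop size, and the Hann window are all assumptions chosen for illustration. The absolute value is taken inside the square root so that negative DCT coefficients stay well defined.

```python
import math

def dct2_unnormalized(frame):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (2n + 1) * k / (2N))."""
    n_pts = len(frame)
    return [
        sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
            for n, x in enumerate(frame))
        for k in range(n_pts)
    ]

def stdct(signal, frame_len=8, hop=4):
    """Short-time DCT: window each frame, then apply the unnormalized DCT-II.

    frame_len, hop, and the Hann window are hypothetical choices for this sketch.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append(dct2_unnormalized(frame))
    return frames

def compress(spec):
    """Sign-preserving square-root compression: sign(x) * sqrt(|x|)."""
    return [[math.copysign(math.sqrt(abs(x)), x) for x in frame] for frame in spec]

# Toy input: a short sine wave standing in for a speech waveform.
signal = [math.sin(2 * math.pi * 0.1 * n) for n in range(32)]
features = compress(stdct(signal))  # list of frames, each with frame_len coefficients
```

In practice a vectorized DCT from a numerical library would replace the nested loops. The point here is only the order of operations: frame, window, unnormalized DCT, then compression.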

Result

This experiment did not utilize the warm-up strategy mentioned in the paper. Instead, following the author's recommendation, the training parameters were set as follows:

Performance of MFNet on the Voicebank+Demand (VCTK) test set:

| | PESQ | STOI (%) | SI-SNR (dB) |
|---|---|---|---|
| Noisy | 1.9799 | 92.11 | 8.4474 |
| MFNet | 3.0141 | 94.56 | 18.7835 |

Additionally, these are the best results on the test set obtained during the first 100 epochs of training.
[Figure: training curves for pesq, stoi, snr, tr_loss, and val_loss]
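
For readers unfamiliar with the SI-SNR metric reported above, here is a minimal sketch of its standard definition in plain Python. Whether the evaluation script in this repository matches it exactly is an assumption. The function name `si_snr` and the `eps` stabilizer are illustrative.

```python
import math

def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between an estimated and a reference signal."""
    n = len(reference)
    # Remove the mean from both signals.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [e - est_mean for e in estimate]
    ref = [r - ref_mean for r in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    target = [dot / ref_energy * r for r in ref]
    # Everything not explained by the target counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise) + eps
    return 10 * math.log10(target_energy / noise_energy + eps)

# Toy signals: a sine reference plus a small deterministic disturbance.
reference = [math.sin(0.3 * n) for n in range(100)]
estimate = [r + 0.1 * math.cos(1.7 * n) for n, r in enumerate(reference)]
score = si_snr(estimate, reference)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is what distinguishes SI-SNR from plain SNR.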