yxlu-0102 / MP-SENet

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
MIT License

a question about inference #9

Closed by lth456321 6 months ago

lth456321 commented 11 months ago

Thank you very much for your paper and for releasing the code. I have a question about inference. I used the provided checkpoint with inference.py to generate enhanced speech and then ran cal_metrics.py to calculate the metrics, but they differ significantly from those reported in the paper. The only change I made was replacing librosa.load with soundfile.read, because librosa.load does not work on my computer. Could you help me work out why the results differ? Thank you again.

yxlu-0102 commented 11 months ago

Thanks for your interest in our work. I re-downloaded the MP-SENet code from this repo and generated speech using librosa and soundfile respectively. After computing the metrics with cal_metrics.py, the objective results for the two reading methods are the same, namely

These results are consistent with our Interspeech paper, so the issue is not related to soundfile. Perhaps you can check whether your dataset is consistent with ours.
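
For anyone making the same swap, here is a minimal sketch of a sanity check on what the reader returns. It is not code from this repo. The helper name `to_librosa_like` and the 16 kHz default are assumptions for illustration; use the sampling rate from your own config. The sketch relies on two general differences between the readers: soundfile.read returns float64 audio at the file's native rate with shape (frames, channels) and never resamples, while librosa.load by default returns mono float32 and resamples to the rate you pass it.

```python
import numpy as np


def to_librosa_like(data, file_sr, expected_sr=16000):
    """Convert the output of soundfile.read into mono float32 audio.

    `expected_sr` is an assumed value for illustration; it should match
    the sampling rate the model was trained with.
    """
    # soundfile.read does not resample, so a mismatch here means the
    # model would see audio at the wrong rate.
    if file_sr != expected_sr:
        raise ValueError(
            f"file is {file_sr} Hz but {expected_sr} Hz was expected; "
            "resample the dataset first"
        )

    audio = np.asarray(data)

    # Multi-channel audio arrives as (frames, channels); average the
    # channels to get a mono signal.
    if audio.ndim == 2:
        audio = audio.mean(axis=1)

    # soundfile.read defaults to float64; cast down to float32.
    return audio.astype(np.float32)


# Hypothetical usage (the path is a placeholder):
#   import soundfile as sf
#   data, file_sr = sf.read("noisy/example.wav")
#   noisy_wav = to_librosa_like(data, file_sr)
```

If the check raises, the files on disk are not at the rate the model expects, which would be one possible way for a dataset to be inconsistent with the one used for the reported results.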