descriptinc / melgan-neurips

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
MIT License

How to measure the quality of synthetic audio with PESQ #44

Open predawnang opened 1 year ago

predawnang commented 1 year ago

Hi, I want to measure the quality of the synthetic audio with metrics that need reference audio, such as PESQ and MCD. I followed MelGAN's preprocessing code (librosa.load => trim at 20 dB => mel) and found that some of the trimmed original audio has a different length from the synthetic audio, which makes the PESQ computation fail. Why do the generated waveforms have a different length from the original (trimmed) waveforms?
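A possible workaround while the cause is unclear is to cut both signals to their common length before calling a reference-based metric. The sketch below is only an illustration, not code from this repository: `align_lengths` is a hypothetical helper, the 256-sample hop is an assumed value, and the idea that the synthetic output is a whole number of hops is a guess about where the mismatch comes from. It also assumes both waveforms start at the same instant, so that only the tail differs.

```python
import numpy as np


def align_lengths(reference, synthetic):
    """Truncate two 1-D waveforms to their common (shorter) length.

    Hypothetical helper: assumes the signals are already time-aligned
    at the start and differ only by a few samples at the end.
    """
    n = min(len(reference), len(synthetic))
    return reference[:n], synthetic[:n]


# Illustrative data only: a 22050-sample reference and a synthetic signal
# whose length is a whole number of hops (hop = 256 is an assumption).
hop = 256
reference = np.zeros(22050, dtype=np.float32)
synthetic = np.zeros((len(reference) // hop) * hop, dtype=np.float32)

ref_aligned, syn_aligned = align_lengths(reference, synthetic)
print(len(reference), len(synthetic), len(ref_aligned))  # 22050 22016 22016
```

The aligned pair can then be passed to whichever PESQ or MCD implementation is in use. Truncation discards at most a few trailing samples here, but it would give misleading scores if the two signals were offset at the start rather than at the end.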