joannahong / Lip2Wav-pytorch

a PyTorch implementation of Lip2Wav

Some quantitative results #2

Open enhuiz opened 3 years ago

enhuiz commented 3 years ago

Hi, I have recently been trying to reproduce the scores reported in the paper. I tried the model released by the authors but could not get results similar to those in their paper.

I just tried your implementation and the pre-trained model on the chem test set. The generated speech sounds good to me. However, when scoring it with this script, I got the following results:

Mean PESQ: 1.195829279899597
Mean STOI: 0.3033812165587589
Mean ESTOI: 0.1818759115844963

These still seem lower than what the paper reports. Have you tried running the scoring, and are your scores similar to the paper's?
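
For reference, here is a minimal sketch of how mean PESQ, STOI and ESTOI are commonly computed over a test set. It is not the script referred to above. It assumes the third-party `pesq` and `pystoi` packages, 16 kHz mono waveforms given as 1-D arrays, and narrowband PESQ; the function names `load_metrics` and `mean_scores` are hypothetical.

```python
from statistics import mean


def load_metrics(sample_rate=16000, pesq_mode="nb"):
    """Return {name: fn(ref, deg) -> float}.

    Assumptions: 16 kHz audio and narrowband PESQ ("nb"); the paper's
    exact settings are not stated in this thread.
    """
    # Imported lazily so mean_scores() can be used without these packages.
    from pesq import pesq      # pip install pesq
    from pystoi import stoi    # pip install pystoi

    return {
        "PESQ": lambda ref, deg: pesq(sample_rate, ref, deg, pesq_mode),
        "STOI": lambda ref, deg: stoi(ref, deg, sample_rate, extended=False),
        "ESTOI": lambda ref, deg: stoi(ref, deg, sample_rate, extended=True),
    }


def mean_scores(pairs, metrics):
    """Average each metric over (reference, generated) waveform pairs."""
    scores = {name: [] for name in metrics}
    for ref, deg in pairs:
        # Trim both signals to a common length before scoring.
        n = min(len(ref), len(deg))
        for name, fn in metrics.items():
            scores[name].append(fn(ref[:n], deg[:n]))
    return {name: mean(values) for name, values in scores.items()}
```

With waveforms loaded at a matching sample rate, `mean_scores(pairs, load_metrics())` would return a dict with one mean per metric. Differences in sample rate, PESQ mode, or how the two signals are trimmed and aligned can all shift these numbers, so they may be worth checking when comparing against reported scores.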

joannahong commented 3 years ago

I am also still trying to figure out why the results come out differently, even with the original Lip2Wav repository, although the generated samples actually sound fine. As the authors mentioned, it may be due to different versions of some libraries. I will let you know when I figure out the problem.
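
If library versions are the suspect, one way to compare environments is to record the installed versions next to each set of scores. The sketch below uses only the standard library; the package list is an assumption about what might matter here, not something confirmed by the authors, and `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or 'not installed'."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    # Assumed list of packages likely to affect audio processing and scoring.
    candidates = ["torch", "librosa", "numpy", "scipy", "pesq", "pystoi"]
    for name, ver in installed_versions(candidates).items():
        print(f"{name}=={ver}")
```

Posting this output alongside the scores would make it easier to see whether two setups that disagree also differ in their dependencies.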