Hi, thanks for the great work.
I'm able to reproduce the scores claimed in the paper using the pre-trained model weights. However, when I trained Lip2wav on the `chem` speaker from scratch (without the pre-trained weights), the scores are noticeably worse. Here are the results I got:
Speaker: `chem`

| Model | Mean PESQ | Mean STOI | Mean ESTOI |
| --- | --- | --- | --- |
| Lip2wav, pre-trained weights | 1.2984 | 0.4285 | 0.3204 |
| Lip2wav, my checkpoint at 230k steps | 1.1618 | 0.3245 | 0.1539 |
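
To put the gap in numbers, here is a small sketch that computes the relative drop of each metric from the pre-trained weights to my 230k-step checkpoint. The scores are the ones reported above; the names `relative_drop`, `pretrained`, and `retrained` are only illustrative and are not part of the Lip2wav codebase.

```python
# Scores reported above for the chem speaker.
pretrained = {"PESQ": 1.2984, "STOI": 0.4285, "ESTOI": 0.3204}
retrained = {"PESQ": 1.1618, "STOI": 0.3245, "ESTOI": 0.1539}


def relative_drop(reference: float, value: float) -> float:
    """Return how far `value` falls below `reference`, in percent."""
    return (reference - value) / reference * 100.0


# Relative drop per metric (hypothetical helper dict, for illustration only).
drops = {name: relative_drop(pretrained[name], retrained[name]) for name in pretrained}

for name, drop in drops.items():
    print(f"{name}: {drop:.1f}% lower than with the pre-trained weights")
```

By this measure ESTOI degrades far more than PESQ, so the drop is not uniform across the three metrics.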
Tensorboard:

![Screen Shot 2021-09-06 at 8 39 30 pm](https://user-images.githubusercontent.com/43364163/132204930-5f79dce4-cb9b-4b78-ac4a-fff6c5d06e25.png)

Both sets of scores were obtained in the same environment (ffmpeg version 2.8.17) with the same codebase. What could be causing this gap?