Unable to reproduce the score claimed in paper

Hi, thanks for the great work. I'm able to reproduce the scores claimed in the paper using your pre-trained model weights. However, when I train Lip2Wav on the chem speaker from scratch, without the pre-trained weights, the scores are noticeably worse. Here are the results I got:

# Speaker: chem
# Lip2wav - using pre-trained weights 
Mean PESQ: 1.2984
Mean STOI: 0.4285
Mean ESTOI: 0.3204

# Lip2wav - my own training, checkpoint at step 230k
Mean PESQ: 1.1618
Mean STOI: 0.3245
Mean ESTOI: 0.1539
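
To put the gap in numbers, here is a quick sketch that computes the absolute and relative drop per metric from the scores above. The dictionary and function names are my own for illustration and are not part of the Lip2Wav codebase.

```python
# Scores copied from the two evaluation runs above (speaker: chem).
pretrained = {"PESQ": 1.2984, "STOI": 0.4285, "ESTOI": 0.3204}
retrained = {"PESQ": 1.1618, "STOI": 0.3245, "ESTOI": 0.1539}  # 230k-step checkpoint


def score_gaps(reference, candidate):
    """Return {metric: (absolute_drop, relative_drop)} of candidate vs. reference."""
    gaps = {}
    for metric, ref_value in reference.items():
        drop = ref_value - candidate[metric]
        gaps[metric] = (drop, drop / ref_value)
    return gaps


gaps = score_gaps(pretrained, retrained)
for metric, (drop, rel) in gaps.items():
    print(f"{metric}: -{drop:.4f} ({rel:.1%} lower than pre-trained)")
```

ESTOI shows the largest relative drop: my checkpoint reaches roughly half of the pre-trained value.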

Tensorboard: [screenshot, 2021-09-06 8:39:30 pm]

Both sets of scores were obtained in the same training environment (ffmpeg version 2.8.17) with the same codebase. What could be causing this gap?
