rizkiarm / LipNet

Keras implementation of 'LipNet: End-to-End Sentence-level Lipreading'
MIT License

Problems reproducing Unseen speakers results #20

Open shaniye opened 6 years ago

shaniye commented 6 years ago

Thanks a lot for the great job you've done on this project!

I'm having some difficulties reproducing the results you've got on the unseen speakers.

As you mentioned in the README, you reached the following results:

| Scenario | Epoch | CER | WER | BLEU |
| --- | --- | --- | --- | --- |
| Unseen speakers | 178 | 6.19% | 14.19% | 88.21% |

I'm running on Ubuntu 16.04 with an NVIDIA GTX 1080 Ti GPU.

I didn't change the code!

I used 28775 videos for training, and 3966 videos for validation (speakers: 1, 2, 20, 22)

but I only got the following results:

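| Epoch | Samples | Mean CER | Mean CER (Norm) | Mean WER | Mean WER (Norm) | Mean BLEU | Mean BLEU (Norm) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 178 | 256 | 5.36328 | 0.22138 | 1.94531 | 0.32422 | 0.6903 | 0.6903 |
| 324 | 256 | 4.97656 | 0.20456 | 1.61328 | 0.26888 | 0.71737 | 0.71737 |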

  1. Does the 14.19% correspond to the Mean WER (Norm) column?
  2. Were the results you posted computed on 256 validation examples, or on all 3966 validation videos, using the saved model from epoch 178?
  3. Any idea what I'm doing wrong that keeps me from reaching the same results as you?
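To make question 1 concrete, here is a minimal sketch of how I am reading the two kinds of columns. It is my assumption, not something I have confirmed in the repo: the plain columns are the mean edit distance per sample, and the (Norm) columns divide each sample's distance by the length of its reference before averaging. The function names and the example sentences are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mean_error(refs, hyps, unit="word"):
    """Return (mean raw distance, mean distance / reference length).

    Assumes every reference is non-empty.
    """
    raw, norm = [], []
    for ref, hyp in zip(refs, hyps):
        r = ref.split() if unit == "word" else ref
        h = hyp.split() if unit == "word" else hyp
        d = edit_distance(r, h)
        raw.append(d)
        norm.append(d / len(r))
    n = len(raw)
    return sum(raw) / n, sum(norm) / n


# Hypothetical GRID-style sentences (6 words each), invented for illustration.
refs = ["bin blue at f two now", "set red by a nine please"]
hyps = ["bin blue at f two now", "set red in a five please"]

wer_raw, wer_norm = mean_error(refs, hyps, unit="word")
cer_raw, cer_norm = mean_error(refs, hyps, unit="char")
print("WER raw / norm:", wer_raw, wer_norm)
print("CER raw / norm:", cer_raw, cer_norm)
```

Under this reading, my numbers are at least self-consistent: GRID sentences have 6 words, and 1.94531 / 6 ≈ 0.32422, which matches my Mean WER (Norm) at epoch 178. If that is right, the figure comparable to your 14.19% would be 32.42%, not 1.94531.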

Thank you!

mezhou commented 6 years ago

My replication of the LipNet experiment gives much the same results as yours, which differ from the authors'. Have you found the reason? Thanks in advance ^_^

Fengdalu commented 4 years ago

> My replication of the LipNet experiment gives much the same results as yours, which differ from the authors'. Have you found the reason? Thanks in advance ^_^

My PyTorch version is accurate: https://github.com/Fengdalu/LipNet-PyTorch. Contributions sharing results from this model are highly appreciated.