rizkiarm / LipNet

Keras implementation of 'LipNet: End-to-End Sentence-level Lipreading'
MIT License

Will the unseen model predict any video content, or only content from GRID.txt? #98

Open chahatagarwal opened 4 years ago

chahatagarwal commented 4 years ago
jainnimish commented 3 years ago

This model is trained only on the GRID dataset. If the speaker in your video says "hello", it won't predict "hello". Instead, it will predict some six-word sentence of the form command(4) + color(4) + preposition(4) + letter(25) + digit(10) + adverb(4). Even with the unseen model, you can only predict videos of unseen speakers whose sentences follow that same form.
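To make the constraint concrete, here is a minimal sketch of the GRID sentence grammar in Python. The word lists are my recollection of the GRID corpus vocabulary (the letter slot is the alphabet without "w"), and the names `GRID_GRAMMAR`, `count_sentences` and `is_grid_sentence` are made up for illustration; none of this is code from the LipNet repo.

```python
import string

# Assumed GRID vocabulary, one (slot name, allowed words) pair per position.
GRID_GRAMMAR = [
    ("command", ["bin", "lay", "place", "set"]),
    ("color", ["blue", "green", "red", "white"]),
    ("preposition", ["at", "by", "in", "with"]),
    # 25 letters: the alphabet without "w"
    ("letter", [c for c in string.ascii_lowercase if c != "w"]),
    ("digit", ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]),
    ("adverb", ["again", "now", "please", "soon"]),
]


def count_sentences(grammar):
    """Number of distinct sentences the grammar can produce."""
    total = 1
    for _, words in grammar:
        total *= len(words)
    return total


def is_grid_sentence(text, grammar=GRID_GRAMMAR):
    """True if `text` has exactly one allowed word in each slot, in order."""
    words = text.lower().split()
    if len(words) != len(grammar):
        return False
    return all(word in allowed for word, (_, allowed) in zip(words, grammar))


if __name__ == "__main__":
    print(count_sentences(GRID_GRAMMAR))              # 4 * 4 * 4 * 25 * 10 * 4
    print(is_grid_sentence("bin blue at f two now"))  # fits the grammar
    print(is_grid_sentence("hello"))                  # does not fit
```

Under these assumptions the grammar allows 64,000 sentences, and anything outside that set (like "hello") is something the model has never been trained to output.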