astorfi / lip-reading-deeplearning

:unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
Apache License 2.0

How to synchronize your audio and image frame? #3

Closed ss87021456 closed 7 years ago

ss87021456 commented 7 years ago

Hello, how do you handle synchronization between the lip-movement frames and the audio? The input FPS may vary from video to video. For example, at 30 FPS each frame lasts about 33 ms, so each frame corresponds to about 33 ms of audio. How do you build the corresponding video-audio pairs?

astorfi commented 7 years ago

We used 0.3-second synchronized clips. Each clip consists of 15 audio frames and 9 video frames that correspond to each other. Please refer to Section IV of the paper for further details.
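For readers who want to see the arithmetic behind those numbers, here is a minimal sketch, not the repository's actual code. The clip length and frame counts come from the reply above: 9 video frames in 0.3 s implies 30 FPS, and 15 audio frames in 0.3 s implies one audio frame every 20 ms. The function name `clip_alignment`, the 16 kHz `sample_rate`, the non-overlapping audio windows, and the nearest-earlier-frame mapping for sources that are not 30 FPS are all assumptions made for illustration.

```python
import math
from fractions import Fraction

# Numbers stated in the reply above.
CLIP_SECONDS = Fraction(3, 10)      # 0.3 s per clip
VIDEO_FRAMES_PER_CLIP = 9
AUDIO_FRAMES_PER_CLIP = 15

# Derived values: 9 / 0.3 s = 30 FPS, 0.3 s / 15 = 20 ms per audio frame.
VIDEO_FPS = VIDEO_FRAMES_PER_CLIP / CLIP_SECONDS
AUDIO_HOP_SECONDS = CLIP_SECONDS / AUDIO_FRAMES_PER_CLIP


def clip_alignment(clip_index, source_fps=30, sample_rate=16000):
    """Return (video_frame_indices, audio_sample_windows) for one 0.3 s clip.

    Hypothetical helper: sample_rate and the frame-picking rule are
    assumptions, not taken from the paper or the repository.
    """
    start = clip_index * CLIP_SECONDS

    # Pick 9 timestamps spaced 1/30 s apart and map each one to the source
    # frame being displayed at that time (floor of time * source_fps).
    video_frames = [
        math.floor((start + Fraction(i) / VIDEO_FPS) * source_fps)
        for i in range(VIDEO_FRAMES_PER_CLIP)
    ]

    # 15 back-to-back 20 ms windows, expressed as [begin, end) sample ranges.
    audio_windows = []
    for j in range(AUDIO_FRAMES_PER_CLIP):
        begin = start + j * AUDIO_HOP_SECONDS
        end = begin + AUDIO_HOP_SECONDS
        audio_windows.append(
            (math.floor(begin * sample_rate), math.floor(end * sample_rate))
        )
    return video_frames, audio_windows


if __name__ == "__main__":
    frames, windows = clip_alignment(0)
    print(frames)       # video frame indices for the first clip
    print(windows[0])   # sample range of the first audio frame
```

With a 30 FPS source the first clip maps to video frames 0 through 8 and audio samples 0 through 4799. For a source at another frame rate, some source frames are repeated or skipped so that every clip still holds 9 frames covering the same 0.3 s as its 15 audio frames.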

jigyasubagai commented 7 years ago

Okies Thanks

On Wed, Oct 4, 2017 at 9:21 AM, Amirsina Torfi notifications@github.com wrote:

Closed #3 https://github.com/astorfi/lip-reading-deeplearning/issues/3.
