VIPL-Audio-Visual-Speech-Understanding / learn-an-effective-lip-reading-model-without-pains

The PyTorch code and model for "Learn an Effective Lip Reading Model without Pains" (https://arxiv.org/abs/2011.07557), which achieves state-of-the-art performance on the LRW-1000 dataset.

Where can I find the code that places the target word in the middle of the frames? #23

Closed. Leebh-kor closed this issue 1 year ago.

Leebh-kor commented 1 year ago

Hi

As described in the paper, I found the code in scripts/prepare_lrw1000.py that fixes the number of frames to 40 for each word.

However, I could not find where the target word is placed in the middle of those frames.

Could you point me to the code that does this?

Fengdalu commented 1 year ago

This information is provided in the released annotation files of the LRW-1000 dataset. To apply for access to LRW-1000, see: https://github.com/VIPL-Audio-Visual-Speech-Understanding/AVSU-VIPL#lrw-1000-a-naturally-distributed-large-scale-benchmark-for-lip-reading-in-the-wild-fg-2019
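For readers landing here with the same question, below is a minimal sketch of how such centering could be implemented, assuming the LRW-1000 annotation file provides per-word start and end frame indices. The function name, array layout, and padding strategy are illustrative assumptions, not the repository's actual code.

```python
import numpy as np

def center_word_in_window(frames, word_start, word_end, window=40):
    """Hypothetical helper (not from prepare_lrw1000.py):
    frames: (T, H, W) array holding the full utterance clip.
    word_start, word_end: frame indices of the target word from the annotation.
    Returns a (window, H, W) array with the word roughly centered,
    zero-padded when the window extends past the clip boundaries."""
    word_center = (word_start + word_end) // 2
    start = word_center - window // 2
    end = start + window

    t, h, w = frames.shape
    out = np.zeros((window, h, w), dtype=frames.dtype)
    src_lo, src_hi = max(start, 0), min(end, t)
    dst_lo = src_lo - start  # offset inside the output window
    out[dst_lo:dst_lo + (src_hi - src_lo)] = frames[src_lo:src_hi]
    return out
```

The key idea is simply to compute the midpoint of the annotated word span and take a fixed-length window around it, padding with zeros at the clip edges so every sample ends up with exactly 40 frames.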