KAIST-AILab / SyncVSR

SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization (Interspeech 2024)
https://www.isca-archive.org/interspeech_2024/ahn24_interspeech.pdf
MIT License

Can you provide the training code for CAS-VSR-W1k? #8

Closed: wuhongsheng closed this issue 1 week ago

snoop2head commented 1 week ago

We used the same training code as in LRW/video, but with different preprocessing. I will upload the preprocessing script ASAP.

snoop2head commented 1 week ago

It seems we used the prepare_lrw1000.py script. Please refer to the attached file.