JusperLee / CTCNet

An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits
Apache License 2.0

Alternative to Baidu driver #14

Open ahmadikalkhorani opened 7 months ago

ahmadikalkhorani commented 7 months ago

Hi,

Could you please provide an alternative to the Baidu drive link, or share the code used to generate the test sets (LRS2/LRS3/Vox2)?

Thanks

JusperLee commented 7 months ago

https://1drv.ms/f/s!Aq6D76K4gTu6gUiOHDDhZaZRNX5p?e=Js677R

redizzy commented 2 weeks ago

Hi, I downloaded the LRS2 archive and unzipped it. A screenshot of the result is below.

[screenshot]

Which folder holds the visual data? Is it mouths? Why are there 20392 .npz files in folder A, and why does the number of .npz files not match the number of audio files?

JusperLee commented 2 weeks ago

Please take a look at this code to understand how the data is organized: https://github.com/JusperLee/CTCNet/blob/main/local/preprocess_lrs2.py
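
For anyone checking the same count mismatch locally, here is a minimal sketch for inspecting an unzipped folder. It is not taken from preprocess_lrs2.py. The helper names (`count_by_extension`, `stems`, `list_npz_arrays`) and the `audio` / `mouths` paths in the usage comment are hypothetical, and the actual layout of the archive is an assumption. The sketch relies only on the fact that a .npz file is a zip archive of .npy arrays.

```python
import zipfile
from collections import Counter
from pathlib import Path


def count_by_extension(root):
    """Count files under `root` (recursively), grouped by lowercase extension."""
    return Counter(p.suffix.lower() for p in Path(root).rglob("*") if p.is_file())


def stems(root, ext):
    """Return the set of file names (without extension) ending in `ext` under `root`."""
    return {p.stem for p in Path(root).rglob(f"*{ext}") if p.is_file()}


def list_npz_arrays(npz_path):
    """List the array names stored in a .npz file without loading the arrays."""
    with zipfile.ZipFile(npz_path) as archive:
        return [name[:-4] for name in archive.namelist() if name.endswith(".npy")]


# Hypothetical usage; replace the paths with the real unzipped folders:
#   print(count_by_extension("lrs2"))
#   extra = stems("lrs2/mouths", ".npz") - stems("lrs2/audio", ".wav")
#   print(len(extra), "npz files have no audio file with the same name")
#   print(list_npz_arrays("lrs2/mouths/some_file.npz"))
```

Comparing name sets rather than raw counts shows which files are unmatched, which makes it easier to cross-check against how the preprocessing script pairs audio and visual files.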