The pre-processed mouth ROIs

facebookresearch / VisualVoice

Audio-Visual Speech Separation with Cross-Modal Consistency

Open dengyuanjie opened 1 year ago

dengyuanjie commented 1 year ago

Hello, I would like to ask a question.

The mouth ROI data in the dataset is stored as h5 files.

Could you please explain how it was generated? Is there a pre-trained model available?

If I want to replace VoxCeleb2 with a different dataset, how can I generate the mouth h5 files?

Looking forward to your answer! Thank you very much!!

wcycqjy commented 9 months ago

I have the same question. I would like to use LRS2 instead.