celebv-text / CelebV-Text

(CVPR 2023) CelebV-Text: A Large-Scale Facial Text-Video Dataset
https://celebv-text.github.io/

How to use celebvhq-text for MMVID #20

Open 9B8DY6 opened 8 months ago

9B8DY6 commented 8 months ago

I want to use the pre-trained MMVID model with celebvhq-text, so I would like to know how long the text sequence should be and how many frames it should cover. Is it the same as the MMVID training config used for MM-Vox (frames_num = 8, text sequence = 50)?

Also, in the paper the text descriptions contain all of the information (action, face attributes, emotion, etc.), but you uploaded them separately. Could you explain how to integrate them into a single description, and which sentence belongs to which frames?
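
For concreteness, the kind of merge being asked about might look like the sketch below. Everything in it is an assumption rather than something taken from the CelebV-Text or MMVID code: the category names and their order, the hypothetical `merge_descriptions` helper, and the whitespace-based truncation to 50 tokens (the actual tokenizer may count tokens differently).

```python
from typing import Dict, List

# Hypothetical category order; the real annotation layout may differ.
CATEGORY_ORDER: List[str] = ["face attributes", "emotion", "action"]


def merge_descriptions(parts: Dict[str, str], max_tokens: int = 50) -> str:
    """Join per-category sentences into one description.

    The result is truncated to `max_tokens` whitespace-separated tokens,
    which only approximates a real tokenizer's sequence length.
    """
    sentences = []
    for category in CATEGORY_ORDER:
        text = parts.get(category, "").strip()
        if not text:
            continue  # skip categories with no description
        if not text.endswith("."):
            text += "."  # normalize sentence endings before joining
        sentences.append(text)
    tokens = " ".join(sentences).split()
    return " ".join(tokens[:max_tokens])


example = {
    "action": "She talks",
    "emotion": "She is happy.",
    "face attributes": "She has long hair",
}
print(merge_descriptions(example))
```

This does not address which sentence maps to which frames; that would need the timestamp information, if any, that ships with the annotations.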

Thank you.