ControlNet / AV-Deepfake1M

[ACM MM] AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset
https://arxiv.org/abs/2311.15308

Request for Guidance on Extracting Visual Features using InternVideo #5

Open Myblackcat0216 opened 6 months ago

Myblackcat0216 commented 6 months ago

Thank you very much for your great work. How do you extract visual features with InternVideo? I can't find detailed procedures and steps on the official website of InternVideo. Any assistance would be greatly appreciated. Thank you.

ControlNet commented 6 months ago

Hi @Myblackcat0216, I think you can have a look at the code here: https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo1/Pretrain/VideoMAE
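
For orientation, VideoMAE-style encoders take fixed-length clips of frames rather than whole videos, so the first step is usually to split a video into clips. Below is a minimal sketch of that step only. The function name `clip_indices`, the clip length of 16, and the stride are illustrative assumptions; they are not taken from the InternVideo code or confirmed as the settings used for this dataset.

```python
from typing import List


def clip_indices(num_frames: int, clip_len: int = 16, stride: int = 16) -> List[List[int]]:
    """Split a video into fixed-length clips of frame indices.

    clip_len and stride are assumed values for illustration; check the
    checkpoint's config for the real ones. A clip that runs past the end
    of the video is padded by repeating the last frame index.
    """
    if num_frames <= 0:
        return []
    clips = []
    for start in range(0, num_frames, stride):
        # Clamp indices so the final clip never points past the last frame.
        clips.append([min(start + i, num_frames - 1) for i in range(clip_len)])
    return clips


if __name__ == "__main__":
    for clip in clip_indices(40):
        print(clip[0], clip[-1], len(clip))
```

Each list of indices would then be used to gather frames, which are resized, normalized, and passed through the pretrained encoder to get one feature per clip. The expected input shape and normalization depend on the specific checkpoint, so those details should come from the linked repository.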