ZiqiaoPeng / SyncTalk

[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
https://ziqiaopeng.github.io/synctalk/

Is it possible to support 30 fps? #201

Open lqql8790229 opened 2 months ago

lqql8790229 commented 2 months ago

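Thanks to the author. I have used this project for training and production, and the results are really good. However, most videos shot on phones are 30 fps, and force-converting them to 25 fps sometimes makes the picture jitter. Does the author plan to support 30 fps? I noticed that fps is hard-coded to 25 in many places. What is the reasoning behind that?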
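One plausible source of the jitter, assuming the converter drops frames rather than blending them, can be checked with a little arithmetic. The sketch below is an illustration only and is not taken from SyncTalk or from any particular converter: it assumes each 25 fps output frame shows the 30 fps source frame at or before its timestamp. The function name `source_frame_indices` is hypothetical.

```python
from fractions import Fraction


def source_frame_indices(src_fps: int, dst_fps: int, n_out: int) -> list[int]:
    """For each output frame, return the index of the source frame
    located at or before that output frame's timestamp."""
    ratio = Fraction(src_fps, dst_fps)  # source frames per output frame
    # int() truncates, which equals floor for non-negative values
    return [int(k * ratio) for k in range(n_out)]


indices = source_frame_indices(30, 25, 11)
steps = [b - a for a, b in zip(indices, indices[1:])]

print(indices)  # [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 12]
print(steps)    # [1, 1, 1, 1, 2, 1, 1, 1, 1, 2]
```

Under that assumption, every sixth source frame is skipped, so motion advances by a double step once in every five output frames, which can show up as a periodic stutter.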

yyjjww commented 2 months ago

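Hello, does your trained result actually contain a face? I get an error during training saying no face can be found, even though my video meets the requirements. I am using the official May model, and the inference output has no face either.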

lqql8790229 commented 2 months ago

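Of course it does. For your own video, you need to train the model yourself. Besides being 512*512 at 25 fps, the video must contain only one face, meaning the part above the neck, and the face should fill as much of the frame as possible. The audio in the video must also be clear.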
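The resolution and frame-rate requirements above can be checked before training. The following is a minimal sketch, assuming `ffprobe` (part of FFmpeg) is installed and on the PATH; the helper names `probe_video_stream` and `check_stream` are hypothetical and not part of SyncTalk. It does not check the single-face or audio-quality requirements.

```python
import json
import subprocess
from fractions import Fraction


def probe_video_stream(path: str) -> dict:
    """Return ffprobe's width/height/frame-rate info for the first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=width,height,avg_frame_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)["streams"][0]


def check_stream(stream: dict, size: int = 512, fps: int = 25) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if (stream["width"], stream["height"]) != (size, size):
        problems.append(
            f"resolution is {stream['width']}x{stream['height']}, expected {size}x{size}"
        )
    # ffprobe reports the rate as a fraction string such as "25/1" or "30000/1001"
    rate = Fraction(stream["avg_frame_rate"])
    if rate != fps:
        problems.append(f"frame rate is {float(rate):.3f} fps, expected {fps}")
    return problems


# Example with a hand-written stream description (no video file needed):
print(check_stream({"width": 1080, "height": 1080, "avg_frame_rate": "30/1"}))
```

With a real file, the call would be `check_stream(probe_video_stream("data/head_sara/head_sara.mp4"))`, using the path from the example later in this thread.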

yyjjww commented 2 months ago

But when I run inference with the official May model, there is no face either.

https://github.com/user-attachments/assets/27dde01b-193f-49b8-bfef-9f734ab56148

lqql8790229 commented 2 months ago

I don't have this problem on my side.

AyushUnleashed commented 2 months ago

But I used the official may model for inference and there was no face

ngp_ep0019.mp4

@yyjjww Here's the solution: make sure you do all the steps.

I had the same issue. In my case, the problem was that I was not doing all the steps and my input video was not 25 fps (you can use an online converter for that).
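As a local alternative to an online converter, the frame rate can be changed with FFmpeg. This is a sketch under assumptions, not a step from the SyncTalk documentation: it assumes `ffmpeg` is installed and uses FFmpeg's `fps` video filter, which drops or duplicates frames to reach the target rate. The function names and the file names in the usage line are hypothetical.

```python
import shutil
import subprocess


def build_fps_command(src: str, dst: str, fps: int = 25) -> list[str]:
    """Build an ffmpeg command that re-encodes the video at `fps`
    and copies the audio stream unchanged."""
    return ["ffmpeg", "-i", src, "-vf", f"fps={fps}", "-c:a", "copy", dst]


def convert_to_fps(src: str, dst: str, fps: int = 25) -> None:
    """Run the conversion; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_fps_command(src, dst, fps), check=True)


# Show the command without running it (file names are placeholders):
print(" ".join(build_fps_command("input_30fps.mp4", "output_25fps.mp4")))
```

Because frames are dropped rather than blended, a 30 fps source may still show the slight jitter described at the top of this thread.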

There are 3 steps.

  1. Processing of the video
  2. Training the model
  3. Final inference.

I was confusing processing with training, and that's why I had this problem.

Example from my case: I created a folder named 'head_sara'. The video I used, head_sara.mp4, is a 25 fps, 24-second, 1080x1080 video of a face. (I think I might get a better result with 512x512.)

If you don't use a 25 fps video, you'll get an error in the training step.

Process the video

python data_utils/process.py data/head_sara/head_sara.mp4 --asr ave

Train the model from the processed video

python main.py data/head_sara --workspace model/head_sara -O --iters 60000 --asr_model ave
python main.py data/head_sara --workspace model/head_sara -O --iters 100000 --finetune_lips --patch_size 64 --asr_model ave

Inference with audio

python main.py data/head_sara --workspace model/head_sara -O --test --test_train --asr_model ave --portrait --aud data/head_sara/head_sara.wav
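The processing, training, and inference steps above can be collected into one script so that none is skipped. The sketch below only rebuilds the exact commands quoted in this comment for a given folder name; the `data/<name>/<name>.mp4` and `.wav` layout is an assumption generalized from the `head_sara` example, and `build_pipeline` is a hypothetical helper, not part of SyncTalk.

```python
import shlex


def build_pipeline(name: str) -> list[list[str]]:
    """Return the processing, training, and inference commands, in order."""
    data = f"data/{name}"
    workspace = f"model/{name}"
    main = ["python", "main.py", data, "--workspace", workspace, "-O"]
    return [
        # Step 1: process the video
        ["python", "data_utils/process.py", f"{data}/{name}.mp4", "--asr", "ave"],
        # Step 2: train, then fine-tune the lips
        main + ["--iters", "60000", "--asr_model", "ave"],
        main + ["--iters", "100000", "--finetune_lips",
                "--patch_size", "64", "--asr_model", "ave"],
        # Step 3: inference with audio
        main + ["--test", "--test_train", "--asr_model", "ave",
                "--portrait", "--aud", f"{data}/{name}.wav"],
    ]


for command in build_pipeline("head_sara"):
    print(shlex.join(command))
```

To execute the steps rather than print them, each command could be passed to `subprocess.run(command, check=True)` from the repository root, so that a failure in processing stops the run before training starts.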