zhanglonghao1992 / One-Shot_Free-View_Neural_Talking_Head_Synthesis

PyTorch implementation of the paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

The result.mp4 generated by running demo.py is a single still image played for a few seconds; it doesn't move #8

Open · Vijayue opened this issue 2 years ago

Vijayue commented 2 years ago

Hi, thanks for open-sourcing this!

I cut a few hundred clips from VoxCeleb2 for training and saved the resulting model.

When I run demo.py with that model, the generated MP4 is a single still image played for a few seconds; the person doesn't move at all. Do you have any idea what might cause this?

zhanglonghao1992 commented 2 years ago

Did you preprocess the driving video with crop-video.py first? It would be best if you upload your source, driving, and result so I can take a look.
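(Side note for reproducibility: crop-video.py here appears to follow the script of the same name in first-order-model, which this codebase builds on; it detects the face region in a clip and prints suggested ffmpeg crop commands. The session below is a hedged illustration; the filenames and the printed crop values are made up, not output from this exact repo.)

```bash
# Scan the raw driving clip for a stable face region (illustrative filename):
python crop-video.py --inp driving_raw.mp4

# The script prints ffmpeg commands along these lines; running the suggested
# one yields a face-centered 256x256 clip suitable for demo.py:
ffmpeg -i driving_raw.mp4 -ss 0.0 -t 5.0 \
  -filter:v "crop=512:512:128:64, scale=256:256" driving_crop.mp4
```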

Vijayue commented 2 years ago

I hadn't processed it with crop_video.py before, but even after processing there still seems to be no change.

source_image:

(image attachment)

driving_video:

https://user-images.githubusercontent.com/41794907/132471510-096a5a9f-636e-47f0-bfbf-246ba5452919.mp4

result_video:

https://user-images.githubusercontent.com/41794907/132471770-6ca4c503-d505-4a35-b693-44d20a96fc7e.mp4

zhanglonghao1992 commented 2 years ago

@Vijayue This is what I got with your source image and driving video:

https://user-images.githubusercontent.com/17874285/132477992-2703d2db-503e-4e4d-9041-a33d111b2d25.mp4

set yaw = -30:

https://user-images.githubusercontent.com/17874285/132480069-e98c4ebc-348d-44e1-868e-c240d693efb4.mp4

Vijayue commented 2 years ago

Update: the extracted keypoints_driving are not wrong after all. I found the bug is in the generator: it can't produce different results for different kp_driving. I tried feeding different kp_driving into the generator but got the same deformed image every time.

Thank you for your test!

I found that the keypoints_driving extracted from different driving frames are identical (or very close). I'm now trying to fix this. My guess is that the trained model is not good enough to extract correct keypoints from different frames; maybe the training dataset is too small, since I only used about a hundred videos.
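(For anyone hitting the same symptom: a quick way to localize this kind of failure is to test the two stages separately. The sketch below is a hypothetical diagnostic assuming first-order-model-style interfaces, i.e. a kp_detector returning a dict with a 'value' tensor and a generator returning a dict with a 'prediction' tensor; names like driving_frame_a are placeholders, and the exact keys may differ in this repo.)

```python
import torch

# Hypothetical sanity checks, assuming first-order-model-style interfaces:
#   kp_detector(frame)                    -> {'value': (B, K, 3), ...}
#   generator(src, kp_source, kp_driving) -> {'prediction': (B, 3, H, W)}
# driving_frame_a/b, source, kp_source stand in for loaded tensors.

with torch.no_grad():
    # 1) Do the detected keypoints actually vary across driving frames?
    kp_a = kp_detector(driving_frame_a)['value']
    kp_b = kp_detector(driving_frame_b)['value']
    print('keypoint spread across frames:', (kp_a - kp_b).abs().mean().item())

    # 2) Does the generator respond to a change in kp_driving at all?
    kp_driving = kp_detector(driving_frame_a)
    out_ref = generator(source, kp_source=kp_source, kp_driving=kp_driving)

    kp_shifted = {k: (v.clone() if torch.is_tensor(v) else v)
                  for k, v in kp_driving.items()}
    kp_shifted['value'] = kp_shifted['value'] + 0.1  # deliberate perturbation
    out_pert = generator(source, kp_source=kp_source, kp_driving=kp_shifted)

    print('generator sensitivity:',
          (out_ref['prediction'] - out_pert['prediction']).abs().mean().item())

# A near-zero number in (1) points at the keypoint detector;
# a near-zero number in (2) points at the generator / dense-motion branch.
```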

panzhang0104 commented 2 years ago

I trained the model on all of VoxCeleb, and the output still seems to match the source image rather than the driving image. (image attachment)

Vijayue commented 2 years ago

@panzhang0104 I have solved the problem, and it turned out not to be a keypoints problem after all. In the early epochs the keypoints are simply not yet accurate; you could try training for longer.

panzhang0104 commented 2 years ago

Many Thanks!

Vijayue commented 2 years ago

@zhanglonghao1992 Excuse me, I have experimented many times with different hyperparameters, but some losses plateau in every run, and training is really hard. Would you please share your training log for reference? Thanks a lot.

DWCTOD commented 2 years ago

> @panzhang0104 I have solved the problem, and it turned out not to be a keypoints problem after all. In the early epochs the keypoints are simply not yet accurate; you could try training for longer.

Hello, how long did you have to train before it worked? I've trained for about 150 epochs on roughly 400 videos (nearly a week now) and the result still doesn't move.

panzhang0104 commented 2 years ago

I didn't train on whole videos; I extracted frames, about 2 million images, and trained for one epoch.

vicdxxx commented 2 years ago

> @Vijayue This is what I got with your source image and driving video:
>
> concat-18.mp4
>
> set yaw = -30:
>
> yaw_-30.mp4

I used your beta model but can't get performance as good as yours. Is it a model or config mismatch?

`python demo.py --result_video data/result.mp4 --config config/vox-256.yaml --checkpoint checkpoint/15kp-ep119.pth.tar --source_image data/man.png --driving_video data/man.mp4 --relative --adapt_scale --find_best_frame`

https://user-images.githubusercontent.com/14140837/140889457-99387295-36fc-4565-ab22-25c6a52d5a1a.mp4

zhanglonghao1992 commented 2 years ago

@vicdxxx Make sure that the following functions in demo.py and animate.py belong to the beta version

  1. get_rotation_matrix()
  2. keypoint_transformation()
  3. normalize_kp()

I just commented out (#) the above functions in the new version.
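(For context on what these functions do: below is a minimal, hypothetical sketch of the kind of computation get_rotation_matrix() and keypoint_transformation() perform, composing a rotation matrix from Euler angles and applying it to the 3D keypoints. The axis order, angle convention, and signatures are assumptions for illustration, not the repo's actual code; this is also how a free-view override like the yaw = -30 result above can be expressed.)

```python
import math
import torch

def rotation_matrix_sketch(yaw_deg, pitch_deg, roll_deg):
    # Illustrative only; the repo's get_rotation_matrix() may use a
    # different axis order or angle convention.
    y, p, r = (math.radians(a) for a in (yaw_deg, pitch_deg, roll_deg))
    Ry = torch.tensor([[ math.cos(y), 0., math.sin(y)],
                       [ 0.,          1., 0.         ],
                       [-math.sin(y), 0., math.cos(y)]])   # yaw about y-axis
    Rp = torch.tensor([[1., 0.,           0.          ],
                       [0., math.cos(p), -math.sin(p)],
                       [0., math.sin(p),  math.cos(p)]])   # pitch about x-axis
    Rr = torch.tensor([[math.cos(r), -math.sin(r), 0.],
                       [math.sin(r),  math.cos(r), 0.],
                       [0.,           0.,          1.]])   # roll about z-axis
    return Ry @ Rp @ Rr

def rotate_keypoints(kp, yaw_deg, pitch_deg, roll_deg):
    # Conceptually what keypoint_transformation() does to the canonical
    # 3D keypoints: kp has shape (K, 3), rotated as x' = R x.
    R = rotation_matrix_sketch(yaw_deg, pitch_deg, roll_deg)
    return kp @ R.T

kp = torch.randn(15, 3)                       # e.g. 15 keypoints, as in 15kp-ep119
kp_yaw = rotate_keypoints(kp, -30., 0., 0.)   # the "set yaw = -30" case above
```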

vicdxxx commented 2 years ago

> @vicdxxx Make sure that the following functions in demo.py and animate.py belong to the beta version:
>
> 1. get_rotation_matrix()
> 2. keypoint_transformation()
> 3. normalize_kp()
>
> I just commented out (#) the above functions in the new version.

I used the beta version of these functions, and it did work, thanks! Could I ask what the major differences are between your new version and the beta-version functions? And what do you think about pushing from 256x256 to 512x512 or larger resolutions?

https://user-images.githubusercontent.com/14140837/140904658-476ad0bd-dc4a-44ba-8a61-1b0f35fe1d1d.mp4

zhanglonghao1992 commented 2 years ago

@vicdxxx You can refer to the [update] notes in the README to follow the new version.

Vijayue commented 2 years ago

@tyrink Sorry, I've forgotten what the output should look like early in training; in theory, shouldn't it be fairly similar to the source image?

As for the source not being driven by the driving video, I now think it may be related to training time, though there could be other causes as well; it's hard to judge.

The author has been super nice and keeps updating the models, so why not use the checkpoints the author shared? 🤔🤔