lelechen63 / Talking-head-Generation-with-Rhythmic-Head-Motion

basic information about test demo #3

Closed Adorablepet closed 3 years ago

Adorablepet commented 3 years ago

Hi, thanks for sharing your code. This is a very good project. Could you provide a requirements.txt so that others can set up the virtual environment? When I ran test_demo_ani.py, I encountered two problems:

  1. In loss_collector.py: ImportError: cannot import name 'msssim' from 'pytorch_msssim'. I changed the import to from pytorch_msssim import ms_ssim.
  2. In keypoint2img.py: 'numpy.float64' object cannot be interpreted as an integer. I modified line 311 to curve_x = np.linspace(int(x[0]), int(x[-1]), (int(x[-1])-int(x[0]))) (see the sketch just below).
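
For reference, here is a minimal, self-contained sketch of that second fix (the example coordinates are made up; the point is only that np.linspace needs an integer sample count):

import numpy as np

# keypoint2img.py, around line 311: the third argument of np.linspace (the number
# of samples) must be an integer, so cast the float64 keypoint coordinates first.
x = np.array([12.7, 45.3, 88.9])        # example keypoint x-coordinates (float64)
x0, x1 = int(x[0]), int(x[-1])
curve_x = np.linspace(x0, x1, x1 - x0)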

My setup was:

cd Talking-head-Generation-with-Rhythmic-Head-Motion
mkdir -p checkpoints/face8_vox_demo

The pretrained weights should be placed in the directory ./checkpoints/face8_vox_demo.

The full test script is shown below:

CUDA_VISIBLE_DEVICES=[CUDA Ids] python test_demo_ani.py \
--name face8_vox_demo \
--dataset_mode facefore_demo \
--adaptive_spade \
--warp_ref \
--warp_ani \
--add_raw_loss \
--spade_combine \
--example \
--n_frames_G 1 \
--which_epoch latest \
--how_many 50 \
--nThreads 0 \
--dataroot 'demo' \
--ref_img_id "0" \
--n_shot 8 \
--serial_batches \
--dataset_name vox \
--crop_ref \
--use_new

Videos were generated in the directories ./extra_degree_result_ani/face8_vox_demo/8_shot_epoch_latest/00181_aligned and ./extra_degree_result_ani/face8_vox_demo/8_shot_epoch_latest/00201_aligned.

After watching the generated videos, I did not notice any problems. Is anything wrong with the modifications I made?

How can I test my own video? Are there any requirements on the video size? Does the generated result also include the background? And if the output is an mp4, how can I save it as individual images? Looking forward to your reply, thanks.

dipam7 commented 3 years ago

I'm getting the following error:

ModuleNotFoundError: No module named 'data.fewshot_pose_dataset'

Is the file not uploaded? Could you also post the versions of all the libraries used, or a requirements.txt? That would be really helpful.

rainingDesert commented 3 years ago

Thanks for sharing your experiment with us, and thanks for your suggestion about a requirements.txt. We will work on that later.

For the pytorch_msssim error, I double-checked the import and only found msssim rather than ms_ssim; this is probably a version difference, and the two should behave the same. In any case, loss_collector.py is only used during training to compute losses, so it does not affect test results. For the problem in keypoint2img.py, I think you are right, but I did not run into it in my experiments.
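
If it helps others hitting the same import error, a guarded import is one way to cover both package versions (just a sketch; the two functions' calling conventions may differ slightly, so double-check the arguments used in loss_collector.py):

# loss_collector.py: tolerate both the older and the newer pytorch_msssim API.
try:
    from pytorch_msssim import msssim              # older releases expose `msssim`
except ImportError:
    from pytorch_msssim import ms_ssim as msssim   # newer releases expose `ms_ssim`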

To test your own video, several required files need to be placed in the directory specified by the '--dataroot' flag; you can find the list of required files in the README. As for the video size, we resize images to 256x256 in the dataloader, so it would be better if you crop your frames to that size. During training, background information is extracted automatically by the model from the reference image, and the warped image also guides the background part of the final synthesis step. In the result directory we save both images and a video; if you want to change how they are saved, see lines 138-172 in test_demo_ani.py.
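
As a rough illustration only (this is not the project's cropping code, just a naive OpenCV example with a hypothetical input path), you could center-crop and resize your frames to 256x256 before testing:

import cv2

def crop_resize_frame(frame, size=256):
    # Naive preprocessing: center-crop to a square, then resize to size x size
    # (the dataloader works at 256x256).
    h, w = frame.shape[:2]
    s = min(h, w)
    top, left = (h - s) // 2, (w - s) // 2
    square = frame[top:top + s, left:left + s]
    return cv2.resize(square, (size, size), interpolation=cv2.INTER_AREA)

cap = cv2.VideoCapture('input.mp4')        # hypothetical input video
fps = cap.get(cv2.CAP_PROP_FPS) or 25
writer = cv2.VideoWriter('input_256.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (256, 256))
while True:
    ok, frame = cap.read()
    if not ok:
        break
    writer.write(crop_resize_frame(frame))
cap.release()
writer.release()

To dump an existing mp4 back to individual images, a plain ffmpeg call such as ffmpeg -i result.mp4 frames/%05d.png (with the frames directory created beforehand) also works.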

Hope this helps with your experiments. Please contact me if you need more information.

rainingDesert commented 3 years ago

Thanks for reaching out to us.

That error is caused by the default value of the '--dataset_mode' flag, which specifies the name of the dataloader file. Several flags, including this one, should be set explicitly; you can check them in train_g8.sh or test_demo.sh. We will also fix the default value in our code.
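
As a concrete illustration (assuming the usual convention, which the error message suggests, that --dataset_mode X is resolved to data/X_dataset.py):

# Sketch of how the flag is presumably resolved to a dataloader module.
import importlib

dataset_mode = 'facefore_demo'                         # value passed via --dataset_mode
dataset_module = importlib.import_module('data.%s_dataset' % dataset_mode)
# The default mode resolves to data.fewshot_pose_dataset, which is missing here,
# hence the ModuleNotFoundError; the demo scripts pass facefore_demo instead.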

Also, thanks for your suggestion; we will work on the requirements file later. Please contact us if you need more information.

Adorablepet commented 3 years ago

@rainingDesert Thanks, I will try. If I run into problems during testing, I will continue to ask questions. This is a very good project.

Adorablepet commented 3 years ago

@lelechen63 @rainingDesert The resolution of the video I tested is 1280x720 (00001.mp4). When I run single_video_preprocess.py, get_3d.py, and find_camera.py, the following files are generated:

00001__00384.png   00001__crop.mp4   00001_ani.mp4   00001__front.npy   00001__original.npy  00001__original.obj  00001__prnet.npy   00001__rt.npy

I then copied the required files into the directory specified by the '--dataroot' flag, as follows:

cp 00001__00384.png 00001_00384.png
cp 00001__crop.mp4 00001_aligned.mp4
cp 00001__ani.mp4 00001_aligned_ani.mp4
cp 00001__rt.npy 00001_aligned_rt.npy
cp 00001__front.npy 00001_aligned_front.npy
cp 00001__original.npy 00001_aligned.npy

My audio is 00001.wav (35 s), but the generated video is 38 s long (at 256x256). 00001.wav is an arbitrary audio file, not the audio extracted from the original video. Could you tell me which step might be causing this? Also, if I want the output at the same 1280x720 resolution as the source video, with the video's background included, what do I need to do? And if I input an arbitrary voice and a video of about 3 s, can a head-motion video driven by that arbitrary voice be generated? Thanks.

Adorablepet commented 3 years ago

@rainingDesert @lelechen63 vox myvideo

This is a comparison between the video file you provided (vox) and the frames generated from my own video (myvideo). My video is in Chinese, and there seems to be a big gap between my result and yours. Does this have a lot to do with the language? Can you help me figure out where I went wrong? Thanks.

lelechen63 commented 3 years ago

The black region is caused by a dataset issue: the dataset we trained on does not include the upper part of the head (the hair region). So your cropping is not correct; you need to use the cropping function provided in the data folder. If you want to test on the full image, you need to retrain the network. Currently, our model will not generate the upper head region.