Some warnings even though it completes the output, poor quality1

ghost commented 4 years ago

I've tried many variations of videos and source images. All images are 256x256; larger images usually give worse results (which is expected, I guess). The videos range in size and type of expression; some are from TikTok, some from Giphy, etc.

PC hardware: Windows 10 64-bit (version 2004), i7-8700K, 32 GB RAM, RTX 2070

I get these warnings when it's running. The output quality isn't great, sometimes it's not even doing anything. Example:

Source Video 1: https://media.giphy.com/media/j71DSjhL7YQMw/giphy.mp4
Source Image 1: https://otb.cachefly.net/wp-content/uploads/2016/11/Donald-Trump-256x256.jpg
Sample output 1 (result) : https://imgur.com/bytUkQm
Source Video 2: https://media.giphy.com/media/ZMrYlsQXqkxbO/giphy.mp4
Source Image 2: https://res-3.cloudinary.com/crunchbase-production/image/upload/c_thumb,h_256,w_256,f_auto,g_faces,z_0.7,q_auto:eco/f0fizqoph7jt1bhmq3iy
Sample output 2 (result) : https://imgur.com/2fZVpki

Steps I'm taking:

1. python crop-video.py --inp driving_video/1.mp4
2. Run the ffmpeg command that it "recommends" on the command line to crop the video
3. python demo.py --config config/vox-adv-256.yaml --driving_video driving_video/crop.mp4 --source_image source_image/1.jpg --checkpoint fom_checkpoints/vox-adv-cpk.pth.tar --relative --adapt_scale

Console results:

python demo.py --config config/vox-adv-256.yaml --driving_video driving_video/crop.mp4 --source_image source_image/1.jpg --checkpoint fom_checkpoints/vox-adv-cpk.pth.tar --relative --adapt_scale
demo.py:27: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config = yaml.load(f)
C:\Users\XXXX\anaconda3\lib\site-packages\torch\nn\functional.py:3000: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and uses scale_factor directly, instead of relying on the computed output size. If you wish to keep the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
  warnings.warn("The default behavior for interpolate/upsample with float scale_factor changed "
  0%|                                                                                           | 0/14 [00:00<?, ?it/s]C:\Users\XXXX\anaconda3\lib\site-packages\torch\nn\functional.py:3384: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
  warnings.warn("Default grid_sample and affine_grid behavior has changed "
C:\Users\XXXX\anaconda3\lib\site-packages\torch\nn\functional.py:3118: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  warnings.warn("Default upsampling behavior when mode={} is changed "
C:\Users\XXXX\anaconda3\lib\site-packages\torch\nn\functional.py:1625: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
100%|██████████████████████████████████████████████████████████████████████████████████| 14/14 [00:00<00:00, 17.66it/s]

bored-guy commented 4 years ago

Your source video 1 is too exaggerated for the program to recognize where the mouth and eyes are. Your source video 2 is not clear enough. By the way, I got the same warnings, but they don't seem to affect the quality.
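If the warnings are only noise, as suggested above, they can be filtered by message text with Python's standard `warnings` module. This is a minimal sketch, not part of the repository: the function name `silence_known_warnings` is hypothetical, and the message prefixes are copied from the console output in this thread, so they may differ in other PyTorch or PyYAML versions.

```python
import re
import warnings

# Message prefixes copied from the console output in this thread.
KNOWN_MESSAGES = [
    "calling yaml.load() without Loader",
    "The default behavior for interpolate/upsample",
    "Default grid_sample and affine_grid behavior",
    "Default upsampling behavior",
    "nn.functional.sigmoid is deprecated",
]


def silence_known_warnings():
    """Ignore only the warnings listed above; everything else still shows."""
    for prefix in KNOWN_MESSAGES:
        # `message` is a regex matched against the start of the warning text,
        # so the literal prefix is escaped first.
        warnings.filterwarnings("ignore", message=re.escape(prefix))
```

Calling `silence_known_warnings()` once near the top of a script, before the model runs, should be enough. Filtering by message rather than by category keeps unrelated warnings visible.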

AliaksandrSiarohin commented 4 years ago

The image and videos need to be cropped so that the face occupies the same amount of space as in the examples. @bored-guy is right.
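The cropping advice can be sketched as a square box around the face plus an ffmpeg filter string. This is an illustration under stated assumptions, not the logic of `crop-video.py`: the face box is assumed to come from any face detector, the helper names are hypothetical, and the `margin` value is a guess to tune by comparing against the example videos.

```python
def square_crop(face_box, frame_size, margin=0.5):
    """Return (x, y, side) of a square crop centred on the face.

    face_box   -- (left, top, right, bottom) in pixels, from any face detector
    frame_size -- (width, height) of the frame in pixels
    margin     -- extra space on each side, as a fraction of the face size
                  (hypothetical default; tune against the example videos)
    """
    left, top, right, bottom = face_box
    frame_w, frame_h = frame_size
    face_size = max(right - left, bottom - top)
    # Grow the box by the margin, but never beyond the frame itself.
    side = min(round(face_size * (1 + 2 * margin)), frame_w, frame_h)
    centre_x = (left + right) // 2
    centre_y = (top + bottom) // 2
    # Clamp the top-left corner so the whole square stays inside the frame.
    x = min(max(centre_x - side // 2, 0), frame_w - side)
    y = min(max(centre_y - side // 2, 0), frame_h - side)
    return x, y, side


def ffmpeg_crop_filter(x, y, side, out_size=256):
    """Build an ffmpeg -vf value that crops the square and rescales it."""
    return f"crop={side}:{side}:{x}:{y},scale={out_size}:{out_size}"
```

The resulting string can be passed to ffmpeg as the value of `-vf`. Applying the same margin to both the source image and the driving video keeps the face at a similar scale in each.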

zhaochunhui-0723 commented 1 year ago

Hello, my final output result.mp4 has no sound. Is that normal?
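The thread does not answer this question. One plausible explanation, stated here as an assumption, is that the demo writes only the generated frames, so any audio in the driving video is not carried over. If so, the audio could be copied back with ffmpeg afterwards. The sketch below only builds and runs such a command; the function names and file paths are hypothetical, and ffmpeg must be installed and on the PATH.

```python
import subprocess


def build_mux_command(video_path, audio_source_path, output_path):
    """Build an ffmpeg command that keeps the video and adds the other file's audio."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,          # input 0: generated video (no sound)
        "-i", audio_source_path,   # input 1: driving video that has the audio
        "-map", "0:v:0",           # take the video stream from input 0
        "-map", "1:a:0?",          # take audio from input 1; "?" tolerates a missing track
        "-c:v", "copy",            # do not re-encode the video
        "-c:a", "aac",             # encode the audio as AAC for the mp4 container
        "-shortest",               # stop at the end of the shorter stream
        output_path,
    ]


def mux_audio(video_path, audio_source_path, output_path):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_mux_command(video_path, audio_source_path, output_path), check=True)
```

For example, `mux_audio("result.mp4", "driving_video/crop.mp4", "result_with_audio.mp4")` would write a new file and leave the original untouched.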

AliaksandrSiarohin / first-order-model

Some warnings even though it completes the output, poor quality1 #259