Closed anguoKuang closed 3 years ago
Buddy, when you ask a question, could you provide a bit more detail (the command you ran, the configuration, the parameters, the source video, etc.)? With nothing to go on, how are we supposed to help you figure out the cause...
1.1 Preprocessing, running Preprocessor to detect the human boxes of ../results\primitives\donald_trump_2\processed\orig_images...
100%|██████████| 1/1 [01:13<00:00, 73.53s/it]
1.1 Preprocessing, finish detect the human boxes of ../results\primitives\donald_trump_2\processed\orig_images ...
1.2 Preprocessing, cropping all images in ../results\primitives\donald_trump_2\processed\orig_images by estimated boxes ...
1it [00:01, 1.90s/it]
0%| | 0/1 [00:00<?, ?it/s] 1.2 Preprocessing, finish crop the human by boxes, and save them in ../results\primitives\donald_trump_2\processed\images ...
1.3 Preprocessing, running Preprocessor to 3D pose estimation of all images in../results\primitives\donald_trump_2\processed\images ...
100%|██████████| 1/1 [00:09<00:00, 9.46s/it]
1.3 Preprocessing, finish 3D pose estimation successfully ....
1.4 Preprocessing, running Preprocessor to find 25 candidates front images in ../results\primitives\donald_trump_2\processed\images ...
0%| | 0/1 [00:00<?, ?it/s] 1.4 Preprocessing, finish find the front images ....
100%|██████████| 1/1 [00:00<00:00, 1.04it/s]
1.5 Preprocessing, running Preprocessor to run human matting in ../results\primitives\donald_trump_2\processed\parse ...
0%| | 0/1 [00:00<?, ?it/s]A:\Anaconda\envs\IPERcore-main\lib\site-packages\torch\nn\functional.py:3000: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and uses scale_factor directly, instead of relying on the computed output size. If you wish to keep the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
warnings.warn("The default behavior for interpolate/upsample with float scale_factor changed "
A:\Anaconda\envs\IPERcore-main\lib\site-packages\mmedit\models\common\gca_module.py:244: UserWarning: Mixed memory format inputs detected while calling the operator. The operator will output contiguous tensor even if some of the inputs are in channels_last format. (Triggered internally at ..\aten\src\ATen\native\TensorIterator.cpp:918.)
out = out + self_mask * unknown_ps
1.5 Preprocessing, finish run human matting.
100%|██████████| 1/1 [00:00<00:00, 1.01it/s]
1.6 Preprocessing, running Preprocessor to run background inpainting ...
0%| | 0/1 [00:00<?, ?it/s] 1.6 Preprocessing, finish run background inpainting ....
100%|██████████| 1/1 [00:00<00:00, 1.43it/s]
1.7 Preprocessing, saving visualization to ../results\primitives\donald_trump_2\processed\visual.mp4 ...
A:\py_workspace\iPERCore-main2\iPERCore-main\iPERCore\tools\utils\visualizers\smpl_visualizer.py:57: UserWarning: Mixed memory format inputs detected while calling the operator. The operator will output channels_last tensor even if some of the inputs are not in channels_last format. (Triggered internally at ..\aten\src\ATen\native\TensorIterator.cpp:924.)
masked_img = imgs * (1 - sil) + rd_imgs * sil
100%|██████████| 1/1 [00:00<00:00, 3.87it/s]
../assets/executables/ffmpeg-4.3.1-win64-static/bin/ffmpeg.exe -y -i ../results\primitives\donald_trump_2\processed\visual.mp4.avi -vcodec h264 ../results\primitives\donald_trump_2\processed\visual.mp4 -loglevel quiet
1.7 Preprocessing, saving visualization to ../results\primitives\donald_trump_2\processed\visual.mp4 ...
Preprocessor has finished...
../assets/samples/references/akun_2.mp4 Writing frames to file
../assets/executables/ffmpeg-4.3.1-win64-static/bin/ffmpeg.exe -i ../assets/samples/references/akun_2.mp4 -start_number 0 ../results\primitives\akun_2\processed\orig_images/frame%08d.png
ffmpeg version 4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 10.2.1 (GCC) 20200726
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libgsm --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '../assets/samples/references/akun_2.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.45.100
Duration: 00:00:07.34, start: 0.000000, bitrate: 1673 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 1543 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 127 kb/s (default)
Metadata:
handler_name : SoundHandler
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> png (native))
Press [q] to stop, [?] for help
Output #0, image2, to '../results\primitives\akun_2\processed\orig_images/frame%08d.png':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.45.100
Stream #0:0(eng): Video: png, rgb24, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn, 30 tbc (default)
Metadata:
handler_name : VideoHandler
encoder : Lavc58.91.100 png
frame= 219 fps= 49 q=-0.0 Lsize=N/A time=00:00:07.30 bitrate=N/A speed=1.64x
video:106657kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
1.1 Preprocessing, running Preprocessor to detect the human boxes of ../results\primitives\akun_2\processed\orig_images...
100%|█████████▉| 218/219 [00:31<00:00, 6.99it/s] 1.1 Preprocessing, finish detect the human boxes of ../results\primitives\akun_2\processed\orig_images ...
100%|██████████| 219/219 [00:31<00:00, 6.91it/s]
1.2 Preprocessing, cropping all images in ../results\primitives\akun_2\processed\orig_images by estimated boxes ...
219it [00:04, 50.48it/s]
0%| | 0/7 [00:00<?, ?it/s] 1.2 Preprocessing, finish crop the human by boxes, and save them in ../results\primitives\akun_2\processed\images ...
1.3 Preprocessing, running Preprocessor to 3D pose estimation of all images in../results\primitives\akun_2\processed\images ...
100%|██████████| 7/7 [00:12<00:00, 1.80s/it]
1.3 Preprocessing, finish 3D pose estimation successfully ....
Preprocessor has finished...
Pre-processing: digital deformation start...
0%| | 0/1 [00:01<?, ?it/s]
Pre-processing: digital deformation completed...
the current number of sources are 1, while the pre-defined number of sources are 2.
Pre-processing: successfully...
Step 2: running personalization on
Step 3: running imitator done.
Process finished with exit code 0
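As an aside, the two "Mixed memory format inputs detected" UserWarnings in the log are benign; they are triggered by elementwise compositing ops such as the silhouette blend in `smpl_visualizer.py` (`masked_img = imgs * (1 - sil) + rd_imgs * sil`). A minimal plain-Python sketch of what that blend does per pixel (illustrative only, not the project's actual tensor code):

```python
def composite(img_px, rendered_px, sil):
    """Blend an original pixel with a rendered SMPL pixel using a
    silhouette mask value sil in [0, 1]: where sil == 1 the rendered
    pixel wins, where sil == 0 the original background is kept."""
    return img_px * (1 - sil) + rendered_px * sil

# Inside the silhouette the rendered value is used,
# outside it the original image shows through,
# and soft mask edges are blended linearly.
```

The warning just means PyTorch had to pick a memory layout when the two operands disagreed; it does not affect the numeric result.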
The source I used was that single picture of Trump. Below is a screenshot of the result; it's very blurry. Could someone point out where the problem is?
@anguoKuang, results at the 256 x 256 scale will be slightly blurry. On one hand, at 256 the face occupies only a small fraction of the whole image, so the face region gets a bit blurred; on the other hand, the pretrained base model was trained at 512 x 512. If conditions allow (your GPU has enough memory), switch to 512 or an even higher scale.
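To put rough numbers on the first point (an illustrative calculation, assuming the face spans about 15% of the frame in each dimension; the fraction is an assumption, not a measured value): halving the image scale quarters the face's pixel budget, which is why the face degrades first at 256 x 256.

```python
def face_pixels(image_size, face_frac=0.15):
    """Approximate pixel area of a face region spanning `face_frac`
    of the image in each dimension (illustrative assumption)."""
    side = int(image_size * face_frac)
    return side * side

# At 256 the face covers roughly 38 x 38 = 1444 pixels;
# at 512 it covers 76 x 76 = 5776, four times the detail budget.
```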
Thanks a lot! After raising it to 512 the quality is indeed much clearer. But the overall result still looks unnatural, and many of the details are blurred and distorted. Does the author have any method to improve this?
@anguoKuang The more accurately the source image matches the SMPL estimate, the better the result. If the body proportions in your source image differ greatly from a real person (e.g. cartoon characters, or someone wearing a skirt, with a big afro, etc.), the result will be poor. Also, if you mean the Trump example, that is already tuned about as well as possible, given there is only one photo.
If you want to make a demo that looks as good as possible, you need to provide photos covering as many viewpoints as possible (front, back, side, etc.).
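Relatedly, the earlier log line "the current number of sources are 1, while the pre-defined number of sources are 2" means only one usable source image was found for a `num_source=2` run. A common way to handle such a mismatch is to pad the source list by reusing the available images; the sketch below is a hypothetical helper to illustrate the idea, not iPERCore's actual code, which may behave differently:

```python
def pad_sources(sources, num_source):
    """Pad a list of source image paths up to num_source by cycling
    through the available ones (hypothetical helper, for illustration)."""
    if not sources:
        raise ValueError("need at least one source image")
    padded = list(sources)
    i = 0
    while len(padded) < num_source:
        padded.append(sources[i % len(sources)])
        i += 1
    return padded[:num_source]

# With one Trump photo and num_source=2, the single photo is reused,
# so the "second view" adds no new appearance information.
```

This is also why a single-photo run cannot match a multi-view run: repeating one image satisfies the source count but contributes no additional viewpoints.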