blurry inference result even if input video is 1080p

OpenTalker / video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

https://opentalker.github.io/video-retalking/

Apache License 2.0

6.36k stars 943 forks

blurry inference result even if input video is 1080p #23

Open GregoryZeng opened 1 year ago

GregoryZeng commented 1 year ago

Hi, this is an awesome project and the lipsync is really good.

I’ve encountered a problem: if I use a high-resolution video (1920×1080) as input, the whole output video is blurry (not just the face area), even though the output resolution is also 1080p. It looks as if the output was upscaled from a low-resolution video.

Based on my understanding of the paper, the generated talking face is pasted back onto the original video. So I wonder if this global blurriness is normal…
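To make the expectation above concrete, here is a minimal sketch of what "pasting back" means in general. It is a toy illustration, not the project's code: the helper names `paste_back` and `changed_pixels` are hypothetical, and frames are plain nested lists rather than the NumPy arrays a real pipeline would use. If only the face box is written into the original frame, every pixel outside that box stays identical, so blur across the whole frame would point to a resize somewhere else in the pipeline.

```python
def paste_back(frame, face, top, left):
    """Return a copy of `frame` with `face` written into the box at (top, left)."""
    out = [row[:] for row in frame]  # copy rows so the original frame is untouched
    for dy, face_row in enumerate(face):
        out[top + dy][left:left + len(face_row)] = face_row
    return out


def changed_pixels(a, b):
    """List the (row, col) positions where two equally sized frames differ."""
    return [(y, x)
            for y, row in enumerate(a)
            for x, value in enumerate(row)
            if value != b[y][x]]


# Toy 6x8 grayscale "frame" with a distinct value per pixel,
# and a 2x3 generated "face" patch (hypothetical data).
frame = [[10 * y + x for x in range(8)] for y in range(6)]
face = [[255, 255, 255], [255, 255, 255]]

result = paste_back(frame, face, top=2, left=3)
diff = changed_pixels(frame, result)
```

In this sketch `diff` contains only positions inside the pasted box (rows 2 to 3, columns 3 to 5); everything else is bit-identical to the input.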

I used the command from the readme for inference. Not sure if there are other options I missed.

python3 inference.py \
  --face examples/face/1.mp4 \
  --audio examples/audio/1.wav \
  --outfile results/1_1.mp4

Thank you for your great repo.

kunncheng commented 1 year ago

Thanks for pointing that out. I will check the cause of this problem.

marcelgoya commented 1 year ago

I am experiencing the same problem. That's how it looks:

https://user-images.githubusercontent.com/3046751/233503155-f7c66d4a-e7d3-4773-8d52-43a6a99c749d.mp4

labels168 commented 1 year ago

Thanks for pointing that out. I will check the cause of this problem.

This happened to me too, and the surroundings turned into giant teeth. Please help solve the problem. Thank you very much!

xuguozhi commented 1 year ago

hobinson commented 1 year ago

The same problem

LittleTerry commented 1 year ago

same here

zhixingrenai commented 1 year ago

me too

henrycjh commented 1 year ago

same here @kunncheng

yyz845935161 commented 1 year ago

Me too. How can I improve the video quality?

xuelinghao commented 1 year ago

Same here. Even 720p videos run into this problem. I hope you can help solve it, thanks.

yyz845935161 commented 1 year ago

After a few days of hard work, I went through the code line by line and visualized every variable. The easiest fix is to replace lines 251-265 of [video-retalking/inference.py] with the following code and comment out all the other code in that range. This approach does not compress the picture; the restorer.enhance method enhances the image instead.

            cropped_faces, restored_faces, restored_img = restorer.enhance(
                ff, has_aligned=False, only_center_face=True, paste_back=True)
            out.write(restored_img)

Like this!

Inferencer commented 1 year ago

Can you share a result produced with that change, please? It would save me reinstalling.

labels168 commented 1 year ago

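After a few days of hard work, I went through the code line by line and visualized every variable. The easiest fix is to replace lines 251-265 of [video-retalking/inference.py] with the following code and comment out all the other code in that range. This approach does not compress the picture; the restorer.enhance method enhances the image instead.
            cropped_faces, restored_faces, restored_img = restorer.enhance(
                ff, has_aligned=False, only_center_face=True, paste_back=True)
            out.write(restored_img)
Like this!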

Can you provide the installer, or provide the code so we can update it? Thank you!

LittleTerry commented 1 year ago

After a few days of hard work, I went through the code line by line and visualized every variable. The easiest fix is to replace lines 251-265 of [video-retalking/inference.py] with the following code and comment out all the other code in that range. This approach does not compress the picture; the restorer.enhance method enhances the image instead.
            cropped_faces, restored_faces, restored_img = restorer.enhance(
                ff, has_aligned=False, only_center_face=True, paste_back=True)
            out.write(restored_img)
Like this!

Hello, I tried this and the code you provided does run, but the result shows a square around the face. Is there a way to improve this? Thanks!

nagaki09 commented 1 year ago

After a few days of hard work, I went through the code line by line and visualized every variable. The easiest fix is to replace lines 251-265 of [video-retalking/inference.py] with the following code and comment out all the other code in that range. This approach does not compress the picture; the restorer.enhance method enhances the image instead.
            cropped_faces, restored_faces, restored_img = restorer.enhance(
                ff, has_aligned=False, only_center_face=True, paste_back=True)
            out.write(restored_img)
Like this!

The sharpness problem is indeed solved, but a box appears around the face.
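Several replies report a square or box around the face after this change. As a generic illustration only (it is not an analysis of this repository, and the helpers `feather_weight` and `blend_patch` are hypothetical), the sketch below shows why a patch pasted with a hard edge tends to leave a visible rectangle when its brightness or sharpness differs from the surrounding frame, and how fading the patch out toward its border softens that seam. Frames are plain nested lists to keep the example dependency-free.

```python
def feather_weight(y, x, height, width, feather):
    """Weight in (0, 1]: small near the patch border, 1.0 once `feather` pixels inside."""
    edge_distance = min(y, x, height - 1 - y, width - 1 - x)
    return min(1.0, (edge_distance + 1) / (feather + 1))


def blend_patch(frame, patch, top, left, feather):
    """Alpha-blend `patch` into a copy of `frame`, fading it out toward its border.

    With feather=0 every weight is 1.0, which is a plain hard-edged paste.
    """
    out = [row[:] for row in frame]
    h, w = len(patch), len(patch[0])
    for y in range(h):
        for x in range(w):
            a = feather_weight(y, x, h, w, feather)
            out[top + y][left + x] = a * patch[y][x] + (1 - a) * frame[top + y][left + x]
    return out


frame = [[100.0] * 9 for _ in range(9)]  # flat background (toy data)
patch = [[200.0] * 5 for _ in range(5)]  # brighter 5x5 "face" patch (toy data)

hard = blend_patch(frame, patch, top=2, left=2, feather=0)  # abrupt 100 -> 200 step
soft = blend_patch(frame, patch, top=2, left=2, feather=2)  # gradual ramp at the border
```

With the hard paste, the value jumps straight from the background to the patch at the box edge. With feathering, the outermost patch pixels stay close to the background and the full patch value is only reached a few pixels inside, so the rectangular outline is far less visible.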

QuantJia commented 1 year ago

TalhaaaYaqoob commented 1 year ago

wangnan0610 commented 11 months ago

jedisun76cn commented 11 months ago

After days of hard work debugging and tracing the code line by line, I solved the issue by modifying just one line, as shown below:

pp, orig_faces, enhanced_faces = enhancer.process(pp, xf, bbox=c, face_enhance=True, possion_blending=False)

KartavyaBagga commented 11 months ago

After days of hard work debugging and tracing the code line by line, I solved the issue by modifying just one line, as shown below:
pp, orig_faces, enhanced_faces = enhancer.process(pp, xf, bbox=c, face_enhance=True, possion_blending=False)

I appreciate your hard work going through the code line by line. Did you find anything that could make inference for lip synthesis faster?

OmriCBT commented 8 months ago

After days of hard work debugging and tracing the code line by line, I solved the issue by modifying just one line, as shown below:
pp, orig_faces, enhanced_faces = enhancer.process(pp, xf, bbox=c, face_enhance=True, possion_blending=False)

Thank you! I can confirm this solves the blurry output. Personally, though, I don't like the face enhancement: it is too aggressive and changes the original face too much. If you want to avoid that as well, leave face_enhance as "False" and only change possion_blending to "False". The output still loses some of the original video's quality, but the box around the face is gone.