Madhu0820 opened 1 year ago
You should visualize the output of the retrained wav2lip model and of ENet.
Are those two results normal?
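One quick way to do that is to dump the intermediate frames to disk. A minimal sketch, assuming the wav2lip/ENet outputs are available as lists of numpy BGR frames inside `inference.py` (the variable names in the usage comment are placeholders, not the repo's actual names):

```python
# Debug sketch (not part of the repo): write intermediate frames to disk
# so the wav2lip and ENet outputs can be inspected individually.
import os
import cv2

def dump_frames(frames, out_dir):
    """Save each numpy BGR frame as a numbered PNG."""
    os.makedirs(out_dir, exist_ok=True)
    for i, frame in enumerate(frames):
        cv2.imwrite(os.path.join(out_dir, f"{i:05d}.png"), frame)

# Hypothetical usage, with placeholder variable names:
# dump_frames(wav2lip_frames, "debug_wav2lip")
# dump_frames(enet_frames, "debug_enet")
```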
I tried to visualize the video produced after Step 5, which is stored in the temp folder, but it does not open; the player says it is too short. I think no video is actually being generated here. Can I know where the issue is coming from?
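For reference, a quick way to check whether the temp video actually contains any frames (a minimal sketch; the exact temp file path is an assumption):

```python
# Probe the intermediate video: if the frame count is 0, nothing was
# written and the problem is upstream of the video writer.
import cv2

cap = cv2.VideoCapture("temp/result.mp4")  # adjust to the actual temp file
print("opened:", cap.isOpened())
print("frames:", int(cap.get(cv2.CAP_PROP_FRAME_COUNT)))
print("fps:", cap.get(cv2.CAP_PROP_FPS))
cap.release()
```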
I do get results when I use this retrained model with the wav2lip code itself.
@kunncheng I have visualized the output of the retrained wav2lip model: it produces a blurry face, so the enhancer is not able to detect a face. If I use the model from the wav2lip repository instead, it gives the same issue. Are you using a high-quality wav2lip model? If so, how can I change the code so that it also works with a low-quality model? I am attaching a screenshot of the wav2lip model's output for reference.
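One way to test the "enhancer cannot find a face" hypothesis directly is to run a generic face detector on a dumped wav2lip output frame. A rough sketch using OpenCV's bundled Haar cascade (the repo uses its own detector, so this is only a sanity check, and the frame path is a placeholder):

```python
# Sanity check: if even a generic detector finds no face in the blurry
# wav2lip output, the enhancer's face detection is likely failing too.
import cv2

frame = cv2.imread("debug_wav2lip/00000.png")  # placeholder path
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print("faces found:", len(faces))
```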
If the wav2lip model still produces this result, the model may not be loading correctly; you should debug the model-loading code.
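A minimal sketch of that kind of check, assuming a standard PyTorch checkpoint saved as a dict (whether the weights sit under a `state_dict` key, or carry a DataParallel `module.` prefix, depends on how the model was retrained):

```python
# Checkpoint-loading check (a sketch; the key layout is an assumption).
import torch

def check_load(model, ckpt_path):
    """Load a checkpoint into `model` and report any key mismatches."""
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    state_dict = checkpoint.get("state_dict", checkpoint)
    # Strip a possible DataParallel "module." prefix.
    state_dict = {k.replace("module.", "", 1): v
                  for k, v in state_dict.items()}
    # strict=False reports mismatches instead of raising, so you can see
    # exactly which weights were skipped or left uninitialized.
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    print("missing keys:", missing)
    print("unexpected keys:", unexpected)
```

If either list is non-empty, the network is running with partly random weights, which would explain blurry output even though loading appears to "succeed".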
@kunncheng The model is loading correctly, but I am still getting those results.
Hello, has this problem been solved?
I replaced the LNet model with a retrained wav2lip model, and at Step 6 it fails with this error:

```
[Step 6] Lip Synthesis:: 0% 0/13 [02:26<?, ?it/s]
Traceback (most recent call last):
  File "/content/video-retalking/inference.py", line 342, in <module>
    main()
  File "/content/video-retalking/inference.py", line 264, in main
    pp, orig_faces, enhanced_faces = enhancer.process(pp, xf, bbox=c, face_enhance=False, possion_blending=True)
  File "/content/video-retalking/third_part/GPEN/gpen_face_enhancer.py", line 116, in process
    mask_sharp = cv2.GaussianBlur(mask_sharp, (0,0), sigmaX=1, sigmaY=1, borderType = cv2.BORDER_DEFAULT)
UnboundLocalError: local variable 'mask_sharp' referenced before assignment
```
I learned that this error occurs when no faces can be detected, but when I run the pipeline with the original model you provided, I do get results, so the input video is not the problem. Can I know what the difference might be between your model and the retrained wav2lip model, and how I can solve this issue?
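Until the root cause is found, one workaround is to guard `mask_sharp` in `gpen_face_enhancer.py` so that a frame with no detected face is passed through instead of crashing. A sketch of the idea (the surrounding repo code is paraphrased; `detected_faces` and `build_mask` are hypothetical stand-ins for the actual detection and mask logic):

```python
# Paraphrased guard for gpen_face_enhancer.py::process: mask_sharp is only
# assigned inside the per-face loop, so it stays unbound when no face is
# detected in a frame, which triggers the UnboundLocalError above.
import cv2

def blur_mask_or_passthrough(img, detected_faces, build_mask):
    mask_sharp = None
    for face in detected_faces:
        mask_sharp = build_mask(img, face)  # assigned only if a face exists

    if mask_sharp is None:
        # No face in this frame: return the input untouched instead of
        # calling GaussianBlur on an unbound variable.
        return img

    return cv2.GaussianBlur(mask_sharp, (0, 0), sigmaX=1, sigmaY=1,
                            borderType=cv2.BORDER_DEFAULT)
```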