Rudrabha / Wav2Lip

This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs
https://synclabs.so

Few Faces on the screen - How can I choose? + None 90 Degrees issue fixed? #192

Closed AlonDan closed 3 years ago

AlonDan commented 3 years ago

In a video with 2, 3 or more faces on the screen (for example, a talk show where all faces are possible targets):

1. Is there a command I can add, something like the use of pads, so I can PICK / SELECT which face is the target for the process? Or can I do the opposite and choose which faces to ignore?

2. I asked this last year, but I'm not sure whether it's possible yet: when faces are not at 90 degrees (or at least close to it), the results are not accurate. The generated mouth always seems to be rendered at around 90 degrees, so the result video looks distorted or the mouth doesn't fit the face.

Is this fix already in, so that I just need to pass a specific command? Or is it not fixed yet?

-

I'm not a programmer, but I would like to experiment with Wav2Lip. If somebody can explain step by step or with an example, it would be much appreciated and helpful for other people as well.

Thanks ahead! 👍

prajwalkr commented 3 years ago

> 1. Is there a command I can add, something like the use of pads, so I can PICK / SELECT which face is the target for the process?

https://github.com/Rudrabha/Wav2Lip/blob/143e969123218405b74b7d330a09c5c063fe17db/inference.py#L38

So you can specify the crop coordinates for the frame, within which you expect your target face to be present throughout the video. For example, your target face could always be in the left half of the video.
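As a rough illustration of the left-half example, the sketch below builds a full command line in Python. It assumes the `--crop` values are ordered top, bottom, left, right and that `-1` means "infer from the frame size", which is how the help text at the linked line reads; please check it against your copy of inference.py. The file names, the 1280-pixel width, and the two helper functions are hypothetical placeholders.

```python
import shlex


def left_half_crop(frame_width):
    """Return [top, bottom, left, right] covering the left half of the frame.

    Assumption: -1 means "infer this edge from the frame size".
    """
    return [0, -1, 0, frame_width // 2]


def build_inference_command(face, audio, checkpoint, crop):
    """Build the argument list for inference.py (all paths are placeholders)."""
    return [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,
        "--face", face,
        "--audio", audio,
        "--crop", *[str(value) for value in crop],
    ]


# Hypothetical 1280-pixel-wide clip with the target face on the left.
command = build_inference_command(
    face="talkshow.mp4",
    audio="speaker_left.wav",
    checkpoint="checkpoints/wav2lip_gan.pth",
    crop=left_half_crop(1280),
)

# Print a line that can be copied into a terminal.
print(shlex.join(command))
```

The printed line ends with `--crop 0 -1 0 640`. The values are separated by spaces, with no commas or brackets.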

AlonDan commented 3 years ago

Thanks for the reply. Just for testing, I tried this, for example: --crop 0 -10 0 -10

But it didn't recognize the face (or any of the multiple faces). I also tried other videos and different resolutions, just to be sure that wasn't the problem.

Without --crop it recognizes one of the faces.

As I mentioned, I'm not a programmer and I'm probably doing something wrong with the commands. Sorry for the confusion, I'm still experimenting with it. The cropping sounds really good if I can understand how to make it work.

EDIT: I finally got the crop to work, but I guess it's not what I was looking for, since it crops the actual frame size of the final video.

So I'm wondering if there is a command to SELECT / PICK a specific face to recognize without cropping the video?

This is what I'm trying to do: run it a few times, each time on a different face with different audio (one per person), then combine/mix the multiple face versions into one video with multiple audio tracks. This is why I'm looking for a hidden command (if there is one) to SELECT which face I want to be recognized and matched to the current audio.

Since the faces are not moving a lot, I've tried --box, but no matter what values I put, I get an error. I guess it's not for region detection.

-

I'm not sure what nargs='+' is for or how to use it. Can you please give an example of a full command so I can copy-paste it and try? It would be very helpful. Thanks ahead!
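For readers with the same question: `nargs='+'` is a setting of Python's standard `argparse` module, not a command you type. It means the flag accepts one or more space-separated values and collects them into a list. The sketch below uses a stand-in parser, not the real inference.py, and the default shown is an assumption.

```python
import argparse

# Stand-in parser for illustration only; it is not the actual inference.py.
parser = argparse.ArgumentParser()

# nargs="+" gathers every value that follows the flag into one list,
# and type=int converts each value to an integer.
parser.add_argument("--crop", nargs="+", type=int, default=[0, -1, 0, -1])

# Equivalent to typing on the command line: --crop 0 -10 0 -10
args = parser.parse_args(["--crop", "0", "-10", "0", "-10"])
print(args.crop)

# When the flag is omitted, the default list is used instead.
defaults = parser.parse_args([])
print(defaults.crop)
```

So on the command line the values simply follow the flag, separated by spaces, as in `--crop 0 -10 0 -10`.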

prajwalkr commented 3 years ago

> So I'm wondering if there is a command to SELECT / PICK a specific face to recognize without cropping the video?

Preserving the complete frame and selecting identities is still not supported in the code. Sorry. You can take the cropped segment and use a video editor to paste the generated crop back into the original frame later.
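One possible way to do that paste-back step without a graphical editor is ffmpeg's `overlay` filter. This is only a sketch under several assumptions: ffmpeg is installed, the generated clip has the same frame rate and length as the original, and `top` and `left` are the same offsets that were used for the crop. The function name and file names are hypothetical.

```python
import shlex


def build_overlay_command(original, generated_crop, top, left, output, crf=18):
    """Build an ffmpeg argument list that pastes the generated crop back.

    The crop is drawn over the original video at (left, top); the audio is
    taken from the generated clip, which carries the new speech.
    """
    filter_graph = f"[0:v][1:v]overlay=x={left}:y={top}[v]"
    return [
        "ffmpeg",
        "-i", original,            # input 0: untouched full frame
        "-i", generated_crop,      # input 1: lip-synced crop
        "-filter_complex", filter_graph,
        "-map", "[v]",             # use the composited video
        "-map", "1:a",             # use the audio of the generated clip
        "-c:v", "libx264",
        "-crf", str(crf),          # lower CRF means higher quality
        "-c:a", "copy",
        output,
    ]


# Hypothetical case: the crop was the left half, so its offset is (0, 0).
command = build_overlay_command(
    original="talkshow.mp4",
    generated_crop="result_left.mp4",
    top=0,
    left=0,
    output="talkshow_left_synced.mp4",
)
print(shlex.join(command))
```

Each pass still re-encodes the video, so a low CRF value is used here to limit the quality loss per export.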

AlonDan commented 3 years ago

> So I'm wondering if there is a command to SELECT / PICK a specific face to recognize without cropping the video?
>
> Preserving the complete frame and selecting identities is still not supported in the code. Sorry. You can take the cropped segment and use a video editor to paste the generated crop back into the original frame later.

Thanks for the reply :) I understand. Will it be a feature soon, or is it not in the plan at all?

Also, I think I can try your suggestion, but the problem is that I'm losing quality on every export. I opened an issue about the MP4 CRF value; I hope you'll be able to help with that for better quality.

Thanks ahead!

prajwalkr commented 3 years ago

> I understand. Will it be a feature soon, or is it not in the plan at all?

It is not in the plan yet.