anothermartz / Easy-Wav2Lip

Colab for making Wav2Lip high quality and easy to use

Contents:

  1. Introduction
  2. Google Colab version (free cloud computing in-browser)
  3. Local Installation
  4. Support
  5. Best Practices

Easy-Wav2Lip improves Wav2Lip video lipsyncing, making it:

Easier:

Faster:

For my 9 second 720p 60fps test clip via Colab T4:

Version            Execution time
Original Wav2Lip   6m 53s
Easy-Wav2Lip       56s

That's not a typo! My clip goes from almost 7 minutes to under 1 minute!

The tracking data is saved between generations of the same video, saving even more time:

Version                                Execution time
Easy-Wav2Lip on the same video again   25s
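
To put those numbers in perspective, here is a small sketch that turns the timings quoted above into speedup factors. The helper name to_seconds is just something I made up for the example, and the figures only describe my one test clip, so treat the result as a rough indication rather than a benchmark.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair into total seconds."""
    return minutes * 60 + seconds


original = to_seconds(6, 53)    # original Wav2Lip: 6m 53s
first_run = to_seconds(0, 56)   # Easy-Wav2Lip, first run: 56s
cached_run = to_seconds(0, 25)  # Easy-Wav2Lip, same video again: 25s

print(f"first run speedup:  {original / first_run:.1f}x")   # 7.4x
print(f"cached run speedup: {original / cached_run:.1f}x")  # 16.5x
```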

Better looking:

Easy-Wav2Lip fixes visual bugs on the lips:

Comparison gif

3 Options for Quality:

Comparison gif

Installation:

For the easiest and most compatible way to use this tool, use the Google Colab version:

Google Colab:

https://colab.research.google.com/github/anothermartz/Easy-Wav2Lip/blob/v8.3/Easy_Wav2Lip_v8.3.ipynb


Local Installation:

Requirements: an Nvidia card that supports CUDA 12.2, or a macOS device that supports MPS via Apple silicon or an AMD GPU.

Automatic installation (64-bit Windows on an x86 processor):

  1. Download Easy-Wav2Lip.bat
  2. Place it in a folder on your PC (e.g. in Documents)
  3. Run it and follow the instructions. It will make a folder called Easy-Wav2Lip within whatever folder you run it from.
  4. Run this file whenever you want to use Easy-Wav2Lip

This should handle the installation of all required components.

Manual installation:

  1. Make sure the following are installed and can be accessed via your terminal:

    • Python 3.10 (I have only tested 3.10.11 - other versions may not work!)
    • Git
    • Windows & Linux: CUDA (having the latest Nvidia drivers installed will provide this; I have only tested 12.2)
  2. Run the following in your terminal once you've navigated to the folder where you want to install Easy-Wav2Lip:

Windows manual installation:

Sets up a venv, installs ffmpeg to it and then installs Easy-Wav2Lip:

  1. Open cmd and navigate to the folder where you want to install Easy-Wav2Lip using cd, e.g. cd Documents

  2. Copy and paste the following code into your cmd window. Note: 2 folders will be made in this location: Easy-Wav2Lip and Easy-Wav2Lip-venv (an isolated Python install).

    py -3.10 -m venv Easy-Wav2Lip-venv
    Easy-Wav2Lip-venv\Scripts\activate
    python -m pip install --upgrade pip
    python -m pip install requests
    set url=https://github.com/BtbN/FFmpeg-Builds/releases/download/latest/ffmpeg-master-latest-win64-gpl.zip
    python -c "import requests; r = requests.get('%url%', stream=True); open('ffmpeg.zip', 'wb').write(r.content)"
    powershell -Command "Expand-Archive -Path .\\ffmpeg.zip -DestinationPath .\\"
    xcopy /e /i /y "ffmpeg-master-latest-win64-gpl\bin\*" .\Easy-Wav2Lip-venv\Scripts
    del ffmpeg.zip
    rmdir /s /q ffmpeg-master-latest-win64-gpl
    git clone https://github.com/anothermartz/Easy-Wav2Lip.git
    cd Easy-Wav2Lip
    pip install -r requirements.txt
    python install.py

    Now to run Easy-Wav2Lip:

    1. Close and reopen cmd then cd to the same directory as in Step 1.
    2. Paste the following code:
      Easy-Wav2Lip-venv\Scripts\activate
      cd Easy-Wav2Lip
      call run_loop.bat

      See Usage for further instructions.

MacOS and Linux installation (untested):

Sets up a venv, installs ffmpeg to it and then installs Easy-Wav2Lip:

  1. Open Terminal and navigate to the folder where you want to install Easy-Wav2Lip using cd, e.g. cd ~/Documents

  2. Copy and paste the following code into your terminal window. Note: 2 folders will be made in this location: Easy-Wav2Lip and Easy-Wav2Lip-venv (an isolated Python install).

    python3.10 -m venv Easy-Wav2Lip-venv
    source Easy-Wav2Lip-venv/bin/activate
    python -m pip install --upgrade pip
    python -m pip install requests
    for file in ffmpeg ffprobe ffplay; do
    curl -O "https://evermeet.cx/ffmpeg/${file}-6.1.1.zip"
    unzip "${file}-6.1.1.zip"
    done
    mv -f ffmpeg ffprobe ffplay Easy-Wav2Lip-venv/bin/
    rm -f ffmpeg-6.1.1.zip ffprobe-6.1.1.zip ffplay-6.1.1.zip
    source Easy-Wav2Lip-venv/bin/activate
    git clone https://github.com/anothermartz/Easy-Wav2Lip.git
    cd Easy-Wav2Lip
    pip install -r requirements.txt
    python install.py

    Now to run Easy-Wav2Lip:

  3. Close and reopen terminal then cd to the same directory as in Step 1.

  4. Paste the following code:

    source Easy-Wav2Lip-venv/bin/activate
    cd Easy-Wav2Lip
    ./run_loop.sh
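
If a manual install fails partway, it can help to check the prerequisites before digging deeper. The following is only a minimal sketch, not part of Easy-Wav2Lip: check_prerequisites is a made-up helper that tests for Python 3.10 and for git and ffmpeg on PATH. Run it with the venv activated so the ffmpeg copied into the venv is visible. It does not check CUDA or MPS.

```python
import shutil
import sys


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems found; an empty list means every check passed.

    version_info and which can be overridden, which makes the function easy to test.
    """
    problems = []
    # The README states that only Python 3.10 has been tested.
    if tuple(version_info[:2]) != (3, 10):
        problems.append(
            f"Python 3.10 expected, found {version_info[0]}.{version_info[1]}"
        )
    # git is needed for the clone step; ffmpeg is installed into the venv.
    for tool in ("git", "ffmpeg"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("\n".join(issues) if issues else "All prerequisite checks passed")
```

Anything it reports is worth including when you open an issue.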

Usage:

Credits:

Support

If you're having issues running this, please look through the issues tab to see if someone has written about it. If not, make a new thread but make sure you include the following:

If colab:

Without this info, I'll just ask for it anyway and so a response about the issue itself will take longer.

If any of those differ from the requirements, that is most likely why it isn't working, and you may just have to use the Colab version if you aren't already.

For general chit chat about this and any other lipsync talk, I'll be in this discord:
Invite link: https://discord.gg/FNZR9ETwKY
Wav2Lip channel: https://discord.com/channels/667279414681272320/1076077584330280991

Best practices:

Video files:

Audio files:

Advanced Tweaking:

wav2lip_version:

Wav2Lip
  Pros:
    + More accurate lipsync
    + Attempts to keep the mouth closed when there is no sound
  Cons:
    - Sometimes produces missing teeth (uncommon)

Wav2Lip_GAN
  Pros:
    + Looks nicer
    + Keeps the original expressions of the speaker more
  Cons:
    - Not as good at masking the original lip movements, especially when there is no sound

I suggest trying Wav2Lip first and switching to the GAN version if you experience an effect where it looks like the speaker has big gaps in their teeth.

nosmooth:

Padding:

This option controls how many pixels are added or removed from the face crop in each direction.

Value   Example   Effect
U       U = -5    Removes 5 pixels from the top of the face
D       D = 10    Adds 10 pixels to the bottom of the face
L       L = 0     No change to the left of the face
R       R = 15    Adds 15 pixels to the right of the face

Padding can help remove hard lines at the chin or other edges of the face, but too much or too little padding can change the size or position of the mouth. It's common practice to add 10 pixels to the bottom, but you should experiment with different values to find the best balance for your clip.
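
To make the U/D/L/R values concrete, here is a minimal sketch of how per-side padding could be applied to a face box. It is an illustration, not the code Easy-Wav2Lip actually runs: apply_padding, the (x1, y1, x2, y2) box layout and the clamping to the frame edges are all my assumptions for the example.

```python
def apply_padding(box, pads, frame_width, frame_height):
    """Grow or shrink a face box by per-side padding, clamped to the frame.

    box  -- (x1, y1, x2, y2) in pixels, with the origin at the top-left corner
    pads -- (U, D, L, R); positive values add pixels, negative values remove them
    """
    x1, y1, x2, y2 = box
    up, down, left, right = pads
    return (
        max(0, x1 - left),             # L: move the left edge outward
        max(0, y1 - up),               # U: move the top edge upward
        min(frame_width, x2 + right),  # R: move the right edge outward
        min(frame_height, y2 + down),  # D: move the bottom edge downward
    )


# The four example values from the table above, on a 1280x720 frame.
print(apply_padding((100, 80, 300, 320), (-5, 10, 0, 15), 1280, 720))
# (100, 85, 315, 330)
```

With U = -5 the top edge moves down 5 pixels (80 to 85), while D = 10 and R = 15 push the bottom and right edges outward.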

Mask:

This option controls how the processed face is blended with the original face. This has no effect on the "Fast" quality option.

Other options:

Batch processing:

This option allows you to process multiple video and/or audio files automatically.

output_suffix:

This adds a suffix to your output files so that they don't overwrite your originals.
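
As a rough illustration of the idea, the sketch below appends a suffix to a file name while keeping its extension. The helper add_suffix and the example suffix "_lipsynced" are hypothetical, and the real tool may build its output names differently.

```python
from pathlib import Path


def add_suffix(input_path, output_suffix):
    """Return input_path with output_suffix inserted before the file extension."""
    path = Path(input_path)
    # stem is the name without its extension; suffix is the extension itself.
    return path.with_name(f"{path.stem}{output_suffix}{path.suffix}")


print(add_suffix("clip.mp4", "_lipsynced").name)  # clip_lipsynced.mp4
```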

include_settings_in_suffix:

Adds the settings that were used to the suffix, which is good for comparing different settings as you will know what you used for each render. It adds Quality_resolution_nosmooth_pads-UDLR, e.g. _Enhanced_720_nosmooth1_pads-U15D10L-15R30. pads-UDLR will not be included if the pads are set to 0, and resolution will not be included if output_height is set to full resolution.
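
The naming rule can be sketched as below. This is an approximation written from the description and the one example here, not a copy of the real code: settings_suffix is a made-up name, None stands in for "full resolution", and I am assuming that nosmooth is always written as 0 or 1 and that each pad equal to 0 is left out individually.

```python
def settings_suffix(quality, output_height, nosmooth, pads):
    """Build a suffix such as "_Enhanced_720_nosmooth1_pads-U15D10L-15R30".

    output_height -- pixel height, or None for full resolution (assumed convention)
    pads          -- (U, D, L, R) padding values
    """
    parts = [quality]
    if output_height is not None:  # resolution is omitted at full resolution
        parts.append(str(output_height))
    parts.append(f"nosmooth{int(nosmooth)}")  # assumed: always present, as 0 or 1
    # Assumed: pads equal to 0 are skipped one by one.
    pad_text = "".join(
        f"{side}{value}" for side, value in zip("UDLR", pads) if value != 0
    )
    if pad_text:  # the whole pads part is omitted when every pad is 0
        parts.append(f"pads-{pad_text}")
    return "_" + "_".join(parts)


print(settings_suffix("Enhanced", 720, True, (15, 10, -15, 30)))
# _Enhanced_720_nosmooth1_pads-U15D10L-15R30
```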

preview_input:

Displays the input video/audio before processing so you can check that you chose the correct file(s). It may only work with .mp4; I just know it didn't work on an .avi I tried. Disabling this will save a few seconds of processing time for each video.

preview_settings:

This will render only 1 frame of your video and display it at full size, so you can tweak the settings without having to render the entire video each time. frame_to_preview is for selecting a particular frame you want to check out. The frame shown may not exactly match the frame number you enter.