Rudrabha / Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, please try out Sync Labs
https://synclabs.so

Wav2Lip Colab not working anymore - TypeError: mel() takes 0 positional arguments #471

Open acetra19 opened 1 year ago

acetra19 commented 1 year ago

Hey folks, unfortunately, for no obvious reason at all,

I cannot get the Wav2Lip Colab to work anymore. I never had any problems until a couple of days ago.

Now I always get the following error on the last step (rendering):

Using cuda for inference.
Reading video frames...
Number of frames available for inference: 353
Traceback (most recent call last):
  File "/content/Wav2Lip/inference.py", line 280, in <module>
    main()
  File "/content/Wav2Lip/inference.py", line 225, in main
    mel = audio.melspectrogram(wav)
  File "/content/Wav2Lip/audio.py", line 47, in melspectrogram
    S = _amp_to_db(_linear_to_mel(np.abs(D))) - hp.ref_level_db
  File "/content/Wav2Lip/audio.py", line 95, in _linear_to_mel
    _mel_basis = _build_mel_basis()
  File "/content/Wav2Lip/audio.py", line 100, in _build_mel_basis
    return librosa.filters.mel(hp.sample_rate, hp.n_fft, n_mels=hp.num_mels,
TypeError: mel() takes 0 positional arguments but 2 positional arguments (and 3 keyword-only arguments) were given

any idea how to troubleshoot this problem? Any help would be appreciated.

Best regards!

bobwatcherx commented 1 year ago

I also get this error message,

even though the duration of the video with sound is the same, 7 seconds:

Using cuda for inference.
Reading video frames...
Number of frames available for inference: 1693
Traceback (most recent call last):
  File "/content/Wav2Lip/inference.py", line 280, in <module>
    main()
  File "/content/Wav2Lip/inference.py", line 225, in main
    mel = audio.melspectrogram(wav)
  File "/content/Wav2Lip/audio.py", line 47, in melspectrogram
    S = _amp_to_db(_linear_to_mel(np.abs(D))) - hp.ref_level_db
  File "/content/Wav2Lip/audio.py", line 95, in _linear_to_mel
    _mel_basis = _build_mel_basis()
  File "/content/Wav2Lip/audio.py", line 100, in _build_mel_basis
    return librosa.filters.mel(hp.sample_rate, hp.n_fft, n_mels=hp.num_mels,
TypeError: mel() takes 0 positional arguments but 2 positional arguments (and 3 keyword-only arguments) were given

bobwatcherx commented 1 year ago

Try this Colab, it works: https://j.mp/wav2lip

teethdiao commented 1 year ago

The only thing we need to do to solve this problem is to open the audio.py file and modify the librosa.filters.mel call on line 100 by changing "hp.sample_rate, hp.n_fft" to "sr=hp.sample_rate, n_fft=hp.n_fft".
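
For concreteness, here is roughly what the edited call looks like (a minimal sketch; hp is the hyperparameter object that Wav2Lip's audio.py imports near the top of the file):

```
# Old form, accepted up to librosa 0.9.x, where sr and n_fft could be passed positionally:
#   librosa.filters.mel(hp.sample_rate, hp.n_fft, n_mels=hp.num_mels, fmin=hp.fmin, fmax=hp.fmax)
# New form, required by librosa >= 0.10, where these arguments are keyword-only:
return librosa.filters.mel(sr=hp.sample_rate, n_fft=hp.n_fft, n_mels=hp.num_mels,
                           fmin=hp.fmin, fmax=hp.fmax)
```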

TheoTheGreat-stack commented 1 year ago

The only thing we need to do to solve this problem is to open the audio.py file and modify the librosa.filters.mel call on line 100 by changing "hp.sample_rate, hp.n_fft" to "sr=hp.sample_rate, n_fft=hp.n_fft".

Now it just seems to get stuck on Model loaded 10% 1/10 [09:56<1:29:24, 596.06s/it]^C

But it did make more progress with your suggested changes

TheoTheGreat-stack commented 1 year ago

Actually all seems to be good now after a few more attempts

11whitewater commented 1 year ago

Maybe pip install librosa==0.8.0 can help you.

MaximillianStoner commented 1 year ago

@TheoTheGreat-stack did you do anything additional to resolve the issue? I'm stuck in the same place you were

TheoTheGreat-stack commented 1 year ago

@TheoTheGreat-stack did you do anything additional to resolve the issue? I'm stuck in the same place you were

Not really. It failed once, and that's when I posted on the thread, but then I tried the exact same thing again and it worked. Now it seems to be fine.

dyk010518 commented 1 year ago

Hmm, I am also stuck here. I have been running the model many times and it always gets stuck at "0% 0/38 [00:00<?, ?it/s]^C".

dyk010518 commented 1 year ago

@TheoTheGreat-stack did you do anything additional to resolve the issue? I'm stuck in the same place you were

@MaximillianStoner Did you get the issue resolved by any chance?

sethmh82 commented 1 year ago

pip install librosa==0.8.0 fixed it for me.

ChiragJRana commented 1 year ago

The only thing we need to do to solve this problem is to open the audio.py file and modify the librosa.filters.mel call on line 100 by changing "hp.sample_rate, hp.n_fft" to "sr=hp.sample_rate, n_fft=hp.n_fft".

This worked for me. ✌️✌️

hongxin1988 commented 1 year ago

pip install librosa==0.8.0 fixed it for me, too. Thanks!

GregLomason commented 1 year ago

The only thing we need to do to solve this problem is to open the audio.py file and modify the librosa.filters.mel call on line 100 by changing "hp.sample_rate, hp.n_fft" to "sr=hp.sample_rate, n_fft=hp.n_fft".

After following this I was hit with "NameError: name 'hp' is not defined". I'm a complete beginner at this: smart enough to edit the code line, ignorant enough to not know what any of the language means 😂

GregLomason commented 1 year ago

pip install librosa==0.8.0 fixed it for me, too. Thanks!

I've installed your suggestion and also changed the code line from "hp.sample_rate, hp.n_fft" to "sr=hp.sample_rate, n_fft=hp.n_fft". Unfortunately, I'm still getting the error "NameError: name 'hp' is not defined".

@teethdiao @hongxin1988

GregLomason commented 1 year ago

Reference

Building wheels for collected packages: librosa
  Building wheel for librosa (setup.py) ... done
  Created wheel for librosa: filename=librosa-0.8.0-py3-none-any.whl size=201411 sha256=f3cbd8f24227ce4bf3182db8cd2b01c264a6ee4211fa413fb2aad15a0827acb4
  Stored in directory: c:\users\gregl\appdata\local\pip\cache\wheels\a4\09\cc\728ed681f0fa5c37e0fbfc66d2ba07058dd995784f1f6554a8
Successfully built librosa
Installing collected packages: resampy, librosa
  Attempting uninstall: librosa
    Found existing installation: librosa 0.10.0
    Uninstalling librosa-0.10.0:
      Successfully uninstalled librosa-0.10.0
Successfully installed librosa-0.8.0 resampy-0.4.2

(voice-clone) C:\Users\gregl\Desktop\Greg's Voice Clone>python demo_cli.py
Traceback (most recent call last):
  File "C:\Users\gregl\Desktop\Greg's Voice Clone\demo_cli.py", line 10, in <module>
    from encoder import inference as encoder
  File "C:\Users\gregl\Desktop\Greg's Voice Clone\encoder\inference.py", line 3, in <module>
    from encoder.audio import preprocess_wav  # We want to expose this function from here
  File "C:\Users\gregl\Desktop\Greg's Voice Clone\encoder\audio.py", line 100, in <module>
    sr=hp.sample_rate, n_fft= hp.n_fft
NameError: name 'hp' is not defined

RobinBrackez commented 1 year ago

Try this:

def _build_mel_basis():
    assert hp.fmax <= hp.sample_rate // 2
    return librosa.filters.mel(sr=hp.sample_rate, n_fft=hp.n_fft, n_mels=hp.num_mels,
                               fmin=hp.fmin, fmax=hp.fmax)

Wouldn't the hp error already appear on the previous line if hp is undefined?
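
For what it's worth, hp in Wav2Lip's audio.py comes from an import near the top of that file, so a NameError there usually means the edit landed in a different project's audio.py (the voice-clone traceback above points at encoder/audio.py, not Wav2Lip's). A minimal sketch of where hp comes from, assuming the stock Wav2Lip layout:

```
# Near the top of Wav2Lip/audio.py (hparams.py ships with the repo):
import librosa
from hparams import hparams as hp   # this is the hp used by _build_mel_basis()
```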

pranavtushar commented 1 year ago

Hey, with the new update of librosa you have to specify the argument names and it will work: for the sample rate, sr=hp.sample_rate, and so on.

central0658 commented 1 year ago

The only thing we need to do to solve this problem is to open the audio.py file and modify the librosa.filters.mel call on line 100 by changing "hp.sample_rate, hp.n_fft" to "sr=hp.sample_rate, n_fft=hp.n_fft".

I had the same issue and did this; it went further but ended in: FileNotFoundError: [Errno 2] No such file or directory: 'checkpoints/wav2lip_gan.pth'

I couldn't find any fix online, any guess?

central0658 commented 1 year ago

Using cuda for inference.
Reading video frames...
Number of frames available for inference: 194
(80, 575)
Length of mel chunks: 176
  0% 0/2 [00:00<?, ?it/s]
  0% 0/11 [00:12<?, ?it/s]
Recovering from OOM error; New batch size: 8

  0% 0/22 [00:00<?, ?it/s]   5% 1/22 [03:45<1:18:48, 225.17s/it]   9% 2/22 [03:48<31:31, 94.56s/it]  14% 3/22 [03:51<16:42, 52.78s/it]  18% 4/22 [03:54<09:56, 33.17s/it]  23% 5/22 [03:57<06:21, 22.45s/it]  27% 6/22 [04:00<04:13, 15.85s/it]  32% 7/22 [04:03<02:54, 11.66s/it]  36% 8/22 [04:07<02:04, 8.92s/it]  41% 9/22 [04:10<01:32, 7.10s/it]  45% 10/22 [04:13<01:11, 5.95s/it]  50% 11/22 [04:16<00:55, 5.07s/it]  55% 12/22 [04:19<00:44, 4.47s/it]  59% 13/22 [04:22<00:36, 4.06s/it]  64% 14/22 [04:26<00:30, 3.86s/it]  68% 15/22 [04:29<00:25, 3.63s/it]  73% 16/22 [04:32<00:20, 3.47s/it]  77% 17/22 [04:35<00:16, 3.37s/it]  82% 18/22 [04:38<00:13, 3.38s/it]  86% 19/22 [04:42<00:09, 3.30s/it]  91% 20/22 [04:45<00:06, 3.24s/it]  95% 21/22 [04:48<00:03, 3.20s/it] 100% 22/22 [04:51<00:00, 13.25s/it]
Load checkpoint from: checkpoints/wav2lip_gan.pth
  0% 0/2 [05:06<?, ?it/s]
Traceback (most recent call last):
  File "/content/Wav2Lip/inference.py", line 280, in <module>
    main()
  File "/content/Wav2Lip/inference.py", line 252, in main
    model = load_model(args.checkpoint_path)
  File "/content/Wav2Lip/inference.py", line 171, in load_model
    checkpoint = _load(path)
  File "/content/Wav2Lip/inference.py", line 162, in _load
    checkpoint = torch.load(checkpoint_path)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'checkpoints/wav2lip_gan.pth'

G-force78 commented 1 year ago

I have this error:

```
  File "/content/cog-Wav2Lip/inference.py", line 15, in <module>
    from batch_face import RetinaFace
ModuleNotFoundError: No module named 'batch_face'
```

Gumballegal commented 1 year ago

going with librosa==0.8.0 gave me this:

PS C:\Wav2Lip-master> python inference.py --checkpoint_path wav2lip.pth --face video.mp4 --audio audio.wav
Traceback (most recent call last):
  File "C:\Wav2Lip-master\inference.py", line 3, in <module>
    import scipy, cv2, os, sys, argparse, audio
  File "C:\Wav2Lip-master\audio.py", line 1, in <module>
    import librosa
  File "C:\Users\bruno\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\__init__.py", line 211, in <module>
    from . import core
  File "C:\Users\bruno\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\core\__init__.py", line 9, in <module>
    from .constantq import *  # pylint: disable=wildcard-import
  File "C:\Users\bruno\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\core\constantq.py", line 1058, in <module>
    dtype=np.complex,
  File "C:\Users\bruno\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'complex'.
np.complex was a deprecated alias for the builtin complex. To avoid this error in existing code, use complex by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.complex128 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'complex_'?

switching between versions of numba didn't help

AIfrontier commented 1 year ago

going with librosa==0.8.0 gave me this:

AttributeError: module 'numpy' has no attribute 'complex'. np.complex was a deprecated alias for the builtin complex. Did you mean: 'complex_'?

RobinBrackez commented 1 year ago

@Gumballegal @central0658 The checkpoints/wav2lip_gan.pth file has to be downloaded separately; the links are in the README (section "Getting the weights"). So if the script says the file is missing, you have to download it and put it in that checkpoints folder.
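
For example, a quick sanity check before launching inference (a hypothetical helper, not part of the repo):

```
# Hypothetical check: verify the weights file exists before running inference.py.
import os

ckpt = "checkpoints/wav2lip_gan.pth"
if not os.path.isfile(ckpt):
    raise FileNotFoundError(
        f"{ckpt} is missing. Download wav2lip_gan.pth via the 'Getting the weights' "
        "links in the README and place it in the checkpoints/ folder."
    )
```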

@AIfrontier The Numpy error makes me think you have the wrong version installed.

These are the versions I use in my venv. Numpy is 1.24.3

librosa               0.10.0.post2
numpy                 1.24.3
opencv-python         4.7.0.72
threadpoolctl         3.1.0
torch                 1.11.0+cu113
torchvision           0.12.0+cu113

G-force78 commented 1 year ago

This one works fine https://colab.research.google.com/drive/1OlQxvo5IX4zCpE9UWTL_vkRzpu5nqQoM?usp=sharing

PrabalS12 commented 12 months ago

This one works fine https://colab.research.google.com/drive/1OlQxvo5IX4zCpE9UWTL_vkRzpu5nqQoM?usp=sharing

It gives me the same error during the inference phase:

Using cuda for inference. Reading video frames... Number of frames available for inference: 400 (80, 641) Length of mel chunks: 393 0% 0/4 [00:00<?, ?it/s] 0% 0/25 [00:00<?, ?it/s]^C

Neither the modifications in audio.py nor pip install librosa==0.8.0 helped.

Stuck here --> Using cuda for inference. Reading video frames... Number of frames available for inference: 400 (80, 641) Length of mel chunks: 393 0% 0/4 [00:00<?, ?it/s] 0% 0/25 [00:00<?, ?it/s]^C

G-force78 commented 12 months ago

That's odd. It's probably the size of the image files you are using; make them square or divisible by 16.

GitHub Copilot: No, not all video resolutions are divisible by 16. Many common resolutions are, such as 720p (1280x720), 4K UHD (3840x2160), and 8K (7680x4320). This is because video codecs typically work on 16x16 macroblocks, so dimensions that are multiples of 16 are easier to compress and decompress. However, there are also resolutions that are not fully divisible by 16, such as 1080p (1920x1080, whose height is not a multiple of 16), 480p at 854x480, and 360p (640x360).
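
If frame size really is the issue, here is a rough, illustrative sketch (my own workaround, not the repo's logic; it assumes OpenCV and a standard BGR numpy frame) of padding a frame so both dimensions become multiples of 16:

```
# Illustrative only: pad a frame with black borders so its width and height
# are multiples of 16 before feeding it to the pipeline.
import cv2

def pad_to_multiple_of_16(frame):
    h, w = frame.shape[:2]
    pad_h = (16 - h % 16) % 16
    pad_w = (16 - w % 16) % 16
    return cv2.copyMakeBorder(frame, 0, pad_h, 0, pad_w,
                              cv2.BORDER_CONSTANT, value=(0, 0, 0))
```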

gretoverno commented 11 months ago

The only thing we need to do to solve this problem is to open the audio.py file and modify the librosa.filters.mel call on line 100 by changing "hp.sample_rate, hp.n_fft" to "sr=hp.sample_rate, n_fft=hp.n_fft".

I think this, plus just removing all the version pins in the requirements.txt file, works for me.

ankitvirla commented 10 months ago

Maybe pip install librosa==0.8.0 can help you.

thanks, it really worked :)

Jerry-Master commented 9 months ago

For librosa==0.10.1 you can fix this error by changing line 100 of audio.py to:

return librosa.filters.mel(sr=hp.sample_rate, n_fft=hp.n_fft, n_mels=hp.num_mels,
                               fmin=hp.fmin, fmax=hp.fmax)

You just need to specify the sr= and n_fft= keywords. Apparently they moved them from being positional to being keyword-only.
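
A nice property of the keyword form is that it also works on older librosa releases (keyword passing was always accepted; only the positional style became an error in 0.10). A quick, purely illustrative check you can run in the Colab to confirm which version you have and that the call succeeds (the 16000/800/80 values are, as far as I recall, Wav2Lip's default hparams):

```
# Illustrative check: print the installed librosa version and build a mel basis
# with the keyword form, which works on both 0.8.x and 0.10.x.
import librosa

print(librosa.__version__)
mel_basis = librosa.filters.mel(sr=16000, n_fft=800, n_mels=80)
print(mel_basis.shape)  # expected: (80, 401), i.e. (n_mels, 1 + n_fft // 2)
```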

cucdengjunli commented 8 months ago

pip install librosa==0.8.0 good!

kapilaGIT commented 5 months ago

The following solved the same problem I also encountered.

pip install librosa==0.8.0