CorentinJ / Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

TypeError: melspectrogram() #1280

Closed: stevens-Ai closed this issue 6 months ago

stevens-Ai commented 6 months ago

I get the error in the title when running demo_cli.py on Windows 10:

D:\Ai_audio\Real-Time-Voice-Cloning>python demo_cli.py
C:\Program Files\Python310\lib\site-packages\numpy\_distributor_init.py:30: UserWarning: loaded more than 1 DLL from .libs:
C:\Program Files\Python310\lib\site-packages\numpy\.libs\libopenblas.EL2C6PLE4ZYW3ECEVIV3OXXGRN2NRFM2.gfortran-win_amd64.dll
C:\Program Files\Python310\lib\site-packages\numpy\.libs\libopenblas64__v0.3.23-gcc_10_3_0.dll
  warnings.warn("loaded more than 1 DLL from .libs:"
Arguments:
    enc_model_fpath:   saved_models\default\encoder.pt
    syn_model_fpath:   saved_models\default\synthesizer.pt
    voc_model_fpath:   saved_models\default\vocoder.pt
    cpu:               False
    no_sound:          False
    seed:              None

Running a test of your configuration...

Found 1 GPUs available. Using GPU 0 (NVIDIA GeForce RTX 3060) of compute capability 8.6 with 12.9Gb total memory.

Preparing the encoder, the synthesizer and the vocoder...
Loaded encoder "encoder.pt" trained to step 1564501
Synthesizer using device: cuda
Building Wave-RNN
Trainable Parameters: 4.481M
Loading model weights at saved_models\default\vocoder.pt
Testing your configuration with small inputs.
        Testing the encoder...
Traceback (most recent call last):
  File "D:\Ai_audio\Real-Time-Voice-Cloning\demo_cli.py", line 80, in <module>
    encoder.embed_utterance(np.zeros(encoder.sampling_rate))
  File "D:\Ai_audio\Real-Time-Voice-Cloning\encoder\inference.py", line 144, in embed_utterance
    frames = audio.wav_to_mel_spectrogram(wav)
  File "D:\Ai_audio\Real-Time-Voice-Cloning\encoder\audio.py", line 58, in wav_to_mel_spectrogram
    frames = librosa.feature.melspectrogram(
TypeError: melspectrogram() takes 0 positional arguments but 2 positional arguments (and 2 keyword-only arguments) were given

stevens-Ai commented 6 months ago

Sorry, the solution is already in issue #1176: https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/1176
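
For anyone landing here first: the traceback is consistent with a librosa release (0.10 and later) in which the arguments of `librosa.feature.melspectrogram` are keyword-only, while `encoder/audio.py` still passes the waveform and sampling rate positionally. The sketch below reproduces that failure mode without needing librosa installed. `melspectrogram_stub` is a hypothetical stand-in with a keyword-only signature, not librosa's implementation, and the numeric values are illustrative assumptions rather than values taken from this thread.

```python
# Hypothetical stand-in that only mimics a keyword-only signature.
# It is NOT librosa's implementation and computes no spectrogram.
def melspectrogram_stub(*, y=None, sr=22050, n_fft=2048, hop_length=512, n_mels=128):
    return {"samples": len(y), "sr": sr, "n_fft": n_fft,
            "hop_length": hop_length, "n_mels": n_mels}

# Illustrative inputs (assumed values): one second of silence at 16 kHz.
sampling_rate = 16000
wav = [0.0] * sampling_rate

# Old-style call: wav and sampling_rate passed positionally.
# A keyword-only signature rejects this with a TypeError.
try:
    melspectrogram_stub(wav, sampling_rate, n_fft=400, hop_length=160)
    positional_error = None
except TypeError as exc:
    positional_error = str(exc)

# Keyword-style call: the same data passed as y= and sr= is accepted.
result = melspectrogram_stub(y=wav, sr=sampling_rate,
                             n_fft=400, hop_length=160, n_mels=40)

print("positional call failed:", positional_error is not None)
print("keyword call returned:", result)
```

If the same cause applies to your install, the matching change in `encoder/audio.py` would be to pass `y=wav, sr=sampling_rate` in the `librosa.feature.melspectrogram(...)` call. Pinning librosa to a release older than 0.10 should also avoid the error. Check issue #1176 for the fix recommended there.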