Traceback (most recent call last):
  File "c:\Users----\Downloads\Real-Time-Voice-Cloning-master\demo_cli.py", line 80, in <module>
    encoder.embed_utterance(np.zeros(encoder.sampling_rate))
  File "c:\Users----\Downloads\Real-Time-Voice-Cloning-master\encoder\inference.py", line 144, in embed_utterance
    frames = audio.wav_to_mel_spectrogram(wav)
  File "c:\Users----\Downloads\Real-Time-Voice-Cloning-master\encoder\audio.py", line 58, in wav_to_mel_spectrogram
    frames = librosa.feature.melspectrogram(
TypeError: melspectrogram() takes 0 positional arguments but 2 positional arguments (and 2 keyword-only arguments) were given
Here is the code where the error occurs:
    def wav_to_mel_spectrogram(wav):
        """
        Derives a mel spectrogram ready to be used by the encoder from a preprocessed audio waveform.
        Note: this is not a log-mel spectrogram.
        """
        frames = librosa.feature.melspectrogram(
            wav,
            sampling_rate,
            n_fft=int(sampling_rate * mel_window_length / 1000),
            hop_length=int(sampling_rate * mel_window_step / 1000),
            n_mels=mel_n_channels
        )
        return frames.astype(np.float32).T
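One reading of the TypeError is that the installed melspectrogram() accepts keyword arguments only, so passing wav and sampling_rate positionally is rejected. As far as I know this is the case in newer librosa releases (0.10 and later), where the first two parameters are named y and sr. The sketch below reproduces the mechanism with a hypothetical stand-in function, melspectrogram_stub, which is not librosa's code. The numeric settings are illustrative assumptions, not the project's values.

```python
# Hypothetical stand-in with a keyword-only signature (note the bare `*`).
# It only mimics the calling convention; it does not compute a spectrogram.
def melspectrogram_stub(*, y=None, sr=22050, n_fft=2048, hop_length=512, **kwargs):
    n_mels = kwargs.get("n_mels", 128)
    n_frames = 1 + len(y) // hop_length
    return [[0.0] * n_frames for _ in range(n_mels)]

# Illustrative values only; the project defines its own.
sampling_rate = 16000
mel_window_length = 25  # milliseconds
mel_window_step = 10    # milliseconds
mel_n_channels = 40

wav = [0.0] * sampling_rate
n_fft = int(sampling_rate * mel_window_length / 1000)
hop_length = int(sampling_rate * mel_window_step / 1000)

# Positional call, shaped like the one in the question: rejected.
try:
    melspectrogram_stub(wav, sampling_rate, n_fft=n_fft,
                        hop_length=hop_length, n_mels=mel_n_channels)
    positional_ok = True
    positional_error = ""
except TypeError as exc:
    positional_ok = False
    positional_error = str(exc)

# Keyword call: accepted by the keyword-only signature.
frames = melspectrogram_stub(y=wav, sr=sampling_rate, n_fft=n_fft,
                             hop_length=hop_length, n_mels=mel_n_channels)

print("positional call accepted:", positional_ok)
print("keyword call output shape:", len(frames), "x", len(frames[0]))
```

If that reading is right, the analogous change in wav_to_mel_spectrogram would be to pass y=wav and sr=sampling_rate instead of the two positional arguments. The keyword form should also be accepted by older librosa releases, so it would not tie the code to one version.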
Update: I believe the issue is that I have the wrong version of librosa installed. Does anyone know which version this project uses?
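To test the version theory, it helps to know which librosa version is actually installed in the environment that runs demo_cli.py. A minimal sketch using only the standard library (Python 3.8 or later) follows. The helper name installed_version is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("librosa:", installed_version("librosa"))
```

The result can then be compared against whatever version the project pins, for example in a requirements file if the repository ships one (an assumption; I have not checked which version it lists).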