coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

--speaker_wav leads to AttributeError: 'NoneType' object has no attribute 'load_wav' #465

Closed 54696d21 closed 3 years ago

54696d21 commented 3 years ago

input:

tts --text 'Hello world!'  --out_path out/out_1.wav --model_name tts_models/en/vctk/sc-glow-tts --vocoder_name vocoder_models/en/vctk/hifigan_v2 --speaker_wav 28.wav

output/error

 > tts_models/en/vctk/sc-glow-tts is already downloaded.
 > vocoder_models/en/vctk/hifigan_v2 is already downloaded.
Loading speakers ...
 > Using model: glow_tts
 > Generator Model: hifigan_generator
Removing weight norm...
 > Text: Hello world!
 > Text splitted to sentences.
['Hello world!']
Traceback (most recent call last):
  File "/home/user/.local/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.7/site-packages/TTS/bin/synthesize.py", line 257, in main
    wav = synthesizer.tts(args.text, args.speaker_idx, args.speaker_wav)
  File "/home/user/.local/lib/python3.7/site-packages/TTS/utils/synthesizer.py", line 220, in tts
    speaker_embedding = self.speaker_manager.compute_x_vector_from_clip(speaker_wav)
  File "/home/user/.local/lib/python3.7/site-packages/TTS/tts/utils/speakers.py", line 241, in compute_x_vector_from_clip
    x_vector = _compute(wf)
  File "/home/user/.local/lib/python3.7/site-packages/TTS/tts/utils/speakers.py", line 228, in _compute
    waveform = self.speaker_encoder_ap.load_wav(wav_file, sr=self.speaker_encoder_ap.sample_rate)
AttributeError: 'NoneType' object has no attribute 'load_wav'

when reaching this line:

```
waveform = self.speaker_encoder_ap.load_wav(wav_file, sr=self.speaker_encoder_ap.sample_rate)
```

self.speaker_encoder_ap is None for me, so it seems that self.speaker_encoder_ap was never initialized

the wav file I'm supplying is a 22050 Hz mono file and its path is correct
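
For anyone who wants to double-check the same properties of their own reference clip, here is a minimal sketch using only Python's standard `wave` module. The helper name `describe_wav` is made up for illustration and is not part of TTS, and `wave` only reads uncompressed PCM WAV files.

```python
import wave


def describe_wav(path):
    """Return (sample_rate, channels) for an uncompressed PCM WAV file.

    Hypothetical helper for illustration; not part of the TTS package.
    """
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate(), wav_file.getnchannels()


# Example call (the path is the one used in the command above):
# sample_rate, channels = describe_wav("28.wav")
# A 22050 Hz mono clip should report sample_rate == 22050 and channels == 1.
```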

i'm running version 0.13

this works without a problem:

tts --text 'Hello world!'  --out_path out/out21.wav --model_name tts_models/en/vctk/sc-glow-tts --vocoder_name vocoder_models/en/vctk/hifigan_v2 --speaker_idx p245
54696d21 commented 3 years ago

Ok, the problem seems to be that I don't supply --encoder_config_path and --encoder_path. Unfortunately, I haven't found out yet how to get this encoder.

54696d21 commented 3 years ago

ok, I figured it out

for everyone else having this problem: the encoder can be downloaded here: https://github.com/Edresson/SC-GlowTTS

tts --text 'Hello world!'  --out_path out/out_1.wav --model_name tts_models/en/vctk/sc-glow-tts --vocoder_name vocoder_models/en/vctk/hifigan_v2 --speaker_wav 28.wav --encoder_config_path /home/user/TTS-de_v1/0.13/encoder/drive-download-20210430T024321Z-001/config.json --encoder_path /home/user/TTS-de_v1/0.13/encoder/drive-download-20210430T024321Z-001/checkpoint.pth.tar

that issue is resolved for me, but could the error message please be something like "no encoder path and encoder config specified" in this case?
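
To illustrate the kind of check meant here, a minimal sketch follows. It does not reproduce the actual TTS code: `SpeakerManagerSketch` and `check_encoder_loaded` are hypothetical names, and the only detail taken from the traceback is that `speaker_encoder_ap` stays `None` when no encoder is configured.

```python
class SpeakerManagerSketch:
    """Hypothetical stand-in for the speaker manager; NOT the real TTS class."""

    def __init__(self, encoder_model_path=None, encoder_config_path=None):
        self.encoder_model_path = encoder_model_path
        self.encoder_config_path = encoder_config_path
        # Assumption: the audio processor is only created when both paths
        # are given; otherwise it stays None, as seen in the traceback.
        self.speaker_encoder_ap = None
        if encoder_model_path and encoder_config_path:
            self.speaker_encoder_ap = object()  # placeholder for the real processor

    def check_encoder_loaded(self):
        """Fail early with a readable message instead of an AttributeError."""
        if self.speaker_encoder_ap is None:
            raise RuntimeError(
                "No encoder path and encoder config specified: --speaker_wav "
                "needs both --encoder_path and --encoder_config_path."
            )


if __name__ == "__main__":
    manager = SpeakerManagerSketch()  # no encoder supplied
    try:
        manager.check_encoder_loaded()
    except RuntimeError as err:
        print(err)
```

Calling such a check before computing the embedding from the clip would surface the missing arguments directly, rather than failing later on `load_wav`.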

54696d21 commented 3 years ago

on further inspection it seems to me that I'm poking around at a part of this package that isn't meant to be used yet, so I'm closing this issue :)

huge props to the authors for putting this implementation out in the same month the preprint was published