Looks like it could be an issue with anaconda (similar to: https://github.com/dereneaton/ipyrad/issues/180). Try setting up a clean environment with miniconda or pip.
If using Linux you might find https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/615#issuecomment-740270185 or the wiki setup instructions helpful.
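For example, a minimal clean setup could look like this (a sketch; it assumes the repo's requirements.txt and a recent Python 3):

python3 -m venv rtvc-env
source rtvc-env/bin/activate
pip install -r requirements.txt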
That was librosa. I have just updated librosa and the error is gone. This may help others as well. However, now I am getting the following error and trying to fix it:
[3] Call to cuInit results in CUDA_ERROR_NOT_INITIALIZED:
I'm running on a Mac with no GPU.
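That cuInit error is typically just numba probing for a GPU that is not there. A quick diagnostic sketch (not part of the toolbox) to confirm that neither numba nor PyTorch sees a CUDA device:

import torch
from numba import cuda

# On a GPU-less Mac both checks should report False,
# and the models should fall back to CPU.
print(torch.cuda.is_available())
print(cuda.is_available())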
I ran the toolbox. After recording, I get the following error: Exception: Audio buffer is not finite everywhere
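"Audio buffer is not finite everywhere" is the wording of librosa's valid_audio check, which rejects buffers containing NaN or Inf samples; a glitchy recording can produce them. A hedged workaround sketch, assuming the recording ends up in a numpy array:

import numpy as np

# Hypothetical recorded buffer with a few bad samples:
wav = np.array([0.1, np.nan, -0.2, np.inf], dtype=np.float32)

# Replace NaN/Inf samples before handing the buffer to librosa:
wav = np.nan_to_num(wav, nan=0.0, posinf=0.0, neginf=0.0)
print(np.isfinite(wav).all())  # True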
It's working for me on an Intel Mac, no GPU. Just did a fresh toolbox install in Python 3.7.9 with a pristine venv.
The output of conda list and pip freeze would be helpful to know which package versions you are running.
I use the built-in one.
I have librosa 0.8.0.
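For reference, the relevant versions can also be printed from inside Python (a quick diagnostic sketch, assuming the usual toolbox dependencies are installed):

import librosa
import numba
import torch

print("librosa", librosa.__version__)
print("numba", numba.__version__)
print("torch", torch.__version__)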
Can you please tell me what command you executed?
To launch the toolbox? Let me know if you meant installation.
python demo_toolbox.py
Yes, to launch the toolbox. I launched it. When I browse and select from the samples it works, but when I try to record I get the error I mentioned earlier.
What is the error message and traceback in the terminal when you try to record?
I get the error message when I try to record my voice: Exception: Audio buffer is not finite everywhere. I recorded using a different application this time and it works, but I am not sure if I am doing the correct thing. The voice is too noisy most of the time. Also, when I change the text, do I have to click Synthesize and vocode and wait each time?
Yes, whenever the text is updated you need to synthesize and vocode. You can use the Griffin-Lim vocoder to check the quality before running the pretrained WaveRNN, which is slow on CPU.
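Griffin-Lim iteratively estimates phase from a magnitude spectrogram, which is why it is fast but lower quality than a neural vocoder. A minimal illustration via librosa (the same idea, not the toolbox's own implementation):

import numpy as np
import librosa

# Stand-in signal; in the toolbox this would come from the synthesizer.
y = np.random.randn(22050).astype(np.float32)
S = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))

# Iteratively recover phase and invert back to a waveform.
y_inv = librosa.griffinlim(S, n_iter=32, hop_length=256)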
The pretrained model is quite bad. I have a method in #437 to improve quality, but it involves recording a dataset.
Oh, I see. I can only see Griffin-Lim in the list, by the way. Actually, this is an amazing job 👍
You can enable the WaveRNN vocoder by copying vocoder/saved_models/pretrained/pretrained.pt
from the zip file to that same location in your repo. It will show up in the vocoder list as "pretrained".
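A quick sanity check that the file landed in the right place (a sketch; the path is relative to the repo root):

from pathlib import Path

model_path = Path("vocoder/saved_models/pretrained/pretrained.pt")
print(model_path.exists())  # should print True once the copy worked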
Yes, I have it there and I can see pretrained in the list. I thought it was called something else.
Really amazing job. :)
Thanks for the feedback. Corentin did a really good job making neural TTS a lot more accessible to the general public. The GUI and command-line utilities make it very user-friendly.
I'm glad that you appreciate it for what it is. Most people have very high expectations. The output will not fool a human, but is still impressive given it can be trained from scratch on consumer hardware, using free datasets, in a matter of days.
What amazes me is the idea and the technology. I really don't care about the quality 👍
The researchers who came up with this impressive concept also had very good execution and results: https://google.github.io/tacotron/publications/speaker_adaptation/index.html
Unfortunately, the Google implementation is not open source, so we are still trying to replicate the results of that paper. My best results so far: https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/624#issuecomment-756020854
I couldn't run it. I'm getting the following error: AttributeError: module 'numba' has no attribute 'jit'
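That AttributeError usually points to a broken or shadowed numba install rather than the toolbox itself. A quick diagnostic sketch:

import numba

# If __file__ points inside your project rather than site-packages,
# a local file named numba.py is shadowing the real package.
print(numba.__version__)
print(numba.__file__)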