CorentinJ / Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

AttributeError: module 'numba' has no attribute 'jit' #625

Closed atillayurtseven closed 3 years ago

atillayurtseven commented 3 years ago

I couldn't run it. I'm getting the following error: AttributeError: module 'numba' has no attribute 'jit'

ghost commented 3 years ago

This looks like it could be an issue with Anaconda (similar to https://github.com/dereneaton/ipyrad/issues/180). Try setting up a clean environment with Miniconda or pip.

If using Linux you might find https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/615#issuecomment-740270185 or the wiki setup instructions helpful.
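
Before rebuilding the environment, it may help to check which copy of numba Python actually imports and whether it exposes `jit`. The following is only a sketch: the helper name `describe_module` is hypothetical, and it relies on the common convention (not a guarantee) that packages set `__version__`. librosa is included because it imports numba.

```python
import importlib


def describe_module(name, attribute=None):
    """Try to import a module and report where it came from.

    Returns a dict with the import status, the reported version (if any),
    the file it was loaded from, and whether `attribute` exists on it.
    """
    info = {
        "name": name,
        "imported": False,
        "version": None,
        "path": None,
        "has_attribute": None,
        "error": None,
    }
    try:
        module = importlib.import_module(name)
    except Exception as exc:  # a broken install can fail in many ways
        info["error"] = f"{type(exc).__name__}: {exc}"
        return info

    info["imported"] = True
    info["version"] = getattr(module, "__version__", None)
    info["path"] = getattr(module, "__file__", None)
    if attribute is not None:
        info["has_attribute"] = hasattr(module, attribute)
    return info


if __name__ == "__main__":
    print(describe_module("numba", "jit"))
    print(describe_module("librosa"))
```

If `path` points somewhere unexpected, or `has_attribute` is False for numba, a stale or shadowed install is a likely suspect.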

atillayurtseven commented 3 years ago

That was librosa. I have just updated librosa and the error is gone. This may help others as well. However, now I am getting the following error and am trying to fix it:

[3] Call to cuInit results in CUDA_ERROR_NOT_INITIALIZED:

I'm running on a Mac with no GPU.

atillayurtseven commented 3 years ago

I ran the toolbox. After recording, I get the following error: Exception: Audio buffer is not finite everywhere
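
Taken literally, the message says that the recorded buffer contains samples that are NaN or infinite. Below is a minimal, dependency-free sketch of how such samples could be located and replaced. The function names and the sample values are hypothetical and this is not the toolbox's own code. With a NumPy array, `np.isfinite(wav).all()` would be the equivalent check.

```python
import math


def find_non_finite(samples):
    """Return the indices of samples that are NaN or +/-infinity."""
    return [i for i, s in enumerate(samples) if not math.isfinite(s)]


def replace_non_finite(samples, fill=0.0):
    """Return a copy of samples with NaN/inf values replaced by `fill`."""
    return [s if math.isfinite(s) else fill for s in samples]


# Hypothetical recording with two corrupted samples.
recording = [0.0, 0.25, float("nan"), -0.5, float("inf")]

bad_indices = find_non_finite(recording)
cleaned = replace_non_finite(recording)

print("non-finite sample indices:", bad_indices)
print("cleaned buffer:", cleaned)
```

Replacing bad samples with silence only hides the symptom. If most of the buffer is non-finite, the recording device or its settings are the more likely cause.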

ghost commented 3 years ago

It's working for me on an Intel Mac with no GPU. I just did a fresh toolbox install in Python 3.7.9 with a pristine venv.

atillayurtseven commented 3 years ago

I use the built-in one.

atillayurtseven commented 3 years ago

I have the following librosa version: librosa 0.8.0

atillayurtseven commented 3 years ago

Can you please tell me which command you executed?

ghost commented 3 years ago

To launch the toolbox? Let me know if you meant installation.

python demo_toolbox.py

atillayurtseven commented 3 years ago

Yes, to launch the toolbox. I launched the toolbox. When I browse and select from the samples, it works, but when I try to record I get the error I mentioned earlier.

ghost commented 3 years ago

What is the error message and traceback in the terminal when you try to record?

atillayurtseven commented 3 years ago

I get the error message when I try to record my voice: Exception: Audio buffer is not finite everywhere. I have now recorded using a different application, and this time it works. But I am not sure if I am doing the correct thing. The voice is too noisy most of the time. Also, when I change the text, do I have to click "Synthesize and vocode" and wait each time?

ghost commented 3 years ago

Yes, whenever the text is updated you need to synthesize and vocode. You can use the Griffin-Lim vocoder to check the quality before running the pretrained WaveRNN, which is slow on CPU.

The pretrained model is quite bad. I have a method in #437 to improve quality, but it involves recording a dataset.

atillayurtseven commented 3 years ago

Oh, I see. I can only see Griffin-Lim in the list, by the way. Actually, this is an amazing job 👍

ghost commented 3 years ago

You can enable the WaveRNN vocoder by copying vocoder/saved_models/pretrained/pretrained.pt from the zip file to that same location in your repo. It will show up in the vocoder list as "pretrained".
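
The copy step could also be scripted. This is a rough sketch that assumes the zip has already been extracted. The function name `install_vocoder` and both directory arguments are hypothetical. Only the relative path `vocoder/saved_models/pretrained/pretrained.pt` comes from the instructions above.

```python
import shutil
from pathlib import Path

# Relative location of the WaveRNN weights, identical in the zip and the repo.
RELATIVE_MODEL_PATH = Path("vocoder/saved_models/pretrained/pretrained.pt")


def install_vocoder(extracted_dir, repo_dir):
    """Copy pretrained.pt from the extracted zip into the same place in the repo.

    Both arguments are placeholders for wherever the zip was extracted and
    wherever the repository was cloned.
    """
    source = Path(extracted_dir) / RELATIVE_MODEL_PATH
    target = Path(repo_dir) / RELATIVE_MODEL_PATH
    if not source.is_file():
        raise FileNotFoundError(f"Expected model file at {source}")
    # Create vocoder/saved_models/pretrained/ in the repo if it is missing.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target
```

It could be called as `install_vocoder("path/to/extracted_zip", "path/to/Real-Time-Voice-Cloning")`, with both paths replaced by the real locations on your machine.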

atillayurtseven commented 3 years ago

Yes, I have it there and I can see "pretrained" in the list. I thought it was called something else.

Really amazing job. :)

ghost commented 3 years ago

Thanks for the feedback. Corentin did a really good job making neural TTS a lot more accessible to the general public. The GUI and command-line utilities make it very user-friendly.

I'm glad that you appreciate it for what it is. Most people have very high expectations. The output will not fool a human, but is still impressive given it can be trained from scratch on consumer hardware, using free datasets, in a matter of days.

atillayurtseven commented 3 years ago

What amazes me is the idea and the technology. I really don't care about the quality 👍

ghost commented 3 years ago

The researchers who came up with this impressive concept also had very good execution and results: https://google.github.io/tacotron/publications/speaker_adaptation/index.html

Unfortunately the Google implementation is not open-source so we are still trying to replicate the results of that paper. My best results so far: https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/624#issuecomment-756020854